ODBC and MFC
Michael Patterson -- patterso@sprynet.com
Wednesday, November 20, 1996

Environment: VC++ 1.52, Win 3.1, 16-bit

Hello, I have two MFC-related ODBC questions:

[Q1] I have a question about the fastest way to do bulk inserts into an
Access data source. I think/assume the MFC class CRecordset repeatedly
calls 'SQLExecute' during the AddNew/Update process. Doing that for a
couple of hundred records seems to take a lot of time. My assumption,
until rollback or commit:

{
    ::SQLPrepare(hstmt, (UCHAR FAR*)"INSERT INTO MY_DATASOURCE VALUES (?)",
                 SQL_NTS);
    ::SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_SSHORT, SQL_SMALLINT,
                       5, 0, &iMyData, 0, &cbMyData);
    while (TRUE)   // once per record to insert
    {
        iMyData = iMyData + 1;
        ::SQLExecute(hstmt);
    }
}

I do know about the ODBC function 'SQLParamOptions', which will allow you
to do bulk inserts. But the driver that I am using does not support that
function. [I don't know how much faster bulk inserts would be using that
function.] My record size for that data source is 20 bytes. And like I
said before, it's just taking too long to insert records. Any help would
be great.

[Q2] I also have one other question. How do I enable my MFC application
to use the ODBC 'Setup' & 'Installer' DLL functions?

Thanks. Mike
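[Archive note: an untested sketch of the pattern [Q1] describes, with the
two missing pieces filled in: switching the connection to manual-commit
mode with SQLSetConnectOption, and committing once at the end with
SQLTransact (both ODBC 2.x calls). henv, hdbc, and hstmt are assumed to
have been allocated and connected elsewhere; error handling is mostly
omitted.]

/* Sketch only -- assumes henv/hdbc/hstmt from SQLAllocEnv /
   SQLAllocConnect / SQLDriverConnect / SQLAllocStmt. */
SWORD  iMyData  = 0;
SDWORD cbMyData = 0;

/* Manual-commit mode: the driver no longer commits after every execute. */
::SQLSetConnectOption(hdbc, SQL_AUTOCOMMIT, SQL_AUTOCOMMIT_OFF);

::SQLPrepare(hstmt, (UCHAR FAR*)"INSERT INTO MY_DATASOURCE VALUES (?)",
             SQL_NTS);
::SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_SSHORT, SQL_SMALLINT,
                   5, 0, &iMyData, 0, &cbMyData);

for (int i = 0; i < 200; i++)   /* a couple of hundred records */
{
    iMyData = iMyData + 1;
    if (::SQLExecute(hstmt) != SQL_SUCCESS)
    {
        ::SQLTransact(henv, hdbc, SQL_ROLLBACK);
        return;
    }
}
::SQLTransact(henv, hdbc, SQL_COMMIT);   /* one commit for the whole batch */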
wei@cmgroup.com
Friday, November 22, 1996

[Mini-digest: 2 responses]

> [Q1] I have a question about the fastest way
> for bulk inserts into an Access datasource.

> I think/assume the MFC class, CRecordset, repeatedly calls
> 'SQLExecute' during the AddNew/Update process. Doing that
> for a couple of hundred records, seems to take a lot of time.

The speed problem is not necessarily because of calling SQLExecute
multiple times. Access is a file-based DB. Every insert involves
disk access. The speed is controlled by disk access time, which, as we
know, is slow. I'm not quite familiar with Access. If it does support
transactions, using them will speed things up a lot: even if you call
AddNew/Update many times, the disk is accessed only once (assuming
you're not inserting a huge amount of data at a time).

-----From: hrubin@nyx.net (Howard Rubin)

Michael Patterson said:

> [Q1] I have a question about the fastest way
> for bulk inserts into an Access datasource.

According to MSDN, the fastest way to insert into an Access database is
to use SQLSetPos(). If you have any questions after reading the MSDN
article, and I certainly did, email me at work (hrubin@disc.com) and I'll
send you a working sample app that I wrote after several calls to the
Microsoft Tech Support ODBC group.

- Howard Rubin
Mike Blaszczak -- mikeblas@nwlink.com
Sunday, November 24, 1996

At 18:57 11/22/96 -0500, wei@cmgroup.com wrote:

>> [Q1] I have a question about the fastest way
>> for bulk inserts into an Access datasource.
>
>> I think/assume the MFC class, CRecordset, repeatedly calls
>> 'SQLExecute' during the AddNew/Update process. Doing that
>> for a couple of hundred records, seems to take a lot of time.
>
>The speed problem is not necessarily because of calling SQLExecute
>multiple times. Access is a file-based DB. Every insert involves
>disk access.

This isn't true. Access uses the Jet engine, and the Jet engine
aggressively caches and spools database accesses--inside of
transactions or not.

>The speed is controlled by disk access time, which, as we know,
>is slow.

Speed of an application is really controlled by the algorithm used to
implement the application's behaviour. Implementing questionable code
kills your performance; implementing smooth code raises your performance.
These aspects of what you write govern your application's performance
with far more weight than the database platform you use or which compiler
options you switch on or off.

>I'm not quite familiar with Access. If it does support transactions,
>using them will speed things up a lot: even if you call AddNew/Update
>many times, the disk is accessed only once (assuming you're not
>inserting a huge amount of data at a time).

There is not a one-to-one relationship between disk writes in Access
and the use of transactions in Access. Access might do a disk hit
inside of a transaction or outside of a transaction. Access might
read or write at any time while it is working on your query.

.B ekiM http://www.nwlink.com/~mikeblas/
I'm afraid I've become some sort of speed freak.
These words are my own. I do not speak on behalf of Microsoft.
Michael Patterson -- patterso@sprynet.com
Monday, November 25, 1996

At 09:38 AM 11/24/96 -0800, Mike Blaszczak wrote:

>At 18:57 11/22/96 -0500, wei@cmgroup.com wrote:
>
>>> [Q1] I have a question about the fastest way
>>> for bulk inserts into an Access datasource.
>>
>>> I think/assume the MFC class, CRecordset, repeatedly calls
>>> 'SQLExecute' during the AddNew/Update process. Doing that
>>> for a couple of hundred records, seems to take a lot of time.
>>
>>The speed problem is not necessarily because of calling SQLExecute
>>multiple times. Access is a file-based DB. Every insert involves
>>disk access.
>
>This isn't true. Access uses the Jet engine, and the Jet engine
>aggressively caches and spools database accesses--inside of
>transactions or not.

{{{
In my case, are you absolutely certain that each time I did a SQLExecute,
there wasn't a disk access? It appeared to be doing that. After the
suggestions from Mr. Rubin and Mr. Sheng, I set the driver to manual
commit mode, used SQLSetPos, and then finally committed or rolled back
the transaction - and it increased the speed dramatically. By definition,
when the driver is in auto-commit mode, isn't each statement "committed
immediately" after it is executed? If I understand it correctly,
Mr. Sheng was correct - at least in my case.

>>The speed is controlled by disk access time, which, as we know,
>>is slow.
>
>Speed of an application is really controlled by the algorithm used to
>implement the application's behaviour. Implementing questionable code
>kills your performance; implementing smooth code raises your performance.
>These aspects of what you write govern your application's performance
>with far more weight than the database platform you use or which compiler
>options you switch on or off.

{{{
If your definition of algorithm includes not using the 'best fit'
combination of ODBC functions, you're absolutely right. That's why I
posted the question.
After correctly implementing the ODBC functions, I noticed a dramatic
improvement [even much better than I expected].

>>I'm not quite familiar with Access. If it does support transactions,
>>using them will speed things up a lot: even if you call AddNew/Update
>>many times, the disk is accessed only once (assuming you're not
>>inserting a huge amount of data at a time).
>
>There is not a one-to-one relationship between disk writes in Access
>and the use of transactions in Access. Access might do a disk hit
>inside of a transaction or outside of a transaction. Access might
>read or write at any time while it is working on your query.

{{{
I still need to learn a LOT about ODBC, but my error was in not using the
proper ODBC functions. I also left the driver in auto-commit mode, which
I assume caused each statement to be committed immediately after it was
executed. Mr. Rubin sent me the following:

**************************************************************************
From Q126131: Insert, Delete, and Update Performance
----------------------------------------------------
In general, Inserts, Deletes, and Updates can be done in one of six ways,
as listed below in order of increasing performance. To make the
comparison more concrete, 100 inserts were done using each of these
methods into a Microsoft Access version 2.0 table that had five columns
of text data.

NOTE: These results do not represent exhaustive benchmark testing, so
they should not be treated as such. They are provided to illustrate the
relative performance of each of these methods.

1. 100 SQLExecDirect() inserts with no transaction            5457 ms
2. 100 SQLExecDirect() inserts with transaction               4756 ms
3. 100 SQLPrepare()/SQLExecute() inserts, no transaction      3515 ms
4. 100 SQLPrepare()/SQLExecute() inserts, transaction         2994 ms
5. 100 SQLSetPos()/SQL_ADD inserts with no transaction         831 ms
6. 100 SQLSetPos()/SQL_ADD inserts with transaction            721 ms
**************************************************************************

Can I assume that using a transaction is faster when doing bulk 'inserts'?
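[Archive note: an untested sketch of method 6 above -- SQLSetPos with
SQL_ADD inside a manual-commit transaction, using the ODBC 2.x calls
available under VC++ 1.52. SQLSetPos operates on a rowset, so an
updatable cursor is opened first. The table name MY_DATASOURCE and column
MY_COLUMN are placeholders, and the table is assumed to have a single
SMALLINT column; error handling is omitted.]

/* Sketch only -- assumes henv/hdbc/hstmt already allocated and
   connected, and a driver that supports keyset cursors and SQLSetPos. */
SWORD  iMyData  = 0;
SDWORD cbMyData = 0;

::SQLSetConnectOption(hdbc, SQL_AUTOCOMMIT, SQL_AUTOCOMMIT_OFF);

/* Open an updatable cursor over the target table. */
::SQLSetStmtOption(hstmt, SQL_CURSOR_TYPE, SQL_CURSOR_KEYSET_DRIVEN);
::SQLSetStmtOption(hstmt, SQL_CONCURRENCY, SQL_CONCUR_LOCK);
::SQLExecDirect(hstmt, (UCHAR FAR*)"SELECT MY_COLUMN FROM MY_DATASOURCE",
                SQL_NTS);
::SQLBindCol(hstmt, 1, SQL_C_SSHORT, &iMyData, 0, &cbMyData);

for (int i = 0; i < 100; i++)
{
    iMyData = iMyData + 1;   /* fill the bound buffer for the new row */
    ::SQLSetPos(hstmt, 1, SQL_ADD, SQL_LOCK_NO_CHANGE);
}

::SQLFreeStmt(hstmt, SQL_CLOSE);
::SQLTransact(henv, hdbc, SQL_COMMIT);   /* commit all 100 adds at once */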
Nick Irias -- cabago@netcom.com
Tuesday, November 26, 1996

[Mini-digest: 2 responses]

If you are trying to learn more about ODBC, one option is to use DrDeeBee
(I don't work for them), a utility that is shipped with Intersolv ODBC
drivers. It will produce a log file of all ODBC calls with arguments and
return codes. It is a useful debugging tool and is worth the cost of a
single Intersolv ODBC driver license even if you don't need the ODBC
driver. I don't know if it is available separately.

I have attached a sample log of a connect / interrogate driver /
disconnect session as an example (I edited the log to hide the userid and
password). The log file is a blow-by-blow of every ODBC call that my app
made.

SQLAllocEnv 0x01000000 SQL_SUCCESS
SQLAllocConnect 0x01000000 0x01010000 SQL_SUCCESS
SQLGetInfo 0x01010000 SQL_DRIVER_ODBC_VER [5]02.50 6 5 SQL_SUCCESS
SQLDriverConnect 0x01010000 0x00000000
    [37]DSN=paleo7;SRVR=vm7;UID=*****;PWD=******* SQL_NTS
    [37]DSN=paleo7;SRVR=vm7;UID=*****;PWD=******* 255 37
    SQL_DRIVER_NOPROMPT SQL_SUCCESS
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 512 0 SQL_NO_DATA_FOUND
SQLGetFunctions 0x01010000 0
    FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE FALSE FALSE
    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE TRUE
    TRUE TRUE TRUE TRUE TRUE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    FALSE FALSE FALSE SQL_SUCCESS
SQLGetInfo 0x01010000 SQL_CURSOR_COMMIT_BEHAVIOR SQL_CB_PRESERVE 2 NULL SQL_SUCCESS
SQLGetInfo 0x01010000 SQL_CURSOR_ROLLBACK_BEHAVIOR SQL_CB_PRESERVE 2 NULL SQL_SUCCESS
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 511 0 SQL_NO_DATA_FOUND
SQLGetInfo 0x01010000 SQL_DATA_SOURCE_NAME [6]paleo7 255 6 SQL_SUCCESS
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 511 0 SQL_NO_DATA_FOUND
SQLGetInfo 0x01010000 SQL_DRIVER_NAME [11]IVOR709.DLL 255 11 SQL_SUCCESS
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 511 0 SQL_NO_DATA_FOUND
SQLGetInfo 0x01010000 SQL_DRIVER_VER [10]02.12.0000 255 10 SQL_SUCCESS
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 511 0 SQL_NO_DATA_FOUND
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 511 0 SQL_NO_DATA_FOUND
SQLGetInfo 0x01010000 SQL_DBMS_NAME [6]Oracle 255 6 SQL_SUCCESS
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 511 0 SQL_NO_DATA_FOUND
SQLGetInfo 0x01010000 SQL_DBMS_VER [10]07.00.0000 255 10 SQL_SUCCESS
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 511 0 SQL_NO_DATA_FOUND
SQLGetInfo 0x01010000 SQL_SERVER_NAME [3]vm7 255 3 SQL_SUCCESS
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 511 0 SQL_NO_DATA_FOUND
SQLGetInfo 0x01010000 SQL_DATABASE_NAME [0] 255 0 SQL_SUCCESS
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 511 0 SQL_NO_DATA_FOUND
SQLGetInfo 0x01010000 SQL_USER_NAME [3]CPS 255 3 SQL_SUCCESS
SQLError NULL 0x01010000 NULL [5]00000 0 [0] 511 0 SQL_NO_DATA_FOUND
SQLDisconnect 0x01010000 SQL_SUCCESS
SQLFreeConnect 0x01010000 SQL_SUCCESS
SQLFreeEnv 0x01000000 SQL_SUCCESS

-----From: Mike Blaszczak

At 18:51 11/25/96 -0800, Michael Patterson wrote:

>>>The speed problem is not necessarily because of calling SQLExecute
>>>multiple times. Access is a file-based DB. Every insert involves
>>>disk access.
>>
>>This isn't true. Access uses the Jet engine, and the Jet engine
>>aggressively caches and spools database accesses--inside of
>>transactions or not.
>
>{{{
>In my case, are you absolutely certain that each time I did a SQLExecute,
>there wasn't a disk access?

Nope. I made no such claim: I was refuting the generalization that "every
insert [into an Access DB] involves disk access". That's simply not true,
and it really doesn't even have much to do with transactions.
I don't think you posted enough information about your situation for
anyone to draw any conclusion about when your drive light was coming on,
and I certainly didn't do that.

>By definition, when the driver is in auto-commit mode, isn't each
>statement "committed immediately" after it is executed?

Yes, it is. But "committed" doesn't necessarily mean it has been written
to disk. It means that you can't roll back the transaction anymore. It
means that the locks that you grew during the transaction have been
released.

>Can I assume that using a transaction is faster when doing bulk 'inserts'?

That really depends on what, exactly, you mean by "using a transaction".

First, a database transaction has no relationship with physical I/O.
There are situations where ending a transaction won't cause writes. The
most obvious is the use of a read-only transaction. If you do this:

SELECT something;      // the database locks what you've selected
                       // sit and think about it
COMMIT TRANSACTION;    // the database releases your locks

there are no disk writes.

Second, it's completely up to the database what relationships there might
be between a transaction involving particular SQL and physical I/O. That
relationship might further be complicated by the underlying operating
system. If you want to insert 10 rows to a table, you might do this ten
times:

INSERT ROW #1;         // the server acquires locks on the data and
                       // index pages.
                       // the server creates a transaction log.
                       // the server processes your insert. (*)
                       // the server makes a note of what changed
                       // in its transaction log.
COMMIT TRANSACTION;    // the server deletes the transaction log.
                       // the server releases the locks.
                       // the server finalizes the insert. (+)
INSERT ROW #2;         // the server acquires locks on the data
                       // and index pages.
                       // the server creates a transaction log.
                       // and so on

You can see that, with this repetition, the DBMS has to do lots and lots
of stuff over and over again.
Opening the log, acquiring and releasing locks, and releasing the log is
an expensive proposition. The server might do a disk hit on _any_ of
these lines: large database objects might spill their locking tables out
of memory and onto disk. If that happens, just testing to acquire a lock
can cause a disk hit.

The line marked (*) may or may not alter the underlying table. It is very
likely that it will just dork with something in memory, and later use
that information in memory when it starts stuffing the data to disk, on
the line marked (+). For some DBMSs, even the line marked (+) might only
let the data sit around in memory to be spooled off to disk by some other
thread--assuming nobody else deletes that same row before it gets
written, say.

If you insert 10 rows to a table, a better way might be to do this:

INSERT ROW #1;         // the server acquires locks on the
                       // data and index pages.
                       // the server processes your insert.
                       // the server creates a transaction log.
                       // the server makes a note of what changed
                       // in its transaction log.
INSERT ROW #2;         // the server processes your insert.
                       // the server makes a note of what changed
                       // in its transaction log.
                       // and so on
COMMIT TRANSACTION;    // the server deletes the transaction log.
                       // the server releases the locks.
                       // the server finalizes the inserts.

This is clearly faster because you're not redundantly doing all the log
management, and you're not worrying about setting and clearing locks
quite so frequently. _Maybe_ the server will do disk I/O less frequently,
but there is no correlation between committing a transaction and physical
I/O. But turning off per-statement auto-commits causes the DBMS to do
less management, and that's what makes things faster. It _might_ do less
disk I/O, but that's not a certainty... even when we can identify the
specific DBMS in question. Certainly, avoiding redundant work where you
can is something that improves an algorithm.
It's important to think about what's going on under the API when you pick
out your algorithm. Getting more familiar with ODBC will help you
understand these things, but I think it might be better to learn some
theory first. "Fundamentals of Database Systems", by Elmasri and Navathe
(Benjamin Cummings, 1989, ISBN 0-8053-0145-3) is an excellent textbook on
these architectural features of database systems. Coming to understand
what this book discusses will give you an excellent understanding of
what's going on--and that'll help you figure out what features of the
system to look for when you want to try to find what's so darned slow.

.B ekiM http://www.nwlink.com/~mikeblas/
I'm afraid I've become some sort of speed freak.
These words are my own. I do not speak on behalf of Microsoft.