MySQL, transactions, and a lot of records

  1. 4 months ago

    Duane M

    Jul 6 Pre-Release Testers, Xojo Pro Boston, MA
    Edited 4 months ago by Duane M

    I'm trying to understand how to improve the speed with which a few thousand records can be inserted into a MySQL table using a single transaction. The circumstances: there are about 10 users, they receive an Excel file with an arbitrary number of columns, do a little preparation, and save it as a tab-delimited file. The first row of the file contains the column names and the remaining rows are data. As it is now, the data is read one row at a time and inserted with an individual INSERT statement. Prepared statements can't be used because the columns vary from file to file.

    The above is done using a temporary table; there is never any reason to keep the table permanently. Two more operations are then performed on the temporary table: first some columns are added, then a join is performed to bring data into the new columns. A list box is populated and the user downloads the data to a tab-delimited file. There is no need to keep any of this data in the database, so temporary tables work well: when the operations are done, the table just goes away.

    I believe this comes down to how I structure the code to make this work correctly. Here is my pseudo code describing where I'm at.

        db.SQLExecute("START TRANSACTION")
    
        While Not input.EOF   'read the data file one line at a time
          line = input.ReadLine
          vLineNumber = vLineNumber + 1
          
          If vLineNumber = 1 Then   'first line holds the column names
            //  CREATE THE TEMPORARY TABLE FROM THE HEADER ROW
          Else
            //  BUILD AND EXECUTE AN INSERT FOR THIS DATA ROW
          End If
        Wend
    
        db.Commit

    In the current structure the transaction begins before the table is created, and I don't think that's right. I'm also reading one line at a time. Maybe I need to restructure the file reading to read the file twice: once for the header row, then close it, open it again, and bang through the inserts beginning at row 2.

    Thanks for any suggestions on how to insert a table of data using a transaction.

  2. Jean-Yves P

    Jul 6 Pre-Release Testers, Xojo Pro Answer Europe (France, Besancon)

    you want as little data as possible going to and from the server.
    the trick is to read everything you need in one sql query, then work through the recordset locally.
    it is much, much faster to run 1 sql query that returns 1000 records than 1000 sql queries that each return 1 record.

    it is the same for the insert statements: make one big string with all the inserts separated by ";" instead of sending many inserts one at a time.
    this way I can send 1000 records to a postgres server on a lousy 1 Mb adsl line in less than 5 seconds.
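
    A minimal sketch of that batching idea in Xojo (classic framework), assuming a connected MySQL database named db and an open TextInputStream named input; the table name tmp_import is a placeholder and the value handling (no quoting or escaping) is simplified for illustration:

        Dim batch() As String
        
        While Not input.EOF
          Dim row As String = input.ReadLine
          ' build one INSERT per data row; real code must quote and escape each value
          batch.Append("INSERT INTO tmp_import VALUES (" + row + ")")
        Wend
        
        ' one round trip carrying every statement, instead of thousands of round trips
        db.SQLExecute("START TRANSACTION")
        db.SQLExecute(Join(batch, ";") + ";")
        If db.Error Then
          db.Rollback
        Else
          db.Commit
        End If

    This only helps if the server accepts the whole multi-statement string in one SQLExecute call, which the later posts in this thread confirm worked here.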

  3. Eli O

    Jul 6 Europe (Berlin, Germany)

    Why reinvent what already exists? See LOAD DATA INFILE.
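
    For reference, the statement Eli is pointing at looks roughly like this (a sketch only: the file path and the table name tmp_import are placeholders, and LOCAL requires local_infile to be enabled on both the client and the server):

        db.SQLExecute("LOAD DATA LOCAL INFILE '/path/to/import.txt' INTO TABLE tmp_import " _
          + "FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 1 LINES")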

  4. Norman P

    Jul 6 Xojo Inc
    Edited 4 months ago by Norman P

    @Eli O Why reinvent what already exists? See LOAD DATA INFILE .

    because he already tried this?
    https://forum.xojo.com/37052-mysql-load-data-local-infile/p1#

  5. Norman P

    Jul 6 Xojo Inc

    @Duane M I'm trying to understand how to improve the speed with which a few thousand records

    Well I hesitate to suggest this BUT have you tried without a transaction?
    I prefer using one because then you can be sure that everything gets inserted - or not

    Having the create table in the transaction shouldn't be an issue - it isn't in most DBs (and in MySQL a CREATE TEMPORARY TABLE is exempt from the implicit commit that a regular CREATE TABLE triggers)

  6. Duane M

    Jul 6 Pre-Release Testers, Xojo Pro Boston, MA

    Yes, I have tried without a transaction and it's slow. But I think Jean-Yves has pointed me in the right direction. I'll build all the inserts in a string variable, each terminated by ";", then start the transaction and do the mass insert with a single SQLExecute.

  7. Jeff T

    Jul 6 Midlands of England, Europe
    Edited 4 months ago by Jeff T

    On Windows, you can install an ODBC driver that exposes an Excel worksheet as a table.
    No inserts are required, but record access afterwards is slower.

  8. Duane M

    Jul 6 Pre-Release Testers, Xojo Pro Boston, MA

    Unfortunately not all the users are on Windows. Mostly Mac users here.

    Yes, the LOAD DATA INFILE did not work out.

  9. Tim H

    Jul 6 Pre-Release Testers Portland, OR USA

    I would use an in-memory sqlite database instead.

  10. Norman P

    Jul 6 Xojo Inc

    @Tim H I would use an in-memory sqlite database instead.

    that might be tough if the data used to do his joins is in mysql

    @Duane M Two more operations are then performed on the temporary table. When these operations are done the table goes away. First some columns are added to the table and next a join is performed to bring data into the new columns.

    but not impossible, as the join could be written to do a select against mysql, the data inserted into a local in-memory db, and then exported back to the tab-delimited file

    It WOULD save moving all this data to mysql over the network just to toss it all away once the tab-delimited file is created.
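
    A rough sketch of that flow in Xojo, assuming a connected MySQL database named db; the local table import_rows, its columns key_col and extra_col, and the MySQL lookup table ref_data are all hypothetical names, and the string concatenation skips proper escaping:

        ' an SQLiteDatabase with no DatabaseFile set becomes an in-memory database on Connect
        Dim mem As New SQLiteDatabase
        If Not mem.Connect Then Return
        
        mem.SQLExecute("CREATE TABLE import_rows (key_col TEXT, extra_col TEXT)")
        ' ... insert the rows from the tab-delimited file into import_rows here ...
        
        ' one SELECT against MySQL pulls everything needed for the "join"
        Dim rs As RecordSet = db.SQLSelect("SELECT key_col, extra_col FROM ref_data")
        If rs <> Nil Then
          While Not rs.EOF
            mem.SQLExecute("UPDATE import_rows SET extra_col = '" + rs.Field("extra_col").StringValue _
              + "' WHERE key_col = '" + rs.Field("key_col").StringValue + "'")
            rs.MoveNext
          Wend
        End If

    The tab-delimited export is then written from import_rows, so nothing but that one SELECT crosses the network.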

  11. Kevin G

    Jul 6 Pre-Release Testers, Xojo Pro Gatesheed, England

    What storage engine are you using?
    Have you tried using the memory storage engine in MySQL?
    Can you provide a dump of the MySQL schema?

  12. Duane M

    Jul 6 Pre-Release Testers, Xojo Pro Boston, MA

    @Jean-YvesPochez make one big string with all the inserts separated with ";" instead of many insert send one at a time.

    This did the trick. For 3000 records it's about 25% of the time. Maybe a little less.

    @Tim H I would use an in-memory sqlite database instead.

    That would be a good idea but the join would make for more work as Norman points out. The join performance isn't bad. This could be a good sell point for an updated version. I'd enjoy trying this but need to get this delivered.

    @Kevin G What storage engine are you using?
    Have you tried using the memory storage engine in MySQL?

    I'm using InnoDB on MySQL 5.7.x. I haven't tried the memory storage engine.

    @Kevin G Can you provide a dump of the MySQL schema?

    For the whole db or just the table in question? The table in question is variable, depends on the Excel spreadsheet received.

  13. Duane M

    Jul 6 Pre-Release Testers, Xojo Pro Boston, MA

    I also need to update the records in the table with data returned from the join. Right now I'm going through the returned recordset and using prepared statements. Is there some way to batch prepared statements?

  14. Jean-Yves P

    Jul 6 Pre-Release Testers, Xojo Pro Europe (France, Besancon)

    @Duane M @Jean-YvesPochez make one big string with all the inserts separated with ";" instead of many insert send one at a time.
    This did the trick. For 3000 records it's about 25% of the time. Maybe a little less.

    I found it even better if you group around 1000 statements at a time in a single string.
    If you put too many inserts in one string, it seems to slow down the process.
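
    A small sketch of that chunking, assuming a hypothetical helper BuildInsert that turns one tab-delimited line into a properly escaped INSERT statement:

        Const kBatchSize = 1000
        Dim batch() As String
        
        While Not input.EOF
          batch.Append(BuildInsert(input.ReadLine))   ' BuildInsert is a hypothetical helper
          If batch.Ubound + 1 >= kBatchSize Then
            db.SQLExecute(Join(batch, ";") + ";")     ' flush roughly every 1000 statements
            ReDim batch(-1)
          End If
        Wend
        
        If batch.Ubound >= 0 Then
          db.SQLExecute(Join(batch, ";") + ";")       ' flush whatever is left over
        End If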

  15. Kevin G

    Jul 6 Pre-Release Testers, Xojo Pro Gatesheed, England

    @Duane M I'm using InnoDB on MySQL 5.7.x. I haven't tried the memory storage engine.

    For the whole db or just the table in question? The table in question is variable, depends on the Excel spreadsheet received.

    Apologies, I didn't pick up on the fact that your table structure was variable. Some general pointers which may help if the bottleneck is now MySQL rather than the transmission of data:
    1. InnoDB works better when the table has a primary key. If you don't have one, it might be worthwhile adding one as an auto-increment INT.

    2. InnoDB primary keys work best when values are inserted in incrementing order, as this is the order the data is stored on disk. If your primary key isn't incrementing, try switching to one that is.

    3. Reduce secondary indexes to a minimum (if an index isn't needed for a query, get rid of it).

    4. If possible, try MySQL 5.6 as benchmarking has shown it to be more efficient than MySQL 5.7 in some scenarios.

    5. Try the MyISAM storage engine, as it can be a lot faster at inserting, at the cost of transactional functionality.

    6. Experiment with the MySQL options for buffer pool size, redo log size, and transaction commit mode (illustrative values are sketched after this list). These options might not help insert speed but may help other parts of your app.

    7. If you are suffering from IO issues try a faster disk / SSD.
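
    To make point 6 concrete, some illustrative my.cnf values only (assumptions, not recommendations; tune them to the hardware and the durability you need):

        # assumes a few GB of RAM can be dedicated to MySQL
        innodb_buffer_pool_size        = 2G
        innodb_log_file_size           = 256M
        # 2 trades a little durability for faster commits; 1 is the safe default
        innodb_flush_log_at_trx_commit = 2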

  16. Duane M

    Jul 7 Pre-Release Testers, Xojo Pro Boston, MA

    @Jean-YvesPochez I found it even better if you group around 1000 statements at a time in a single string.

    I'll try that.

    Any idea on my question regarding batching for a prepared statement? Can you batch them into a string, or are they a one-at-a-time deal?

  17. Duane M

    Jul 7 Pre-Release Testers, Xojo Pro Boston, MA

    @Kevin G Some general pointers which may help if the bottleneck is now MySQL rather than the transmission of data

    Nice suggestions! I do need to improve my database skills. I'm a regular member at the Boston MySQL Meetup and I get a lot out of that. I don't do anything on the scale they talk about. But this is what I like to learn. Thanks!

  18. Jean-Yves P

    Jul 7 Pre-Release Testers, Xojo Pro Europe (France, Besancon)

    @Duane M I'll try that.

    Any idea on my question regarding batching for a prepared statement? Can you batch them into a string, or are they a one-at-a-time deal?

    sorry, no idea - I haven't tried this with prepared statements. I haven't had to; most of the time it's data imports I'm doing, so I didn't bother with prepared statements.

  19. Jean-Yves P

    Jul 8 Pre-Release Testers, Xojo Pro Europe (France, Besancon)

    try something like this?

    Dim ps As SQLitePreparedStatement
    
    ' Prepare returns a PreparedSQLStatement, so cast it to the SQLite flavour
    ps = SQLitePreparedStatement(app.db.Prepare("INSERT INTO Team(Name,Coach,City) VALUES(?,?,?),(?,?,?),(?,?,?)"))
    For i As Integer = 0 To 8
      ps.BindType(i, SQLitePreparedStatement.SQLITE_TEXT)   ' every placeholder is bound as text
    Next
    ps.SQLExecute("Seagulls", "Mike", "Albany", "Pigeons", "Mark", "Springfield", "Crows", "Matt", "Houston")

    this is for sqlite, but it could work for mysql too?
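
    If the MySQL plugin behaves the same way, the equivalent would look something like this (an untested sketch; it assumes the multi-row VALUES form is accepted and uses MYSQL_TYPE_STRING as the text bind type):

    Dim ps As MySQLPreparedStatement
    
    ps = MySQLPreparedStatement(db.Prepare("INSERT INTO Team(Name,Coach,City) VALUES(?,?,?),(?,?,?),(?,?,?)"))
    For i As Integer = 0 To 8
      ps.BindType(i, MySQLPreparedStatement.MYSQL_TYPE_STRING)
    Next
    ps.SQLExecute("Seagulls", "Mike", "Albany", "Pigeons", "Mark", "Springfield", "Crows", "Matt", "Houston")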

  20. 2 weeks ago

    Duane M

    Oct 11 Pre-Release Testers, Xojo Pro Boston, MA
    Edited 2 weeks ago by Duane M

    @Norman P
    but not impossible, as the join could be written to do a select against mysql, the data inserted into a local in-memory db, and then exported back to the tab-delimited file

    It WOULD save moving all this data to mysql over the network just to toss it all away once the tab-delimited file is created.

    And that's what I did. I put the data into a local SQLite db and also into MySQL on the server, then did the join, downloaded the records, and updated the local SQLite database.

    Works great on a Mac. Having issues on Win 10; details here:

    https://forum.xojo.com/43530-sqlite-on-win-10
