MySQL, transactions, and a lot of records

I’m trying to understand how to improve the speed at which a few thousand records can be inserted into a MySQL table using one transaction. The circumstances are that there are about 10 users; they receive an Excel file with an arbitrary number of columns, do a little preparation, and save it to a tab-delimited file. The first row of the file contains the column names and the remainder are data rows. As it is now, the data is read one row at a time and inserted using an INSERT statement. Prepared statements can’t be used because the columns vary from file to file.

The above is done using a temporary table; there is never any reason to make it permanent. Two more operations are then performed on the temporary table: first some columns are added, and then a join is performed to bring data into the new columns. A list box is populated and the user downloads the data to a tab-delimited file. There is no need to keep any of this data in the database, so temporary tables work well - when these operations are done, the table just goes away.
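
In SQL terms the temporary-table part looks roughly like this; the table and column names here are made up, since the real ones come from the spreadsheet:

[code]// create the scratch table, widen it, then pull data into the new columns with a join
db.SQLExecute("CREATE TEMPORARY TABLE import_tmp (part_no VARCHAR(50), qty INT)")
db.SQLExecute("ALTER TABLE import_tmp ADD COLUMN descr VARCHAR(255), ADD COLUMN price DECIMAL(10,2)")
db.SQLExecute("UPDATE import_tmp t JOIN parts p ON p.part_no = t.part_no " + _
  "SET t.descr = p.descr, t.price = p.price")
If db.Error Then MsgBox(db.ErrorMessage)[/code]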

I believe this comes down to how I structure the code to make this work correctly. Here is pseudocode describing where I’m at.

[code]db.SQLExecute("START TRANSACTION")

Dim line As String
Dim vLineNumber As Integer

While Not input.EOF              ' read the data file one line per pass
  line = input.ReadLine
  vLineNumber = vLineNumber + 1

  If vLineNumber = 1 Then        ' first line holds the column names
    //  CREATE THE TEMPORARY TABLE FROM THE COLUMN NAMES
  Else
    //  BUILD AND EXECUTE AN INSERT FOR THIS DATA ROW
  End If
Wend

db.Commit[/code]

In the current structure the transaction begins before the table is created, and I don’t think that’s right. I’m also reading one line at a time. Maybe I need to restructure the file reading to read the file twice: once for the header row, then close it, open it again, and bang through the inserts beginning at row 2.

Thanks for any suggestions on how to insert a table of data using a transaction.

You want to have as little data going to and from the server as possible.
The trick is to read everything you need in one SQL query, then work through the recordset locally.
It is much, much faster to read one SQL query with 1000 records in the recordset than 1000 SQL queries with 1 record each.

It is the same for the insert statements: make one big string with all the INSERTs separated by “;” instead of sending many inserts one at a time.
This way I can send 1000 records to a Postgres server that is on a lousy 1 Mb ADSL line in less than 5 seconds.
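
A sketch of what that can look like in Xojo, assuming a connected MySQL database in db, the rows of the file already read into a rows() string array, and a connection that allows several statements in one SQLExecute (all placeholder names):

[code]Dim sql As String
Dim cols() As String

// build one big statement string instead of sending each INSERT on its own
For Each row As String In rows
  cols = Split(row, Chr(9))      ' tab delimited
  // real code should escape/quote the values before concatenating them
  sql = sql + "INSERT INTO import_tmp VALUES ('" + Join(cols, "','") + "');"
Next

db.SQLExecute("START TRANSACTION")
db.SQLExecute(sql)               ' one round trip for all the rows
db.Commit[/code]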

Why reinvent what already exists? See LOAD DATA INFILE.
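
For reference, the statement looks roughly like this; the path and table name are placeholders, and LOCAL INFILE has to be enabled on both the client and the server:

[code]db.SQLExecute("LOAD DATA LOCAL INFILE '/path/to/import.txt' " + _
  "INTO TABLE import_tmp " + _
  "FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' " + _
  "IGNORE 1 LINES")   ' skip the header row[/code]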

Because he already tried this?
https://forum.xojo.com/37052-mysql-load-data-local-infile/p1#

Well, I hesitate to suggest this BUT have you tried it without a transaction?
I prefer using one because then you can be sure that everything gets inserted - or nothing does.

Having the CREATE TABLE in the transaction shouldn’t be an issue - it isn’t in most databases.

Yes, I have tried it without a transaction and it’s slow. But I think Jean-Yves has pointed me in the right direction: I’ll build all the inserts in a string variable, each terminated by a “;”, then start the transaction and do the mass insert using SQLExecute.

On Windows, you can install an ODBC driver that exposes an Excel worksheet as a table.
No inserts are required, but record access afterwards is slower.

Unfortunately not all the users are on Windows. Mostly Mac users here.

Yes, the LOAD DATA INFILE did not work out.

I would use an in-memory sqlite database instead.

That might be tough if the data used to do his joins is in MySQL.

But it’s not impossible: the join could be written to do a SELECT against MySQL, the data inserted into a local in-memory db, and then exported back to the tab-delimited file.

It WOULD save moving all this data to MySQL over the network just to toss it all away once the tab-delimited file is created.
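
Roughly what that could look like, as a sketch with an invented table layout:

[code]// build the scratch table in an in-memory SQLite database instead of MySQL
Dim mem As New SQLiteDatabase    ' no DatabaseFile assigned = in-memory database
If mem.Connect Then
  mem.SQLExecute("CREATE TABLE import_tmp (part_no TEXT, qty INTEGER)")
  mem.SQLExecute("BEGIN TRANSACTION")
  // ... insert the rows from the tab delimited file here ...
  mem.Commit
Else
  MsgBox(mem.ErrorMessage)
End If[/code]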

What storage engine are you using?
Have you tried using the memory storage engine in MySQL?
Can you provide a dump of the MySQL schema?

This did the trick. For 3000 records it’s about 25% of the time. Maybe a little less.

That would be a good idea, but the join would make for more work, as Norman points out. The join performance isn’t bad. This could be a good selling point for an updated version. I’d enjoy trying this but need to get this delivered.

[quote=339291:@Kevin Gale]What storage engine are you using?
Have you tried using the memory storage engine in MySQL?[/quote]
I’m using InnoDB on MySQL 5.7.x. I haven’t tried the memory storage engine.

For the whole db or just the table in question? The table in question is variable; it depends on the Excel spreadsheet received.

I also need to update the records in the table with data returned from the join. Right now I’m going through the returned recordset and using prepared statements. Is there some way to batch prepared statements?

[quote=339303:@Duane Mitchell]
@Jean-Yves Pochez make one big string with all the INSERTs separated by “;” instead of sending many inserts one at a time.
This did the trick. For 3000 records it’s about 25% of the time. Maybe a little less.[/quote]
I found it even better if you group around 1000 statements at a time in a single string.
If you put in too many inserts, it seems to slow down the process.
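
A sketch of that chunking, again with placeholder names; BuildInsert stands for whatever routine turns one data row into an "INSERT ...;" string:

[code]Dim sql As String
Dim count As Integer

db.SQLExecute("START TRANSACTION")
For Each row As String In rows
  sql = sql + BuildInsert(row)
  count = count + 1
  If count = 1000 Then           ' flush every 1000 statements
    db.SQLExecute(sql)
    sql = ""
    count = 0
  End If
Next
If sql <> "" Then db.SQLExecute(sql)   ' flush whatever is left
db.Commit[/code]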

[quote=339303:@Duane Mitchell]
I’m using InnoDB on MySQL 5.7.x. I haven’t tried the memory storage engine.

For the whole db or just the table in question? The table in question is variable; it depends on the Excel spreadsheet received.[/quote]

Apologies, I didn’t pick up on the fact that your table structure is variable. Here are some general pointers which may help if the bottleneck is now MySQL rather than the transmission of the data:

  1. InnoDB works better when the table has a primary key. If you don’t have one, it might be worthwhile adding one as an auto-increment INT (see the sketch after this list).

  2. InnoDB primary keys work best when values are inserted in incrementing order, as this is the order in which the data is stored on disk. If your primary key isn’t incrementing, try using one that is.

  3. Reduce secondary indexes to a minimum (if you don’t need an index for a query, get rid of it).

  4. If possible, try MySQL 5.6 as benchmarking has shown it to be more efficient than MySQL 5.7 in some scenarios.

  5. Try the MyISAM storage engine, as it can be a lot faster at inserting, at the cost of losing transactional functionality.

  6. Experiment with the MySQL options for buffer pool size, redo log size, and transaction commit mode. These options might not help insert speed but may help other parts of your app.

  7. If you are suffering from IO issues try a faster disk / SSD.
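
For point 1, a sketch of what the CREATE could look like; id is an arbitrary name and columnDefs stands for whatever column list gets built from the spreadsheet header row:

[code]// give InnoDB an auto-increment primary key alongside the spreadsheet columns
db.SQLExecute("CREATE TEMPORARY TABLE import_tmp (" + _
  "id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, " + _
  columnDefs + ") ENGINE=InnoDB")[/code]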

I’ll try that.

Any idea on my question regarding batching for a prepared statement? Can you batch them into a string, or are they a one-at-a-time deal?

Nice suggestions! I do need to improve my database skills. I’m a regular member at the Boston MySQL Meetup and I get a lot out of that. I don’t do anything on the scale they talk about. But this is what I like to learn. Thanks!

[quote=339425:@Duane Mitchell]I’ll try that.

Any idea on my question regarding batching for a prepared statement? Can you batch them into a string, or are they a one-at-a-time deal?[/quote]
Sorry, no idea - I did not try this with prepared statements. Most of the time I’m importing data, so I haven’t needed to bother with them.

Try something like this?

[code]Dim ps As SQLitePreparedStatement

ps = SQLitePreparedStatement(app.db.Prepare("INSERT INTO Team (Name, Coach, City) VALUES (?,?,?),(?,?,?),(?,?,?)"))
For i As Integer = 0 To 8
  ps.BindType(i, SQLitePreparedStatement.SQLITE_TEXT)   ' bind all nine values as text
Next
ps.SQLExecute("Seagulls", "Mike", "Albany", "Pigeons", "Mark", "Springfield", "Crows", "Matt", "Houston")[/code]

This is for SQLite, but could something like it work for MySQL?
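
A sketch of what the MySQL equivalent could look like, assuming a connected MySQLCommunityServer in db and the same made-up Team table:

[code]Dim ps As MySQLPreparedStatement
ps = MySQLPreparedStatement(db.Prepare("INSERT INTO Team (Name, Coach, City) VALUES (?,?,?),(?,?,?),(?,?,?)"))
For i As Integer = 0 To 8
  ps.BindType(i, MySQLPreparedStatement.MYSQL_TYPE_STRING)   ' bind all nine values as strings
Next
ps.SQLExecute("Seagulls", "Mike", "Albany", "Pigeons", "Mark", "Springfield", "Crows", "Matt", "Houston")[/code]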

[quote]@Norman Palardy
But it’s not impossible: the join could be written to do a SELECT against MySQL, the data inserted into a local in-memory db, and then exported back to the tab-delimited file.

It WOULD save moving all this data to MySQL over the network just to toss it all away once the tab-delimited file is created.[/quote]
And that’s what I did. I put the data into a local SQLite db and also into MySQL on the server, then did the join, downloaded the records, and updated the local SQLite database.

Works great on a Mac. Having issues on Win 10; details here:

https://forum.xojo.com/43530-sqlite-on-win-10