SQLiteDatabase.dylib hard crash on SELECT *

You can try this one, which is free: DB Browser for SQLite

Thank you, that helps :slight_smile: This app opens the DB and loads the table with no problem, allowing me to browse all rows.

14582700 rows returned in 1094ms

I don’t see any tool for probing memory within DB Browser, but the app itself with all rows of the DB loaded is taking up only 382.7 MB in Activity Monitor.

Maybe a LIMIT clause in the SELECT statement can help?
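
For example (a sketch; the table name here is hypothetical):

dim rs As RecordSet = db.SQLSelect("SELECT * FROM myTable LIMIT 1000") // returns at most 1000 rows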

I’ve made a test project and filed a Feedback report: <https://xojo.com/issue/55116>

Wait a sec guys. Aaron, you said that this crash does NOT occur if you don’t muck with the cache_size pragma, right? I still think that’s your culprit.

Actually this is what he said:

“It happens both with and without the cache pragma. Forgot to mention that I added the cache statement thinking it might solve the problem.”

[quote=427431:@Aaron Hunt]Thank you, that helps :slight_smile: This app opens the DB and loads the table with no problem, allowing me to browse all rows.

14582700 rows returned in 1094ms

I don’t see any tool for probing memory within DB Browser, but the app itself with all rows of the DB loaded is taking up only 382.7 MB in Activity Monitor.[/quote]
This would indicate that DB Browser is paging the table - not really loading all records into memory at one time. See below for my test results.

I found some time to build my own test db and app for this. I created a sqlite db consisting of one table with an integer primary key and three text fields. I filled this table with 16+ million rows of random data. Each row is about 135 bytes and the entire db is 2.2 GB. The app simply selects all the rows into a recordset. On my MB Pro it completed successfully, taking 24 seconds to retrieve all the rows.
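
For anyone wanting to reproduce it, creating and filling an equivalent table in Xojo would look roughly like this (an untested sketch — I actually used a third-party tool for the fill, and RandomText is a placeholder for whatever generates random strings):

dim f As FolderItem = SpecialFolder.Desktop.Child("test.sqlite")
dim db As new SQLiteDatabase
db.DatabaseFile = f
if not db.CreateDatabaseFile then return

// one table: integer primary key plus three text fields
db.SQLExecute("CREATE TABLE TestData (id INTEGER PRIMARY KEY, f1 TEXT, f2 TEXT, f3 TEXT)")

// wrap the inserts in a single transaction, or 16+ million inserts will take forever
db.SQLExecute("BEGIN TRANSACTION")
for i As Integer = 1 to 16000000
  db.SQLExecute("INSERT INTO TestData (f1, f2, f3) VALUES ('" _
    + RandomText(40) + "', '" + RandomText(40) + "', '" + RandomText(40) + "')")
next
db.SQLExecute("COMMIT")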

I also compiled the app for Windows and ran it in my VM and got the same result (well, took a couple of seconds longer, but it is in a VM).

So, it doesn’t look like a problem with Xojo or the Sqlite library it’s using. Must be something in your app or your database. I see that Xojo verified the problem with your test app. I wonder if you could share how you built your test db?

[quote=427603:@Jay Madren]I found some time to build my own test db and app for this. I created a sqlite db consisting of one table with an integer primary key and three text fields. I filled this table with 16+ million rows of random data. Each row is about 135 bytes and the entire db is 2.2 GB. The app simply selects all the rows into a recordset. On my MB Pro it completed successfully, taking 24 seconds to retrieve all the rows.

I also compiled the app for Windows and ran it in my VM and got the same result (well, took a couple of seconds longer, but it is in a VM).

So, it doesn’t look like a problem with Xojo or the Sqlite library it’s using. Must be something in your app or your database. I see that Xojo verified the problem with your test app. I wonder if you could share how you built your test db?[/quote]

Thanks for taking the time to test. It’s definitely a problem in the dylib, not in my code. The table structure is fine; smaller tables load without issue. If your DB works, it could provide clues for the Xojo team to fix the problem, so please add your app to Feedback case #55116 (<https://xojo.com/issue/55116>). The test app I made does nothing special, just creates a DB table and then attempts to load it. I made the project private because I’m working for a client and the DB structure isn’t my property. (Above I offered to send a DB file, but I hadn’t thought that through; I wouldn’t have been able to do it without an NDA.)

My test shows that it’s not necessarily an issue with the size of the table (either # rows or # bytes), which is why I suspect it has something to do with your particular table. I don’t think my app does anything differently from yours: just load a SELECT * into a recordset. But I used a third-party tool to fill my table with random data, so I can’t really submit my test to Xojo in a way that would be useful to them. Besides, I described the table structure above, so it’s easy for anyone else to reproduce.

I understand your need for privacy, but I don’t see any harm (to your client) in posting a generic version of the table structure, or at least a list of all the column types. I’m just looking for a way to get my test to fail like yours.

Alternatively, have you tried creating a separate test table with a different (maybe simpler) structure, like mine above, and test using that? This could help to pinpoint the condition(s) for the failure.

Silly question:
Why are you selecting all fields from all rows of a multimillion row table?
Do you intend to .movenext through all the records one by one in a recordset?
You certainly don’t want to be putting that number of rows into a listbox.

I would be surprised if you can’t make SQL do the work you intend to do.

For example, if you need a sum of money, use a SELECT SUM() WHERE …
if you want to find all surnames that start with B, use SELECT * WHERE surname LIKE 'B%'

basically: what do you want to do with 14.6 million records?

I think what I’d do at this point is see if this might have something to do with some data in some field or another. I’d first do

dim rs As RecordSet = db.SQLSelect("SELECT rowid FROM myTableName")

If that fails, I would assume it’s not the data in any particular field. If it succeeds, I’d test each field. Maybe do a handful at a time, narrowing them down on failure:

dim rs As RecordSet = db.SQLSelect("SELECT FirstName, MiddleName, LastName, StreetAddress FROM myTableName")

If that succeeds, test some other fields until failure. If instead it fails, narrow down the selection of fields:

dim rs As RecordSet = db.SQLSelect("SELECT LastName, StreetAddress FROM myTableName")

etc.
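
To automate the narrowing, a loop over candidate columns works (an untested sketch; the field names are placeholders). Note that a hard crash in the dylib won’t be trapped by db.Error, so keep track of which field the loop was on when the app dies:

dim fields() As String = Array("FirstName", "MiddleName", "LastName", "StreetAddress")
for each fieldName As String in fields
  dim rs As RecordSet = db.SQLSelect("SELECT " + fieldName + " FROM myTableName")
  if db.Error then
    System.DebugLog("Failed on field: " + fieldName + " (" + db.ErrorMessage + ")")
  elseif rs <> nil then
    System.DebugLog("OK: " + fieldName)
    rs.Close
  end if
next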

[quote=427654:@Jeff Tullin]Silly question:
Why are you selecting all fields from all rows of a multimillion row table?[/quote]

It’s a required function of the app.

[quote]Do you intend to .movenext through all the records one by one in a recordset?
You certainly dont want to be putting that number of rows into a listbox.[/quote]

Not to worry, I’m using Kem Tekinay’s DataOnDemand listbox, which manages the displayed data dynamically.

That’s confidential. If we’re able to solve this problem, the entire database will in the end be around 1.5 billion records, but the most that need to be loaded at any one time is about 185 million.

Thanks, but it’s not the data. If a query is done on individual fields, all records are successfully returned. Only * fails.

That solves the listbox problem.

But honestly, I doubt you need to select all fields from all records.
That’s not a customer-specified requirement; it’s what a programmer chooses to do.
And while it works for small tables, it doesn’t scale to 1.5 billion rows.
I’m not going to nag, but take a step back and ask ‘what do I need right now?’ rather than ‘what is the smallest piece of code that gets me all the data I will ever need?’

Actually, it is. And this is a bug in the .dylib, so blaming me won’t work :slight_smile: There are workarounds, but that’s not the point. The thing should not crash, period.

I’m still not convinced.

I’ve just done a bunch of tests and made some observations based on the project in the submitted Feedback case:

  1. The size of the result sets that crash on my machine is right around 1 GB; that is, 13107199 records at ~81 bytes per record (13107199 × 81 ≈ 1.06 GB).
  2. I did a test and inserted the same data into Max1 and Max2 as was put in Points, and the results were nearly identical.

It seems to me that sqlite may have a 1GB memory limit on returned recordsets, although I can find no documentation to that effect.

FWIW, you can get around this by doing multiple queries using the LIMIT and OFFSET clauses to return groups of records at a time, so instead of:

SELECT * FROM S67

You would use:

SELECT * FROM S67 LIMIT 10000000 OFFSET 0;

and increase the offset by the number of records you requested the previous time.

SELECT * FROM S67 LIMIT 10000000 OFFSET 10000000;
SELECT * FROM S67 LIMIT 10000000 OFFSET 20000000;
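
In Xojo, the paging loop might look something like this (an untested sketch; the row processing is a placeholder):

const kChunkSize = 10000000
dim offset As Integer = 0
do
  dim rs As RecordSet = db.SQLSelect("SELECT * FROM S67 LIMIT " _
    + Str(kChunkSize) + " OFFSET " + Str(offset))
  if rs = nil or db.Error then exit
  dim rows As Integer = 0
  while not rs.EOF
    // ... process the current row here ...
    rows = rows + 1
    rs.MoveNext
  wend
  rs.Close
  if rows < kChunkSize then exit // last chunk was partial, so we're done
  offset = offset + kChunkSize
loop

One caveat: SQLite has to walk past all the skipped rows to honor a large OFFSET, so later chunks get progressively slower; if the table has an integer primary key, paging with WHERE rowid > lastSeen LIMIT n scales better.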

My test I described earlier was much larger than that, 16 million rows at around 135 bytes each, 2.2 GB total table size. No crash.

I have since increased it to over 28 million rows with an average of 250 bytes each, or 7.2 GB table size. Still no crash. It takes 57 seconds to “Select *” and the app memory footprint increases to 7.5 GB, so it definitely is loading all rows into memory.

As I stated before, it must have something to do with the structure of his table, or maybe even the actual data.

[quote=427722:@Jay Madren]My test I described earlier was much larger than that, 16 million rows at around 135 bytes each, 2.2 GB total table size. No crash.

I have since increased it to over 28 million rows with an average of 250 bytes each, or 7.2 GB table size. Still no crash. It takes 57 seconds to “Select *” and the app memory footprint increases to 7.5 GB, so it definitely is loading all rows into memory.

As I stated before, it must have something to do with the structure of his table, or maybe even the actual data.[/quote]
But you stated that you were using tools other than Xojo for your tests. I’m trying to get to the bottom of Aaron’s problem in Xojo. Your information is helpful in that it shows that it’s possible with SQLite, but there’s obviously something different about our plugin.

@Jay Madren
In case it’s relevant… Do any of those other tools use encryption that is compatible with our plugin?