I have a big file with 45k records. I run some calculations on it to search for duplicates and it takes too long.
Xojo is stuck at the line rs = mydb.SQLSelect("my sql select")
The only way to stop it is by force-quitting the application.
Even in debug mode, if you stop the debugger, you never get back into Xojo.
Is there a better way, such as asking the user whether to stop the query if it takes too long?
Is limiting the query and doing multiple queries in a thread out of the question? You may not be able to stop it right away but you’d be able to stop it before the next query begins.
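To make the chunking idea above concrete, here is a minimal sketch in Python with the built-in sqlite3 module (the thread's database engine): fetch LIMIT-sized batches in a loop so there is a natural point between queries where a cancel check can run. The `cancelled` flag and the table layout are hypothetical stand-ins for whatever the real application uses.

```python
import sqlite3

# Toy table standing in for the real 45k-record "articles" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id_article INTEGER PRIMARY KEY, designation TEXT)"
)
conn.executemany(
    "INSERT INTO articles (designation) VALUES (?)",
    [(f"item-{i % 10}",) for i in range(100)],
)

cancelled = False   # hypothetical flag a UI/main thread could set
batch_size = 25
offset = 0
seen = 0

while not cancelled:
    # Each iteration is one short query, so the app stays responsive
    # and can bail out before starting the next batch.
    batch = conn.execute(
        "SELECT id_article, designation FROM articles "
        "ORDER BY id_article LIMIT ? OFFSET ?",
        (batch_size, offset),
    ).fetchall()
    if not batch:
        break
    seen += len(batch)   # process the batch here
    offset += batch_size
```

In Xojo the same loop would live in a Thread, with the cancel flag set from the UI; the key point is that each individual SQLSelect stays short.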
You might look into the OFFSET/FETCH clauses that many SQL engines support. Ultimately you need to limit how many results you return at a time.
Which database engine are you using? You want to pull down a much smaller set of records and then loop through the results. Most SQL engines support offsets, so you can iteratively pull the records you need.
Post an example of the query (or a description). As Greg stated, 45K records should be processable by even the most poorly formed SQL in a matter of seconds. I have had queries with multiple joins across tables with millions of records that ran in under a minute in the worst conditions.
And a key word in what Greg stated is "properly". Just having indexes for the sake of having them is rarely a good thing.
SELECT L.id_article, L.designation FROM articles L
INNER JOIN
  (SELECT designation, count(*) FROM articles GROUP BY upper(designation) HAVING count(*) > 1) R
ON upper(L.designation) = upper(R.designation)
id_article is the primary key, and the field "designation" is indexed (as is the primary key).
The database is SQLite. It's a simple SQL request to find duplicates.
It works fine on hundreds of records, but stays stuck on 45k records.
Note: there are a lot of duplicates in that 45k-record file, maybe 30k, which may explain the problem?
[quote=287585:@Maximilian Tyrtania]Maybe this works better:
select L1.id_article, L1.designation from articles L1 join articles L2 on L1.designation = L2.designation and L1.rowid != L2.rowid
You may have to replace rowid with the primary key field for that table. If the designation field is indexed this query should be quite fast.[/quote]
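To see why the quoted self-join reply works, here is a small sketch with Python's sqlite3 (table and index names are illustrative, mirroring the thread's schema). Note that each duplicated designation is reported once per matching partner row, which explains the large duplicate counts seen later in the thread.

```python
import sqlite3

# Minimal version of the thread's schema: primary key plus an index
# on designation, so the equi-join predicate can use the index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id_article INTEGER PRIMARY KEY, designation TEXT)"
)
conn.execute("CREATE INDEX idx_designation ON articles(designation)")
conn.executemany(
    "INSERT INTO articles (designation) VALUES (?)",
    [("Hammer",), ("Hammer",), ("Saw",)],
)

# Exact-case self-join from the quoted reply: every row that shares a
# designation with a *different* row is returned.
rows = conn.execute("""
    SELECT L1.id_article, L1.designation
    FROM articles L1
    JOIN articles L2
      ON L1.designation = L2.designation
     AND L1.id_article != L2.id_article
""").fetchall()
```

The two "Hammer" rows each match the other, so the query returns two rows; "Saw" has no partner and is absent.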
This one works!
PS: I had 43k duplicates in the file … now I have some cleaning to do!
Edit: in fact, no. If I add upper(L1.designation) = upper(L2.designation), the SQLSelect gets stuck again.
So it's definitely the upper() call that slows down the process.
Going back to the idea of making an index on the upper-cased values.