Using SQLIte, I’m having a problem getting a SELECT statement to work when the search term includes accented characters. I have a database of models (objects, not people) where one group of items has names like Kerts_2, Kerts_3, and Kerts_4. I know they are in the database correctly because when I use the following SQL my app returns the items properly.
SELECT Name FROM Models WHERE upper(Name) LIKE 'KER%'
But when I use the following SQL, I get no values returned.
SELECT Name FROM Models WHERE upper(Name) LIKE 'KERTS%'
Even if I remove ‘upper()’ from the SQL and replace the LIKE item
SELECT Name FROM Models WHERE Name LIKE 'Kerts%'
I get no returned values.
Just for a test, I pasted the query into the SQLite Database Browser and it failed also. Is SQLite unable to handle items like this? somehow I find that hard to believe so I assume I’m missing something. But what?
BTW, I am using 2019r1.1 on Windows 10 for this project but I go the same results in 2019r3.1.
Hmmm. The SQLIte database is encoded UTF8. Using a hex editor, it looks like the database is storing the accented characters as a single byte. If I look at the SQL statement itself in hex it seems to have the accented characters as double-byte. Maybe I should convert the encoding of the SQL to ANSI before I send the query? That would work but is it the “proper” solution?
I think I’ll try setting it up as a Prepared statement and see what happens.
“SQLite only understands upper/lower case for ASCII characters by default. The LIKE operator is case sensitive by default for unicode characters that are beyond the ASCII range. For example, the expression ‘a’ LIKE ‘A’ is TRUE but ‘’ LIKE ‘’ is FALSE.”
[quote=476663:@Roman Varas]Could this be your issue?
“SQLite only understands upper/lower case for ASCII characters by default. The LIKE operator is case sensitive by default for unicode characters that are beyond the ASCII range. For example, the expression ‘a’ LIKE ‘A’ is TRUE but ‘æ’ LIKE ‘Æ’ is FALSE.”[/quote]
Not in this case. The database is definitely storing accented characters in a single byte (8 bits) so I needed to ensure that I am sending them as 8 bit characters in the SQL statement. Xojo string value had the accented i as HEX(C3 AD) where the database needed it as HEX(ED). Doing the ANSI conversion made the change.
As for characters like ‘æ’ and ‘Æ’, well, I guess I have some more testing to do.
code Case-insensitive matching of Unicode characters does not work.
The default configuration of SQLite only supports case-insensitive comparisons of ASCII characters. The reason for this is that doing full Unicode case-insensitive comparisons and case conversions requires tables and logic that would nearly double the size of the SQLite library. The SQLite developers reason that any application that needs full Unicode case support probably already has the necessary tables and functions and so SQLite should not take up space to duplicate this ability.
Instead of providing full Unicode case support by default, SQLite provides the ability to link against external Unicode comparison and conversion routines. The application can overload the built-in NOCASE collating sequence (using sqlite3_create_collation()) and the built-in like(), upper(), and lower() functions (using sqlite3_create_function()). The SQLite source code includes an “ICU” extension that does these overloads. Or, developers can write their own overloads based on their own Unicode-aware comparison routines already contained within their project.[/code]
You have to load the ICU lib as an SQLite extension. The website from Thomas Tempelman has a prebuilt dylib. It’s not very hard to load that into SQLite.