It’s rumored that you can set the encoding after opening the database connection. I haven’t yet seen this work in practice, but @Anthony_G_Cyphers has given me a new method to try. I plan to investigate more in the coming days.
Try this after connecting:
myDatabase.SQLExecute("set names utf8 collate utf8_general_ci")
myDatabase.SQLExecute("set character set utf8")
Of course, if you’re using a different encoding you’ll want to use that instead. I would recommend using the encoding that’s actually in the database though and not clobbering it to ISOLatin1 just because that happened to work.
ccard_id does in my case, not only contain numbers and is indeed a string.
When you write text into a database, you should tell Xojo which encoding you want to use. Xojo will convert the text to the desired encoding if it isn’t already encoded that way. However, this only works if Xojo knows the encoding of the text. Therefore, when working with databases, you should test multiple times during development to ensure the encoding is always the desired one.
This is correct. You need to tell the database what encoding you’ll be using to communicate after opening a connection.
Without doing db.SQLExecute("SET NAMES 'utf8'") before retrieving data, assuming that the data is actually stored as UTF-8, characters like ® and will get mangled.
And then, of course, the encoding needs to be defined when retrieving it from the RecordSet or RowSet.
dim usageTerms as string = rs.Field("usageterms").StringValue.DefineEncoding(Encodings.UTF8)
The code you provided is exactly what my projects currently have and I have not seen that work. I plan to investigate the two lines Anthony gave me in the coming days.
This is what setting the UTF-8 after connecting is supposed to resolve, because that’s honestly a ridiculous line of code.
I agree that the ridiculous line of code shouldn’t be necessary, but that’s what it took to solve the issue for me, so that’s all I can really say.
I thought that I read somewhere about RowSets handling UTF-8 strings more gracefully, but I still use RecordSets in my applications, so this is what I know works.
The .DefineEncoding(Encodings.UTF8) was not required before 2021. It was changed and I had to update a REST API in hundreds of places. I was not pleased.
It blows my mind that they’ll change the encoding of EncodeBase64 (which is a standard that is definitively ASCII) because it affects one person, but an issue like this that affects far more people doesn’t get attention and is “by design”.
So finally, after starting with a fresh mind this morning, adding the following to my database read method solves all pound sign issues. No further encoding malarky required
(using default MSSQL Server encoding of ISO_Latin1)
Instead of having to put this after every call, I suggest an extension method. Make a module or use one that you already have. Add a global method like this:
Name: ISOLatin1
Parameters: Extends s as String
Returns: String
If s.encoding = Encodings.ISOLatin1 then
Return s
Else If s.encoding = Nil then
Return DefineEncoding(s, Encodings.ISOLatin1)
End If
Return ConvertEncoding(s, Encodings.ISOLatin1)
And regardless of the state of the string, it’ll try to do the right thing. The only thing it doesn’t handle is detecting if a nil encoding is actually ISOLatin1.