Question about encodings

Hello Xojoers!

I'm getting back to coding after a not so short hiatus and some things are just not coming back to me. Getting older does not help either!

Here is my problem:

I'm attempting to develop a web app, to be used in Spanish, with a MySQL db. I wrote a couple of test tables, and when I read from them I get an error on any record that has accents in it.

The db charset is set to UTF8 and the collation to utf8_unicode_ci. If I load the data in MySQL Workbench, the words with accents show up OK. Reading the records into Xojo seems to work fine, at least when I load a String variable with the column data, but when I attempt to put this data into a ListBox I get an error. The same happens if I attempt to show the data using a MessageBox. I'm aware that this is most probably due to the encoding.

This is a small example of what I mean:

apellidos = rs.Column("apellidos").StringValue
nombres = rs.Column("nombres").StringValue
nombreCompleto = apellidos + ", " + nombres
MessageBox(nombreCompleto)

I get an exception in the MessageBox line:

“Message: The data could not be converted to text with this encoding.”

Yes, this is how genius me came to the conclusion that this is an encoding issue LOL.

Can someone point me to what I should read to get this working? Or even better, can someone tell me whether I really need to convert the encoding of everything I read from the db, or whether a different charset/collation on the db would let Xojo work with the data without converting on every single read?

Any help is much appreciated.

Hi Hector,

I believe that nombreCompleto = apellidos + ", " + nombres is mixing encodings.
Internally, Xojo uses UTF8 for all strings within quotes. In this case the comma is UTF8, while apellidos/nombres have no encoding because you didn't define one.

This should work:

apellidos = rs.Column("apellidos").StringValue
nombres = rs.Column("nombres").StringValue
apellidos = DefineEncoding(apellidos, Encodings.UTF8)
nombres = DefineEncoding(nombres, Encodings.UTF8)
nombreCompleto = apellidos + ", " + nombres
MessageBox(nombreCompleto)
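
If the exception persists, the bytes coming back are probably not UTF-8 at all, since DefineEncoding only labels the bytes and never changes them. Here is a rough sketch of a more defensive read. TextFromDB is a made-up helper name, and the fallback to Latin-1 is only a guess about where the stray bytes came from, so treat it as a starting point rather than a fix:

```xojo
// Hypothetical helper (not part of Xojo): return database text as UTF-8.
Function TextFromDB(raw As String) As String
  // If the bytes are already valid UTF-8, just label them as such.
  // DefineEncoding leaves the bytes untouched.
  If Encodings.UTF8.IsValidData(raw) Then
    Return DefineEncoding(raw, Encodings.UTF8)
  End If

  // Assumption: anything that is not valid UTF-8 was stored as Latin-1.
  // Label it as Latin-1 first, then convert the bytes to UTF-8.
  Var latin As String = DefineEncoding(raw, Encodings.ISOLatin1)
  Return ConvertEncoding(latin, Encodings.UTF8)
End Function
```

It would be used as apellidos = TextFromDB(rs.Column("apellidos").StringValue). If the fallback branch ever runs, that is a hint the stored data and the connection charset disagree.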

Regarding the use of UTF8 as the encoding in your database, it might be better to use utf8mb4 if at any time you accept user input.
MySQL's utf8 stores at most three bytes per character, so it can't hold emojis; utf8mb4 can.

Changing to utf8mb4 should have no effect on how Xojo handles the strings. You will still need DefineEncoding(stringValue, Encodings.UTF8).
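
To make the byte sizes concrete, here is a small sketch that builds one accented letter and one emoji from their code points and logs how many bytes each takes in UTF-8. The variable names are just for illustration:

```xojo
// Build the characters from their Unicode code points so the
// example does not depend on how the source file was saved.
Var enye As String = Encodings.UTF8.Chr(&hF1)    // U+00F1, ñ
Var poo As String = Encodings.UTF8.Chr(&h1F4A9)  // U+1F4A9, PILE OF POO

// Bytes is the size in bytes; Length is the number of characters.
System.DebugLog("U+00F1 bytes: " + enye.Bytes.ToString)
System.DebugLog("U+1F4A9 bytes: " + poo.Bytes.ToString)
```

The accented letter takes two bytes in UTF-8 and the emoji takes four, which is why the emoji needs utf8mb4 on the MySQL side while Spanish accents fit in either charset.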

Thank you for your response Jeremy.

I believe you are correct, this is most probably a mixture of encodings. The test data I'm using was entered using Workbench, which is probably using a different encoding.

I tested the code you kindly provided but I’m still getting the same error. I’ll need to read a bit more about this.

I don’t expect users entering anything like: U+01F4A9 PILE OF POO but who knows LOL.

You did point me in the direction I need to follow, thanks a lot for your reply!

Just in case this might be of any help to someone

The problem I was encountering was indeed caused by using two different sources for the data. I created a test method to insert random data into the MySQL database from Xojo and everything works as expected: accents and tildes are being handled correctly. I suppose that MySQL Workbench is using a different encoding and I was not handling this correctly.
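
For anyone who wants to check what their connection is actually using, something along these lines should list the character sets MySQL reports for the session. It assumes db is an already-connected MySQLCommunityServer instance (the connection setup is not shown), and it is only a diagnostic sketch:

```xojo
// Assumes db is an already-connected MySQLCommunityServer instance.
Try
  // Standard MySQL statement: lists character_set_client,
  // character_set_connection, character_set_results and so on.
  Var rs As RowSet = db.SelectSQL("SHOW VARIABLES LIKE 'character_set%'")

  While Not rs.AfterLastRow
    System.DebugLog(rs.Column("Variable_name").StringValue + " = " + _
      rs.Column("Value").StringValue)
    rs.MoveToNextRow
  Wend

  rs.Close
Catch e As DatabaseException
  System.DebugLog("Query failed: " + e.Message)
End Try
```

Running the same SHOW VARIABLES statement in MySQL Workbench and comparing the two lists should show whether the two clients really are talking to the server with different encodings.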