French symbols in utf8

Hello guys ,

I have a sqlite database encoded with UTF8 and it seems that whatever i do and use as encoding i still get those strange ? marks instead of the normal french special letters.

Is there any way to be able to insert those special characters into the utf8 table without having those marks ?

For example in [quote]Exrse de lsion[/quote] i get in the table [quote]Ex?r?se de l?sion[/quote]

if i put the string encoded as ISO.Latin1 than it works ok in the code but in the db it goes still with the signs, and the weird part is that i have a lot of those parts where were created before but i don`t know with what and they look ok, the previous records.

Thanks and any idea is more than welcomed .

If your app is writing the data - it will automatically use utf-8 as the encoding.
You may need to actually define the encoding as utf-8 when you read in and display the data in your listbox, or textfield etc.

Each time you connect to the database, use this code:

db.SQLExecute "SET NAMES 'utf8'"

More information in the second post here:
https://forum.xojo.com/1759-accessing-mysql-on-a-server

Well to be more exact i`m importing the data from a csv file, i noticed if i put lines.Append(tis.ReadLine(Encodings.UTF8)) in the code it reads nice, then in the db part i put [quote]rec.Column(“label”) = recArray(1).DefineEncoding(Encodings.UTF8)[/quote] i still get the same question marks and i did put tempDB.SQLExecute "SET NAMES 'utf8'" after i connect to the db as in the other examples.

If i look into the db that i repeat it is not me who created that with sqlitebrowser it says that the db is encoded with “UTF-8”, but the funny part is that the old records have the correct characters in it .

now, if i put lines.Append(tis.ReadLine(Encodings.ISOLatin1)) in the part where i read the csv the characters are ok, but no matter what i put on the database record i still get those signs.

That drives me crazy .

Is there any other way to fix this issue ? As far as i see the current db being UTF8 encoded has the correct characters in it and me i`m trying to force it to write UTF8 as well and still i get the strange ones.

Thanks

Depending on the type of your database, you might need to read the database like this:

aString = aRecordSet.Field("aValue").StringValue.ConvertEncoding(Encodings.UTF8)

And I believe the correct values are saved in your DB, it is reading the DB that has the problem. The above code will fix that.

[quote=173787:@JrmieLeroy]Depending on the type of your database, you might need to read the database like this:

aString = aRecordSet.Field("aValue").StringValue.ConvertEncoding(Encodings.UTF8)

As poster earlier, i`m using a sqlite database and my issue is not the output of the database , it is the input.

I don`t understand how that i force the encoding to doit one way it does not follow the code, and still i get those strange icons despite the fact that uff8 supports those characters as well .

As said in the earlier posts the original data inputed with other apps is showed properly in the tables, the only issue is that if xojo app is updating the tables i`m getting those wired characters.

if you are on a mac, most of the text files are encoded in “macintosh” instead of “UTF8” when you open them
so you must use : lines.Append(tis.ReadLine(Encodings.MacRoman))
instead of lines.Append(tis.ReadLine(Encodings.UTF8))
then xojo will convert to UTF8 when you save in the UTF8 dtabase.

if you read the data from a file using lines.Append(tis.ReadLine(Encodings.ISOLatin1)) then you have an ISOLatin string
it’s better to use:
lines.Append(tis.ReadLine(Encodings.ISOLatin1)).ConvertEncoding(Encodings.UTF8)
Now you have UTF8 a string and you can save to your sqlite db

When you read back from the db there is no need to convert or define the field value it’s UTF8

MySql is different since you can define UTF8 fields but the record set will deliver you a string without encoding
you have to use aString = aRecordSet.Field(“aValue”).StringValue.DefineEncoding(Encodings.UTF8) with mySQL since there is no info about the current encoding (this is when you connection starts with db.SQLExecute “SET NAMES ‘utf8’”

Thanks all, and Antonio you`ve made my day, thanks a lot, it seems that there was the issue, so it has to be read as latin and converted as utf before inserted in the db.

Thanks.

Hello again, it seems that this encoding issue drives me crazy and i get more and more headache with it , this time i get error on mysql server, all the app is done for mac.

so i have the following :

SELECT id FROM contact WHERE firstname = "Aurelin" AND lastname = "TEST" AND DATE_FORMAT(birthday, "%Y-%m-%d") = "2010-10-10"

if i put that in SequelPro i get 12 records, if i put that in xojo i get nil. What is the right way of doing this without facing same issue again ?

In the code i even state db.SQLExecute "SET NAMES 'utf8'" this before executing the query and even in the query i put firstname.DefineEncoding(Encodings.UTF8) to make sure that it puts the correct part there, but it seems that the wired part are the special characters .

i`m really going crazy here, it suppose to check if the user exist than to take the id and to create a new record for that if and if does not exist to create the new user and to add a record for that user, so in the end i end up with multiple users with same name just because the xojo cannot define that special characters.

Thanks again and hope that this is not a bug and there is a solution for that, i`ve tried all the possible known options and still no luck.

Thanks again in advance.

There is no need to set the define encode.
From what you are writing you are doing it right.
1)As soon you create the connection you call db.sqlExecute “set names ‘utf8’”
2)rs=db.sqlSelect("…") where … is your select string that’s a utf8 string (just to check this you can see it in the debugger)

So if your record set is nil you could see the db.errorMessage to understand what’s is going wring.

If it was an encoding problem then you would have 0 records but rs would be not nil. So there is some other problem.

Aurelin is written with the accent aigu on the wrong e in your query, it is Aurlien.

i know, it was just a test for the encoding to see if it takes those characters.

Well there is no error in the db.error and the funny part is that in SequelPro it shows me the records, and in xojo it comes nil so the issue is there but what exactly is the issue that is a dilemma. Even in the debugger i dont have any errors, just nil. for id, so im using same query identical in both parts.

Check the error message of the database variable:

rs = db.SQLSelect("....") If rs Is Nil Or db.Error Then MsgBox db.ErrorMessage End

Secondly, you need to escape the double quotes in you query:

rs = db.SQLSelect("SELECT id FROM contact WHERE firstname = ""Aurelin"" AND lastname = ""TEST"" AND DATE_FORMAT(birthday, ""%Y-%m-%d"") = ""2010-10-10""")

http://www.xojo.com/blog/en/2013/08/why-are-there-diamonds-in-my-user-interface.php

SequelPro probably isn’t using UTF-8, wheras Xojo is. I suspect your data isn’t actually UTF-8. It’s probably MacRoman.

If your data IS UTF-8, then you have to worry about character composition. Especially with accents, there is more than one way to encode the data in UTF-8.

Hello Tim,
What do you mean by “there is more than one way to encode the data in UTF-8.” ?

Recently i`m facing a lot of problems with this damn encoding . for the current scenario i have 2 mysql servers that have the databases UTF8 , when i make a query for example to retrieve the FirstName and the LastName instead of FirstName=“Aglaé” and LastName=“AGNÈS” i get FirstName=“AglaÈ” and LastName=“AGN»S”

If i pass the recSet(“FirstName”).StringValue in Christian GetEncodingOfString method i get FFFF which means Invalid Encoding so it means that from the mysql server database the data is coming without encoding, “just bag of bytes” to quote one post. so i need to set the encoding for it so i set it as recSet.field(“Prenom”).StringValue.DefineEncoding(Encodings.UTF8) but still useless, i get more strange characters, so what can i do in my case ? the weird thing is that on the database i get the correct data, and the correct name but it keeps duplicating that name because i have to make a select for the id of that matched field and for those strange characters i get nil so it creates a new contact.

Any idea would be more than welcomed.

So far sqlite database works perfectly, from sqlite database to mysql server i transfer the data without any problems, my issue it comes only when i have the app in server mode in which i have 2 mysql databases.

Thanks in advance

Tim wrote: [quote]If your data IS UTF-8, then you have to worry about character composition. Especially with accents, there is more than one way to encode the data in UTF-8.[/quote]

Check up “precomposed” vs. “decomposed”. I always forget which is which. Think “a with umlaut” in UTF8 either as “a`” (2 characters) or as “” (1 character). These are not identical so string comparison won’t work. You should alway normalize your strings. Meaning do everything in one form.

Send SET NAMES ‘utf8’; to the MySQL server after connecting. What you are receiving is probably encoded in Latin-1. Then you can use recSet.field(“Prenom”).StringValue.DefineEncoding(Encodings.UTF8).

The alternative is to use: recSet.field(“Prenom”).StringValue.DefineEncoding(Encodings.ISOLatin1).ConvertEncoding(Encodings.UTF8).