I have a problem with handling accented characters (or other characters such as the Scandinavian o with a slash through it: ø) in PopupMenus. They work fine throughout the rest of the app, including reading and writing TextFields, drawing text in a Graphics, writing in a TextArea, and saving to and reading from an SQLite database. But in a PopupMenu they do not work (I am filling the menu using AddRow).
The one I am having trouble with at the moment is the lowercase o with slash, which is shown as hex C3B8 (2 bytes!) in the debugger.
The first time enc is set, it is Nil.
The second time it is Nil as well (and it is here we can see the character showing up as C3B8).
The third time it has a value, with the InternetName being US-ASCII.
If I change the second line so that the .ConvertEncoding part is removed, the result (including the C3B8 representation) is identical.
What is going on and how can I get PopupMenu to behave like the rest of the app?
If the encoding is Nil then ConvertEncoding won’t know how to convert it to UTF-8.
If the data is valid UTF-8 but for some reason doesn’t have an encoding, try using DefineEncoding instead of ConvertEncoding. That should really be done when the data is read into Xojo, though, rather than when adding it to the popup menu.
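To illustrate the difference outside Xojo, here is a small Python sketch (Python is only used for illustration; the variable names are made up). It takes the two bytes from the debugger, C3 B8, and shows that they are already valid UTF-8 for ø: labelling the bytes as UTF-8 is enough, while treating them as US-ASCII cannot work.

```python
# The two bytes reported by the debugger: C3 B8.
raw = bytes.fromhex("C3B8")

# "Defining" the encoding: the bytes stay the same, they are just
# interpreted as UTF-8. This is the analogue of labelling the string.
as_utf8 = raw.decode("utf-8")

# Interpreting the same bytes as US-ASCII fails, because both bytes
# are above hex 7F and therefore outside ASCII.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(as_utf8, ascii_ok)
```

No bytes are changed in the first step; only the interpretation changes, which is the idea behind defining an encoding rather than converting.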
Thanks for the response, I’ll try to work with that.
But please note (I probably wasn’t sufficiently clear before) that these data are not read into Xojo, they are entered as text into a Textfield and then used. They go around the houses a bit, but it’s all internal.
I notice that there was a now-closed thread on this same topic back in 2019 which never had a satisfactory resolution.
I wonder if there is something weird with PopupMenus?
Keep in mind that if you just make a simple assignment anywhere in your app (such as MyProperlyEncodedString=MyProperlyEncodedString+AWeirdString), the encoding can quite easily be lost. This situation is easy to run into.
Look up ASCII on Wikipedia: it defines 7-bit characters, essentially a-z, A-Z, 0-9 and some other 1-byte characters. Accented vowels and characters such as ø are not (and never were) ASCII.
And, contrary to what some have written on the Internet, “Extended ASCII” never existed either.
ASCII goes from 0 to 127 (hex 00 to 7F). Anything from hex 80 to FF is NOT ASCII; it’s someone’s attempt to have an extended character set containing their idea of other common characters, with each character still occupying only one byte. Sensible people ignore these and only use UTF-8. ASCII forms the first 128 characters of UTF-8, and these are the only one-byte characters in UTF-8. Other characters are 2, 3, or 4 bytes long.
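As a quick check of those byte counts, here is a short Python sketch (Python is used only for illustration, and the sample characters are my own choice). It encodes one character from each length class and confirms that ø comes out as the two bytes C3 B8.

```python
# One sample character for each UTF-8 length class (1 to 4 bytes).
samples = ["o", "ø", "€", "😀"]

# Number of bytes each character occupies once encoded as UTF-8.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

# Hex representation of ø in UTF-8.
o_slash_hex = "ø".encode("utf-8").hex().upper()

print(utf8_lengths, o_slash_hex)
```

So the C3B8 seen in the debugger is not a corrupted value; it is simply what ø looks like when stored as UTF-8.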
If you read strings from anywhere and get a Nil encoding, you should use DefineEncoding to tell Xojo which encoding is in use. Only then, if you want UTF-8 and haven’t got it, can you use ConvertEncoding to change the string as desired. For example, suppose you read a string that is in WindowsLatin1, the source fails to identify it as WindowsLatin1, and you want UTF-8 output. You would do:
Var myUTF8String as string = SourceString.DefineEncoding( Encodings.WindowsLatin1 ).ConvertEncoding( Encodings.UTF8 )
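The same two steps can be sketched in Python to show what happens to the bytes (for illustration only; this assumes WindowsLatin1 corresponds to code page 1252, which Python calls "cp1252", and the variable names are hypothetical). Decoding plays the role of defining the encoding, and re-encoding plays the role of converting it.

```python
# A single byte F8: this is ø in Windows-1252 (assumed to match WindowsLatin1).
source_bytes = b"\xf8"

# Step 1, "define": tell the runtime how to interpret the bytes.
decoded = source_bytes.decode("cp1252")

# Step 2, "convert": produce the UTF-8 byte sequence for the same text.
utf8_bytes = decoded.encode("utf-8")

# What goes wrong if UTF-8 bytes are wrongly labelled as Windows-1252:
# each of the two bytes is shown as its own character.
mislabelled = utf8_bytes.decode("cp1252")

print(decoded, utf8_bytes.hex(), mislabelled)
```

The last line is the classic symptom of a wrong label: two odd characters where one accented character was expected.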
But please note (I probably wasn’t sufficiently clear before) that these data are not read into Xojo, they are entered as text into a Textfield and then used. They go around the houses a bit, but it’s all internal.
He does not read the characters from a file or internet or siri/cortana/vanessa/whoever.
I struggled for a long time to get the Swedish characters ÄÅÖ from a CSV file into a Listbox. This solved my problem.
st = split(s.DefineEncoding( Encodings.WindowsLatin1 ).ConvertEncoding( Encodings.UTF8 ), EndOfLine)
What works for me is just to have constants for the “out of ASCII” characters that I might need. I just store them in some module so they are global. I can use a name that resonates with me rather than the “official name”.
Const SWEDISH_LOWER_O As String = "ø"
Const SWEDISH_UPPER_O As String = "Ø"
Then I just construct the strings that I need for items in the IDE or elsewhere.
It is all UTF8 and I do not have to convert anything or worry about how many bytes etc.