Japanese end-users are having trouble with string.uppercase instructions in my code. “a…z” are not converting to “A…Z”. I suspect a Locale rule for Japanese is being applied to mixed Japanese + Western text. Am I right? Where can I find the Locale rules? (Yes. Read the documentation! But where?)
My solution so far is to write my own uppercase routine, which loops through the string as bytes and changes relevant ones.
“Returns a new Text value that has its characters uppercased. If the locale parameter is non-Nil, it will use that locale’s rules when performing the operation.”
The ICU (Unicode) library controls the characters. Does it also control the uppercase-lowercase pairing?
Following the link to the “Locale” documentation, there’s a link to the Unicode organization, and there you will find the rules: ICU Demonstration - Locale Explorer
I used a generic English sentence in a plain text file. string.uppercase works fine in my Xojo app on my English-US Windows Desktop computer. “c” changed to “C”.
An end-user in Japan installed the Xojo app on his Windows Desktop computer. He launched the app with my generic sentence. “c” stayed “c”.
I rewrote the app omitting string.uppercase and looped through the string instead, conceptually: asc(midb(…))-asc(“a”)+asc(“A”). Then “c” changed to “C”, both for me and in Japan.
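For anyone tempted by the same workaround, here is a minimal sketch of that byte loop, written in Python rather than Xojo so that it runs standalone. The function name ascii_upper_bytes is hypothetical, and Shift-JIS is used only as an assumption about what a Japanese file might contain; nothing in this thread establishes the file's actual encoding. It suggests the loop is harmless on UTF-8 bytes but can alter multi-byte characters in an encoding where the second byte of a character may fall in the “a”…“z” range.

```python
def ascii_upper_bytes(raw: bytes) -> bytes:
    """Uppercase every byte in the range 'a'..'z', ignoring the encoding."""
    out = bytearray(raw)
    for i, b in enumerate(out):
        if ord("a") <= b <= ord("z"):
            out[i] = b - ord("a") + ord("A")
    return bytes(out)


text = "abc テスト"

# UTF-8: every byte of a multi-byte character is >= 0x80, so only real
# ASCII letters are touched.
utf8_result = ascii_upper_bytes(text.encode("utf-8")).decode("utf-8")

# Shift-JIS (assumed here purely for illustration): trail bytes can fall
# in 0x61..0x7A, so the loop may rewrite part of a Japanese character.
sjis_raw = text.encode("shift_jis")
sjis_result = ascii_upper_bytes(sjis_raw)

print(utf8_result)
# True means the byte loop changed more than the Latin letters.
print(sjis_result != text.upper().encode("shift_jis"))
```

So the byte loop is only as safe as the encoding underneath it, which is worth keeping in mind for end users with other local scripts.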
Thanks kevin. Same here. I cannot replicate the problem on my computers. The Japanese end-user is in a production environment so just wants the app to work (which it does with my kluge).
Asian Xojo developers: have you experienced any problems with string.uppercase?
Great idea! But the Latin letters are in a string with local text (in this case Japanese), so I cannot assert a particular encoding. This app has end users with many different local scripts.
Yes, it would be great if the world standardized to UTF-8 (and also to one replacement 18 volt rechargeable battery for all battery operated tools and devices), but …
The text is coming from a file which is probably the local language + Latin letters + Arabic numbers. I am not specifying the encoding. If Xojo is assuming UTF-8 then surely uppercase should work correctly for the Latin words. Perhaps Xojo is guessing the encoding (from the Windows settings?), getting it correct for the local language, but wrong for the Latin letters.
I would have thought uppercase and encoding are two different issues. UTF-8 can perfectly well handle most local languages + Latin letters + Arabic numbers (especially since ASCII is included).
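That separation can be illustrated with a short sketch, in Python rather than Xojo. It relies only on Python's built-in str.upper, which applies the Unicode default case mapping with no locale involved, so it says nothing about how Xojo's Uppercase behaves internally; the sample string is made up.

```python
sample = "price: 100 yen, サイズ c"

# In UTF-8, ASCII characters keep their single-byte ASCII values.
assert "c".encode("utf-8") == "c".encode("ascii") == b"c"

# Once the bytes are decoded with the right encoding, case mapping works on
# characters, not bytes. Katakana has no upper/lower distinction, so it is
# left unchanged while the Latin letters are uppercased.
decoded = sample.encode("utf-8").decode("utf-8")
upper = decoded.upper()
print(upper)
```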
But if you can only specify one locale to .UpperCase, then I would expect that only characters belonging to that locale will be uppercased.
Xojo does not attempt to guess the encoding. If you read it from a file and do not specify an encoding, the encoding will be Nil. That probably renders Uppercase useless.
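The usual remedy is to state the encoding when the bytes are read, and only then uppercase the decoded text. A minimal sketch of that idea follows, in Python rather than Xojo; upper_with_encoding is a hypothetical helper, and Shift-JIS is assumed only as an example of what a file on a Japanese Windows machine might use.

```python
def upper_with_encoding(raw: bytes, encoding: str) -> bytes:
    """Decode raw bytes with a declared encoding, uppercase, and re-encode."""
    # May raise UnicodeDecodeError if the bytes are invalid for that encoding.
    text = raw.decode(encoding)
    return text.upper().encode(encoding)


original = "abc テスト"
sjis_bytes = original.encode("shift_jis")   # stand-in for bytes read from a file
result = upper_with_encoding(sjis_bytes, "shift_jis")
print(result.decode("shift_jis"))
```

The point of the design is that the encoding is supplied by the caller instead of being guessed, which mirrors the situation described above: without a declared encoding there is nothing for the case mapping to work from.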