utf8 string encoding

Hi!
Is there a slick way to display UTF-8 encoded strings properly in something like a ListBox?

This is the raw string in memory:

This is what it should look like (ignore the capital P):

And this is what it ends up looking like:

Thanks!

The encoding for your string is either not set (nil), or not set correctly. In either case, before you display it in the Listbox (or anywhere, for that matter), be sure to define the encoding.

s = s.DefineEncoding( Encodings.UTF8 )

It always seems to just work for me on Win:

Make sure it is UTF8 encoded.

I did that – but it wasn’t working.

Here is what it looks like in the properties:

text

hex

It’s like it’s converting it twice… since we end up with 4 bytes for 2 (wrong) chars
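The four-byte result is what you would expect if the two UTF-8 bytes of "ó" (C3 B3) are each taken as a separate character and then encoded as UTF-8 a second time. Here is a minimal sketch of that in Python (not Xojo, used only to show the bytes). It assumes Latin-1 as the wrong interpretation, since Latin-1 maps each byte to the code point with the same value:

```python
# "\u00f3" is the single character ó; UTF-8 stores it as two bytes.
correct = "\u00f3".encode("utf-8")
print(correct.hex(" "))         # c3 b3

# Misread each byte as its own character (Latin-1: byte N -> code point N).
misread = correct.decode("latin-1")

# Encoding the misread text as UTF-8 again doubles the byte count.
double_encoded = misread.encode("utf-8")
print(double_encoded.hex(" "))  # c3 83 c2 b3
```

Two wrong characters, four bytes, which matches the symptom described here.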

For more insight, the string originally looked like this:

prop%C3%B3sito

I ran it through a URLDecode where it simply converts the escaped hex to chars.

Is there an equivalent of this PHP available somewhere?

echo utf8_decode(urldecode("Ant%C3%B4nio+Carlos+Jobim"));
Output: "Antônio Carlos Jobim".

You need to first define the correct encoding and then you can convert it to another one. Where does the string come from (database, text file, socket, declare)?

Ah, that sheds light. DecodeURLComponent must be interpreting those as code points, not byte values, so it is encoding the characters &uC3 ("Ã") and &uB3 ("³"), and that’s what gets displayed.

Try this:

s = DecodeURLComponent( inputText, Encodings.UTF8 )
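For anyone who wants to see the difference outside Xojo, here is a rough Python analogue. It assumes `urllib.parse.unquote` as a stand-in for DecodeURLComponent, with its `encoding` argument playing the role of the second parameter; it is not a description of how Xojo implements the function.

```python
from urllib.parse import unquote

escaped = "prop%C3%B3sito"

# Treat the escaped bytes as UTF-8: C3 B3 becomes the single character ó.
good = unquote(escaped, encoding="utf-8")

# Treat each escaped byte as its own character: the symptom from this thread.
bad = unquote(escaped, encoding="latin-1")

print(len(good), len(bad))  # 9 10
```

The correctly decoded string has 9 characters; the misread one has 10, because the two bytes of "ó" show up as two separate characters.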

Bam. That was it. I didn’t know that function existed.

I wasted 3 hours on this! ha!

thanks a lot!