A question about encoding.

Isn’t there a standard for the names of character encodings?

What are the Xojo equivalents of these character sets?

ISO 2022 IR 13 ISO 2022 IR 6 ISO 2022 IR 87

I think this is perhaps how it should render…
??? ???=???

I was having some luck until I ran into these.

ISO-2022 has different character sets, and you may need to know which one it is: ISO/IEC 2022 - Wikipedia
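To illustrate why the variant matters, here is a small sketch in Python (chosen only for illustration, this is not Xojo code). It assumes Python's built-in `iso2022_jp` codec and shows that ISO-2022-JP text switches character sets mid-stream with escape sequences: ESC $ B selects JIS X 0208 and ESC ( B switches back to ASCII.

```python
# ISO-2022-JP is a 7-bit encoding that changes character sets with escape sequences.
ESC_JIS_X_0208 = b"\x1b$B"  # selects JIS X 0208 (two bytes per character)
ESC_ASCII = b"\x1b(B"       # switches back to ASCII

text = "日本語"
encoded = text.encode("iso2022_jp")

# The encoded bytes are wrapped in the two escape sequences.
print(encoded.startswith(ESC_JIS_X_0208))
print(encoded.endswith(ESC_ASCII))

# Decoding with the same codec restores the original string.
print(encoded.decode("iso2022_jp") == text)
```

A decoder that does not know which sets those escape sequences select cannot interpret the bytes between them, which is probably why picking the right ISO-2022 variant matters here.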

On Mac specifically you can use the “InternetName” to get an encoding object: http://documentation.xojo.com/index.php/GetInternetTextEncoding

I did that with ISO-2022-JP and got a useable TextEncoding object, but I don’t know how one might obtain it on Windows or Linux.

Wow… That’s awesome! Thanks.

Well I tried it but encoding is always nil…

I’m wondering how to take these Microsoft code pages and use them to help select an encoding:

50220 - iso-2022-jp - ISO 2022 Japanese with no halfwidth Katakana; Japanese (JIS)
50221 - csISO2022JP - ISO 2022 Japanese with halfwidth Katakana; Japanese (JIS-Allow 1 byte Kana)
50222 - iso-2022-jp - ISO 2022 Japanese JIS X 0201-1989; Japanese (JIS-Allow 1 byte Kana - SO/SI)
50225 - iso-2022-kr - ISO 2022 Korean
50227 - x-cp50227 - ISO 2022 Simplified Chinese; Chinese Simplified (ISO 2022)
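One way to use that list is a plain lookup table from code page number to internet name, then asking the runtime whether it knows that name. Below is a rough sketch in Python (not Xojo). The table is copied from the list above; `CODE_PAGE_NAMES` and `codec_for_code_page` are names I made up for the example, and `codecs.lookup` is Python's standard library call for resolving an encoding name.

```python
import codecs

# Code page -> internet name, copied from the Microsoft list quoted in this thread.
CODE_PAGE_NAMES = {
    50220: "iso-2022-jp",
    50221: "csISO2022JP",
    50222: "iso-2022-jp",
    50225: "iso-2022-kr",
    50227: "x-cp50227",
}


def codec_for_code_page(code_page):
    """Return the Python codec name for a code page, or None if it is not available."""
    name = CODE_PAGE_NAMES.get(code_page)
    if name is None:
        return None  # code page is not in the table
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None  # the runtime has no codec under that internet name


print(codec_for_code_page(50220))
print(codec_for_code_page(50227))
```

Note that 50220 and 50222 share the same internet name, so the name alone does not tell you which halfwidth Katakana handling was intended. The `None` result plays the same role as the nil encoding mentioned earlier: the name is valid, but the platform may not provide a converter for it.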

From a little googling, no idea if this is right:

ISO 2022 IR 6 - US-ASCII
ISO 2022 IR 13 - Shift_JIS - JIS X 0201 (Shift JIS) Extended
ISO 2022 IR 87 - ISO-2022-JP - JIS X 0208 (Kanji) Extended

Or could just use EUC-JP for all three and it might auto-select, not sure.


If you are reading DICOM files:

Korean “ISO 2022 IR 6\ISO 2022 IR 149” => “ISO-2022-KR”
Chinese “ISO 2022 IR 6\ISO 2022 IR 58” => “ISO-2022-CN”
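Putting the mappings from this thread together, here is a rough sketch in Python (not Xojo) of turning a backslash-separated character set value into an internet name. The table only repeats the guesses posted above, so treat it as unverified. `TERM_TO_NAME` and `internet_name` are hypothetical names, and picking the last non-ASCII term is my own assumption about how to handle multi-valued entries.

```python
# Defined term -> internet name, taken from the guesses in this thread (unverified).
TERM_TO_NAME = {
    "ISO 2022 IR 6": "US-ASCII",
    "ISO 2022 IR 13": "Shift_JIS",
    "ISO 2022 IR 87": "ISO-2022-JP",
    "ISO 2022 IR 149": "ISO-2022-KR",
    "ISO 2022 IR 58": "ISO-2022-CN",
}


def internet_name(specific_character_set):
    """Map a value such as 'ISO 2022 IR 6\\ISO 2022 IR 149' to an internet name.

    Returns None when the value is empty or contains a term missing from the table.
    """
    terms = [t.strip() for t in specific_character_set.split("\\") if t.strip()]
    names = [TERM_TO_NAME.get(t) for t in terms]
    if not names or None in names:
        return None
    # Assumption: the last non-ASCII term decides; plain ASCII is the fallback.
    non_ascii = [n for n in names if n != "US-ASCII"]
    return non_ascii[-1] if non_ascii else "US-ASCII"


print(internet_name("ISO 2022 IR 6\\ISO 2022 IR 149"))
print(internet_name("ISO 2022 IR 6\\ISO 2022 IR 58"))
```

The returned name could then be passed to something like GetInternetTextEncoding, keeping in mind that the result may still be nil on platforms that lack that encoding.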