Encodings Enum with Descriptive Labels

In case anyone needs a readable descriptive name for an encoding other than what the InternetName can provide, I researched and put together what I believe to be the “common” descriptive names for the Encodings enum that Xojo supports.

Further down in this post is a TextEncoding extension method that returns either the enum name or the “longName” (descriptive label). There is also a screenshot of how a drop-down of encoding choices could look.

Maybe this information is already available in the documentation (or from another single source elsewhere on the net), but if so, I couldn’t find it.

| Encodings enum item | Code value | Descriptive label |
|---|---|---|
| ASCII | 1536 | Western (ASCII) |
| DOSArabic | 1049 | Arabic (DOS) |
| DOSBalticRim | 1030 | Baltic (DOS) |
| DOSCanadianFrench | 1048 | Canadian French (DOS) |
| DOSChineseSimplif | 1057 | Chinese Simplified (GBK, CP-936) |
| DOSChineseTrad | 1059 | Chinese Traditional (Big 5) |
| DOSCyrillic | 1043 | Cyrillic (DOS) |
| DOSGreek | 1029 | Greek (DOS) |
| DOSGreek1 | 1041 | Greek (DOS Greek 1) |
| DOSGreek2 | 1052 | Greek (DOS Greek 2) |
| DOSHebrew | 1047 | Hebrew (DOS) |
| DOSIcelandic | 1046 | Icelandic (DOS) |
| DOSJapanese | 1056 | Japanese (Windows, DOS) |
| DOSKorean | 1058 | Korean (EUC) |
| DOSLatin1 | 1040 | Western (DOS Latin 1, CP-850) |
| DOSLatin2 | 1042 | Central European (DOS Latin 2, CP-852) |
| DOSLatinUS | 1024 | Latin-US (DOS, CP-437) |
| DOSNordic | 1050 | Nordic (DOS) |
| DOSPortuguese | 1045 | Portuguese (DOS) |
| DOSRussian | 1051 | Russian (DOS) |
| DOSThai | 1053 | Thai (DOS) |
| DOSTurkish | 1044 | Turkish (DOS) |
| ISOLatin1 | 513 | Western (ISO Latin 1, ISO 8859-1) |
| ISOLatin2 | 514 | Central European (ISO Latin 2) |
| ISOLatin3 | 515 | Western (ISO Latin 3) |
| ISOLatin4 | 516 | Central European (ISO Latin 4) |
| ISOLatin5 | 521 | Turkish (ISO Latin 5) |
| ISOLatin6 | 522 | Nordic (ISO Latin 6) |
| ISOLatin7 | 525 | Baltic (ISO Latin 7) |
| ISOLatin8 | 526 | Celtic (ISO Latin 8) |
| ISOLatin9 | 527 | Western (ISO Latin 9) |
| ISOLatinArabic | 518 | Arabic (ISO 8859-6) |
| ISOLatinCyrillic | 517 | Cyrillic (ISO 8859-5) |
| ISOLatinGreek | 519 | Greek (ISO 8859-7) |
| ISOLatinHebrew | 520 | Hebrew (ISO 8859-8) |
| KOI8_R | 2562 | Cyrillic (KOI8-R) |
| MacArabic | 4 | Arabic (Mac OS) |
| MacArmenian | 24 | Armenian (Mac OS) |
| MacBengali | 13 | Bengali (Mac OS) |
| MacBurmese | 19 | Burmese (Mac OS) |
| MacCeltic | 39 | Celtic (Mac OS) |
| MacCentralEurRoman | 29 | Central European (Mac OS) |
| MacChineseSimp | 25 | Chinese Simplified (Mac OS) |
| MacChineseTrad | 2 | Chinese Traditional (Mac OS) |
| MacCroatian | 36 | Croatian (Mac OS) |
| MacCyrillic | 7 | Cyrillic (Mac OS) |
| MacDevanagari | 9 | Devanagari (Mac OS) |
| MacDingbats | 34 | Dingbats (Mac OS) |
| MacEthiopic | 28 | Ethiopic (Mac OS) |
| MacExtArabic | 31 | Arabic Extended (Mac OS) |
| MacGaelic | 40 | Gaelic (Mac OS) |
| MacGeorgian | 23 | Georgian (Mac OS) |
| MacGreek | 6 | Greek (Mac OS) |
| MacGujarati | 11 | Gujarati (Mac OS) |
| MacGurmukhi | 10 | Gurmukhi (Mac OS) |
| MacHebrew | 5 | Hebrew (Mac OS) |
| MacIcelandic | 37 | Icelandic (Mac OS) |
| MacJapanese | 1 | Japanese (Mac OS) |
| MacKannada | 16 | Kannada (Mac OS) |
| MacKhmer | 20 | Khmer (Mac OS) |
| MacKorean | 3 | Korean (Mac OS) |
| MacLaotian | 22 | Laotian (Mac OS) |
| MacMalayalam | 17 | Malayalam (Mac OS) |
| MacMongolian | 27 | Mongolian (Mac OS) |
| MacOriya | 12 | Oriya (Mac OS) |
| MacRoman | 0 | Western (Mac OS Roman) |
| MacRomanian | 38 | Romanian (Mac OS) |
| MacRomanLatin1 | 2564 | Western (Mac Mail) |
| MacSinhalese | 18 | Sinhalese (Mac OS) |
| MacSymbol | 33 | Symbol (Mac OS) |
| MacTamil | 14 | Tamil (Mac OS) |
| MacTelugu | 15 | Telugu (Mac OS) |
| MacThai | 21 | Thai (Mac OS) |
| MacTibetan | 26 | Tibetan (Mac OS) |
| MacTurkish | 35 | Turkish (Mac OS) |
| MacVietnamese | 30 | Vietnamese (Mac OS) |
| ShiftJIS | 2561 | Japanese (Shift JIS) |
| SystemDefault | 0 | (default is MacRoman on macOS) |
| UTF16 | 256 | Unicode (UTF-16) |
| UTF16BE | 268435712 | Unicode (UTF-16 Big-Endian) |
| UTF16LE | 335544576 | Unicode (UTF-16 Little-Endian) |
| UTF32 | 201326848 | Unicode (UTF-32) |
| UTF32BE | 402653440 | Unicode (UTF-32 Big-Endian) |
| UTF32LE | 469762304 | Unicode (UTF-32 Little-Endian) |
| UTF8 | 134217984 | Unicode (UTF-8) |
| WindowsANSI | 1280 | Western (Windows Latin 1, CP-1252) |
| WindowsArabic | 1286 | Arabic (Windows) |
| WindowsBalticRim | 1287 | Baltic (Windows) |
| WindowsCyrillic | 1282 | Cyrillic (Windows, CP-1251) |
| WindowsGreek | 1283 | Greek (Windows, CP-1253) |
| WindowsHebrew | 1285 | Hebrew (Windows) |
| WindowsKoreanJohab | 1296 | Korean (Windows Johab) |
| WindowsLatin1 | 1280 | Western (Windows Latin 1) |
| WindowsLatin2 | 1281 | Central European (Windows Latin 2) |
| WindowsLatin5 | 1284 | Turkish (Windows Latin 5) |
| WindowsVietnamese | 1288 | Vietnamese (Windows) |

Extension Method TextEncoding.ToString()

Public Function ToString(Extends enc As TextEncoding, Optional longName As Boolean) As String
  
  // 96 entries, including SystemDefault
  // comments on right are the .InternetName values
  Select Case enc
  Case Encodings.ASCII
    Return If(longName, "Western (ASCII)", "ASCII") // US-ASCII
    
  Case Encodings.UTF8
    Return If(longName, "Unicode (UTF-8)", "UTF8") // UTF-8
    
  Case Encodings.UTF16
    Return If(longName, "Unicode (UTF-16)", "UTF16") // UTF-16
    
  Case Encodings.UTF16BE
    Return If(longName, "Unicode (UTF-16 Big-Endian)", "UTF16BE") // UTF-16BE
    
  Case Encodings.UTF16LE
    Return If(longName, "Unicode (UTF-16 Little-Endian)", "UTF16LE") // UTF-16LE
    
  Case Encodings.UTF32
    Return If(longName, "Unicode (UTF-32)", "UTF32") // UTF-32
    
  Case Encodings.UTF32BE
    Return If(longName, "Unicode (UTF-32 Big-Endian)", "UTF32BE") // UTF-32BE
    
  Case Encodings.UTF32LE
    Return If(longName, "Unicode (UTF-32 Little-Endian)", "UTF32LE") // UTF-32LE
    
    // following is in alphabetical order
  Case Encodings.DOSArabic
    Return If(longName, "Arabic (DOS)", "DOSArabic") // cp864
    
  Case Encodings.DOSBalticRim
    Return If(longName, "Baltic (DOS)", "DOSBalticRim") // cp775
    
  Case Encodings.DOSCanadianFrench
    Return If(longName, "Canadian French (DOS)", "DOSCanadianFrench") // cp863
    
  Case Encodings.DOSChineseSimplif
    Return If(longName, "Chinese Simplified (GBK, CP-936)", "DOSChineseSimplif") // GBK
    
  Case Encodings.DOSChineseTrad
    Return If(longName, "Chinese Traditional (Big 5)", "DOSChineseTrad") // Big5
    
  Case Encodings.DOSCyrillic
    Return If(longName, "Cyrillic (DOS)", "DOSCyrillic") // cp855
    
  Case Encodings.DOSGreek
    Return If(longName, "Greek (DOS)", "DOSGreek") // cp737
    
  Case Encodings.DOSGreek1
    Return If(longName, "Greek (DOS Greek 1)", "DOSGreek1") // IBM851
    
  Case Encodings.DOSGreek2
    Return If(longName, "Greek (DOS Greek 2)", "DOSGreek2") // IBM869
    
  Case Encodings.DOSHebrew
    Return If(longName, "Hebrew (DOS)", "DOSHebrew") // DOS-862
    
  Case Encodings.DOSIcelandic
    Return If(longName, "Icelandic (DOS)", "DOSIcelandic") // cp861
    
  Case Encodings.DOSJapanese
    Return If(longName, "Japanese (Windows, DOS)", "DOSJapanese") // Shift_JIS
    
  Case Encodings.DOSKorean
    Return If(longName, "Korean (EUC)", "DOSKorean") // EUC-KR
    
  Case Encodings.DOSLatin1
    Return If(longName, "Western (DOS Latin 1, CP-850)", "DOSLatin1") // cp850
    
  Case Encodings.DOSLatin2
    Return If(longName, "Central European (DOS Latin 2, CP-852)", "DOSLatin2") // cp852
    
  Case Encodings.DOSLatinUS
    Return If(longName, "Latin-US (DOS, CP-437)", "DOSLatinUS") // cp437
    
  Case Encodings.DOSNordic
    Return If(longName, "Nordic (DOS)", "DOSNordic") // cp865
    
  Case Encodings.DOSPortuguese
    Return If(longName, "Portuguese (DOS)", "DOSPortuguese") // cp860
    
  Case Encodings.DOSRussian
    Return If(longName, "Russian (DOS)", "DOSRussian") // cp866
    
  Case Encodings.DOSThai
    Return If(longName, "Thai (DOS)", "DOSThai") // TIS-620
    
  Case Encodings.DOSTurkish
    Return If(longName, "Turkish (DOS)", "DOSTurkish") // cp857
    
  Case Encodings.ISOLatin1
    Return If(longName, "Western (ISO Latin 1, ISO 8859-1)", "ISOLatin1") // ISO-8859-1
    
  Case Encodings.ISOLatin2
    Return If(longName, "Central European (ISO Latin 2)", "ISOLatin2") // ISO-8859-2
    
  Case Encodings.ISOLatin3
    Return If(longName, "Western (ISO Latin 3)", "ISOLatin3") // ISO-8859-3
    
  Case Encodings.ISOLatin4
    Return If(longName, "Central European (ISO Latin 4)", "ISOLatin4") // ISO-8859-4
    
  Case Encodings.ISOLatin5
    Return If(longName, "Turkish (ISO Latin 5)", "ISOLatin5") // ISO-8859-9
    
  Case Encodings.ISOLatin6
    Return If(longName, "Nordic (ISO Latin 6)", "ISOLatin6") // ISO-8859-10
    
  Case Encodings.ISOLatin7
    Return If(longName, "Baltic (ISO Latin 7)", "ISOLatin7") // ISO-8859-13
    
  Case Encodings.ISOLatin8
    Return If(longName, "Celtic (ISO Latin 8)", "ISOLatin8") // ISO-8859-14
    
  Case Encodings.ISOLatin9
    Return If(longName, "Western (ISO Latin 9)", "ISOLatin9") // ISO-8859-15
    
  Case Encodings.ISOLatinArabic
    Return If(longName, "Arabic (ISO 8859-6)", "ISOLatinArabic") // ISO-8859-6-I
    
  Case Encodings.ISOLatinCyrillic
    Return If(longName, "Cyrillic (ISO 8859-5)", "ISOLatinCyrillic") // ISO-8859-5
    
  Case Encodings.ISOLatinGreek
    Return If(longName, "Greek (ISO 8859-7)", "ISOLatinGreek") // ISO-8859-7
    
  Case Encodings.ISOLatinHebrew
    Return If(longName, "Hebrew (ISO 8859-8)", "ISOLatinHebrew") // ISO-8859-8-I
    
  Case Encodings.KOI8_R
    Return If(longName, "Cyrillic (KOI8-R)", "KOI8_R") // KOI8-R
    
  Case Encodings.MacArabic
    Return If(longName, "Arabic (Mac OS)", "MacArabic") // X-MAC-ARABIC
    
  Case Encodings.MacArmenian
    Return If(longName, "Armenian (Mac OS)", "MacArmenian") // X-MAC-ARMENIAN
    
  Case Encodings.MacBengali
    Return If(longName, "Bengali (Mac OS)", "MacBengali") // X-MAC-BENGALI
    
  Case Encodings.MacBurmese
    Return If(longName, "Burmese (Mac OS)", "MacBurmese") // X-MAC-BURMESE
    
  Case Encodings.MacCeltic
    Return If(longName, "Celtic (Mac OS)", "MacCeltic") // MacCeltic
    
  Case Encodings.MacCentralEurRoman
    Return If(longName, "Central European (Mac OS)", "MacCentralEurRoman") // X-MAC-CE
    
  Case Encodings.MacChineseSimp
    Return If(longName, "Chinese Simplified (Mac OS)", "MacChineseSimp") // GB2312
    
  Case Encodings.MacChineseTrad
    Return If(longName, "Chinese Traditional (Mac OS)", "MacChineseTrad") // Big5
    
  Case Encodings.MacCroatian
    Return If(longName, "Croatian (Mac OS)", "MacCroatian") // X-MAC-CROATIAN
    
  Case Encodings.MacCyrillic
    Return If(longName, "Cyrillic (Mac OS)", "MacCyrillic") // X-MAC-CYRILLIC
    
  Case Encodings.MacDevanagari
    Return If(longName, "Devanagari (Mac OS)", "MacDevanagari") // X-MAC-DEVANAGARI
    
  Case Encodings.MacDingbats
    Return If(longName, "Dingbats (Mac OS)", "MacDingbats") // X-MAC-DINGBATS
    
  Case Encodings.MacEthiopic
    Return If(longName, "Ethiopic (Mac OS)", "MacEthiopic") // X-MAC-ETHIOPIC
    
  Case Encodings.MacExtArabic
    Return If(longName, "Arabic Extended (Mac OS)", "MacExtArabic") // X-MAC-EXTARABIC
    
  Case Encodings.MacGaelic
    Return If(longName, "Gaelic (Mac OS)", "MacGaelic") // MacGaelic
    
  Case Encodings.MacGeorgian
    Return If(longName, "Georgian (Mac OS)", "MacGeorgian") // X-MAC-GEORGIAN
    
  Case Encodings.MacGreek
    Return If(longName, "Greek (Mac OS)", "MacGreek") // X-MAC-GREEK
    
  Case Encodings.MacGujarati
    Return If(longName, "Gujarati (Mac OS)", "MacGujarati") // X-MAC-GUJARATI
    
  Case Encodings.MacGurmukhi
    Return If(longName, "Gurmukhi (Mac OS)", "MacGurmukhi") // X-MAC-GURMUKHI
    
  Case Encodings.MacHebrew
    Return If(longName, "Hebrew (Mac OS)", "MacHebrew") // X-MAC-HEBREW
    
  Case Encodings.MacIcelandic
    Return If(longName, "Icelandic (Mac OS)", "MacIcelandic") // X-MAC-ICELANDIC
    
  Case Encodings.MacJapanese
    Return If(longName, "Japanese (Mac OS)", "MacJapanese") // Shift_JIS
    
  Case Encodings.MacKannada
    Return If(longName, "Kannada (Mac OS)", "MacKannada") // X-MAC-KANNADA
    
  Case Encodings.MacKhmer
    Return If(longName, "Khmer (Mac OS)", "MacKhmer") // X-MAC-KHMER
    
  Case Encodings.MacKorean
    Return If(longName, "Korean (Mac OS)", "MacKorean") // X-MAC-KR
    
  Case Encodings.MacLaotian
    Return If(longName, "Laotian (Mac OS)", "MacLaotian") // X-MAC-LAOTIAN
    
  Case Encodings.MacMalayalam
    Return If(longName, "Malayalam (Mac OS)", "MacMalayalam") // X-MAC-MALAYALAM
    
  Case Encodings.MacMongolian
    Return If(longName, "Mongolian (Mac OS)", "MacMongolian") // X-MAC-MONGOLIAN
    
  Case Encodings.MacOriya
    Return If(longName, "Oriya (Mac OS)", "MacOriya") // X-MAC-ORIYA
    
  Case Encodings.MacRoman
    Return If(longName, "Western (Mac OS Roman)", "MacRoman") // macintosh
    
  Case Encodings.MacRomanian
    Return If(longName, "Romanian (Mac OS)", "MacRomanian") // X-MAC-ROMANIAN
    
  Case Encodings.MacRomanLatin1
    Return If(longName, "Western (Mac Mail)", "MacRomanLatin1") // ISO-8859-1
    
  Case Encodings.MacSinhalese
    Return If(longName, "Sinhalese (Mac OS)", "MacSinhalese") // X-MAC-SINHALESE
    
  Case Encodings.MacSymbol
    Return If(longName, "Symbol (Mac OS)", "MacSymbol") // Adobe-Symbol-Encoding
    
  Case Encodings.MacTamil
    Return If(longName, "Tamil (Mac OS)", "MacTamil") // X-MAC-TAMIL
    
  Case Encodings.MacTelugu
    Return If(longName, "Telugu (Mac OS)", "MacTelugu") // X-MAC-TELUGU
    
  Case Encodings.MacThai
    Return If(longName, "Thai (Mac OS)", "MacThai") // TIS-620
    
  Case Encodings.MacTibetan
    Return If(longName, "Tibetan (Mac OS)", "MacTibetan") // X-MAC-TIBETAN
    
  Case Encodings.MacTurkish
    Return If(longName, "Turkish (Mac OS)", "MacTurkish") // X-MAC-TURKISH
    
  Case Encodings.MacVietnamese
    Return If(longName, "Vietnamese (Mac OS)", "MacVietnamese") // X-MAC-VIETNAMESE
    
  Case Encodings.ShiftJIS
    Return If(longName, "Japanese (Shift JIS)", "ShiftJIS") // Shift_JIS
    
  Case Encodings.SystemDefault
    Return If(longName, "SystemDefault", "SystemDefault") // macintosh (on macOS, don't know for Windows)
    
  Case Encodings.WindowsANSI
    Return If(longName, "Western (Windows Latin 1, CP-1252)", "WindowsANSI") // windows-1252
    
  Case Encodings.WindowsArabic
    Return If(longName, "Arabic (Windows)", "WindowsArabic") // windows-1256
    
  Case Encodings.WindowsBalticRim
    Return If(longName, "Baltic (Windows)", "WindowsBalticRim") // windows-1257
    
  Case Encodings.WindowsCyrillic
    Return If(longName, "Cyrillic (Windows, CP-1251)", "WindowsCyrillic") // windows-1251
    
  Case Encodings.WindowsGreek
    Return If(longName, "Greek (Windows, CP-1253)", "WindowsGreek") // windows-1253
    
  Case Encodings.WindowsHebrew
    Return If(longName, "Hebrew (Windows)", "WindowsHebrew") // windows-1255
    
  Case Encodings.WindowsKoreanJohab
    Return If(longName, "Korean (Windows Johab)", "WindowsKoreanJohab") // Johab
    
  Case Encodings.WindowsLatin1
    // this entry duplicates WindowsANSI (same code)
    Return If(longName, "Western (Windows Latin 1)", "WindowsLatin1") // windows-1252
    
  Case Encodings.WindowsLatin2
    Return If(longName, "Central European (Windows Latin 2)", "WindowsLatin2") // windows-1250
    
  Case Encodings.WindowsLatin5
    Return If(longName, "Turkish (Windows Latin 5)", "WindowsLatin5") // windows-1254
    
  Case Encodings.WindowsVietnamese
    Return If(longName, "Vietnamese (Windows)", "WindowsVietnamese") // windows-1258
    
  Case Else
    Return "Not Found - " + enc.InternetName
    
  End Select
  
End Function
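
For anyone wanting to try it, here is a minimal usage sketch. It assumes the ToString method above has been added to a Module (extension methods need to live in one) and that the window has a DesktopPopupMenu named EncodingPopup; that control name is made up for illustration.

```xojo
// Assumes ToString(Extends enc As TextEncoding, ...) above is in a Module
Var enc As TextEncoding = Encodings.UTF8

// Short form: the Encodings member name (see the UTF8 case above)
Var shortName As String = enc.ToString(False)

// Long form: the descriptive label
Var label As String = enc.ToString(True)

// EncodingPopup is a hypothetical DesktopPopupMenu on the current window
EncodingPopup.AddRow(label)
```

Because longName is Optional, leaving it out should behave the same as passing False.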

Note: WindowsLatin1 & WindowsANSI have the same code (1280), and SystemDefault returns 0 (MacRoman) on macOS. I’m not sure what it returns on Windows or Linux.

OK - your goal is valid and your intent true, but here is a much more compact and maintainable way to accomplish it. :slight_smile: I’ve taken your data and turned it into a module with two extension methods that deliver the same functionality as your code. Take a look and let me know what you think.

TextEncodingExtensions.xojo_binary_code.zip (3.1 KB)

Thank you @Eric_Williams. A Dictionary lookup will ultimately be faster than my Select Case logic. I did consider it, but for my purposes I went with storing the data in a Module property (an array of Pairs). That keeps my list in a specific order and lets me include MenuItem separators, so I can populate a list of DesktopMenuItems dynamically, as well as a couple of DesktopPopupMenus (as in the screenshot).

But, I thought I should do a performance test with your Dictionary approach, and got some unexpected results.

Please see my quick and dirty project and tell me what I’m doing wrong. Why are most entries in the Dictionary returning “Not Found”?
Encoding Name Test.xojo_binary_project.zip (10.0 KB)

I see that WindowsLatin1 is missing from your lists. Was that intentional?

The documentation says Encodings is a Module, so I was probably mistaken in assuming it is a proper Enum. In that case, I wonder if the Dictionary would work better indexed by the Code property instead?

E.g., Encodings.ASCII.Code

Just a thought. Thank you for your feedback.

Edited to add screen-shot of Globals at runtime.

Interesting! I’ve fixed the project and attached it. The approach is a little clunky but still more efficient.

It seems that the Encodings module does not return the same object every time you ask for an encoding. So in this micro-example:

dim t1 as TextEncoding = Encodings.UTF32
dim t2 as TextEncoding = Encodings.UTF32

t1 is not guaranteed to be the same object as t2. In my first algorithm, the dictionary was indexed by object, so lookups failed whenever the two objects were different.
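
A rough sketch of that failure mode, assuming (as observed above) that each access to Encodings.UTF32 can hand back a different TextEncoding instance. The variable names are made up for illustration.

```xojo
// Keyed by object: the key is this particular TextEncoding instance
Var byObject As New Dictionary
byObject.Value(Encodings.UTF32) = "UTF32"

// A second access may yield a different object, so this can come back False
Var foundByObject As Boolean = byObject.HasKey(Encodings.UTF32)

// Keyed by a plain value: object identity no longer matters
Var byName As New Dictionary
byName.Value(Encodings.UTF32.InternetName) = "UTF32"
Var foundByName As Boolean = byName.HasKey(Encodings.UTF32.InternetName)
```

Keying on a String or Integer property sidesteps the question of whether two TextEncoding references point at the same object.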

Encoding Name Test.xojo_binary_project.zip (10.1 KB)

Thanks again for your feedback, Eric.

I guess I approached the logic from a different point of view when I first put together my Select Case lookup. I started with a similar bit of code found in @Kem_Tekinay’s M_String project and just expanded it to include the more formal descriptive names.

My apologies if my original wrong assumption that the Encodings module is an enum threw you off. With that said, I’m wondering whether the TextEncoding.Code property is not still a better choice for the Dictionary index than InternetName, given that the InternetNames are not completely unique. There are 6 names that are each assigned to 2 or more Encoding entries.
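
To illustrate the collision, here is a small sketch. It takes the InternetName values from the comments in my Select Case code (Shift_JIS for all three entries below, as seen on macOS); other platforms or Xojo versions may report different names.

```xojo
// Three Encodings members that reported the same InternetName in my testing
Var byName As New Dictionary
byName.Value(Encodings.DOSJapanese.InternetName) = "DOSJapanese"
byName.Value(Encodings.MacJapanese.InternetName) = "MacJapanese"
byName.Value(Encodings.ShiftJIS.InternetName) = "ShiftJIS"

// If all three names are identical, each assignment overwrites the previous
// one and only a single key remains
Var keyCount As Integer = byName.KeyCount
```

With a shared key, whichever entry is added last wins, which is why some lookups come back with an unexpected name.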

Anyway, I updated your original revision to use the Code property, and the results are now as accurate as with my Select Case approach.

// sample
encodingToNameLookup.Value(Encodings.ASCII.Code) = "ASCII"
encodingToNameLookup.Value(Encodings.UTF8.Code) = "UTF8"
encodingToNameLookup.Value(Encodings.UTF16.Code) = "UTF16"
encodingToNameLookup.Value(Encodings.UTF32.Code) = "UTF32"
encodingToNameLookup.Value(Encodings.DOSArabic.Code) = "DOSArabic"
encodingToNameLookup.Value(Encodings.DOSBalticRim.Code) = "DOSBalticRim"
// ...
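
The matching lookup could then be sketched as follows. The method name EnumName is hypothetical, and encodingToNameLookup is assumed to be a Module-level Dictionary filled as in the sample above.

```xojo
// Hypothetical helper; encodingToNameLookup is the Dictionary filled above
Public Function EnumName(Extends enc As TextEncoding) As String
  If enc Is Nil Then Return ""
  
  // Lookup returns the supplied default instead of raising an exception
  // when the key is missing
  Return encodingToNameLookup.Lookup(enc.Code, "Not Found - " + enc.InternetName)
End Function
```

Using Lookup with a default mirrors the Case Else branch of the Select Case version.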

As well, one of the reasons I’m using the Code property (Integer) in my internal array of Pairs is because in at least one instance I need to store the Encoding choice in a database.
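
A minimal sketch of that round trip, assuming the stored value is simply the Integer from TextEncoding.Code and that Encodings.GetFromCode is used to get the encoding back. The database part is left out, and the UTF-8 fallback is just a choice made for this sketch.

```xojo
// Save: Code is an Integer, so it fits an integer column
Var savedCode As Integer = Encodings.ISOLatin1.Code

// Restore: guard against Nil in case the code is not recognized
Var restored As TextEncoding = Encodings.GetFromCode(savedCode)
If restored Is Nil Then
  restored = Encodings.UTF8 // fallback chosen for this sketch
End If
```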

Here are some results showing the speed differences between the Dictionary approaches and Select Case:

| Version | Time to look up and load ListBox | Notes |
|---|---|---|
| Original Module Lookup | 0.9574167 ms | Most lookups return “Not Found” |
| Revised (array + Dictionary) using .InternetName | 1.291292 ms | Results affected by duplicate index entries (InternetName) |
| Using .Code as index | 1.030708 ms | Accurate, with one exception (WindowsANSI vs WindowsLatin1) |
| Using Select Case | 2.259292 ms | Accurate, with one exception (WindowsANSI vs WindowsLatin1) |
| Running both Dictionary & Select Case together | 2.8465 ms | |

Encoding Name Test v3.xojo_binary_project.zip (10.5 KB)

Also, as mentioned in my OP, WindowsANSI & WindowsLatin1 have the same code value (1280). Does anyone think this could be a bug?

Another question is, when the Xojo framework does its magic to assign/convert encodings - does it do it based on the InternetName or the Code value? Or by some other logic?

Thanks again.

I considered using TextEncoding.Code but it is (apparently) Mac-specific, and those values look very, very old – probably dating back to Mac OS 8 circa 1997ish – so while they probably aren’t going to change, they also might not be up to date.

If TextEncodings don’t give us access to a reliable unique identifier, that should definitely be a bug report or a feature request. But I’m pretty sure InternetName is unique enough despite your findings. My hunch is that (as noted below) some of these encodings are the same encoding, just renamed.

Incidentally - the ability to easily change the way the data is indexed is one advantage of the code structure I put together versus the Select Case structure. :slight_smile:

Each String has an internal attribute that indicates what the data’s encoding is. The actual implementation of this is not something you can observe or rely on, so you shouldn’t bother - it may even differ between Xojo versions, the operating system, etc.

Generally speaking, strings are assigned a text encoding when the string object is created, based on a contextual understanding of where the data is coming from. UI elements such as text fields will always provide you with a string that has the correct encoding applied. So encoding is something that you, the developer, only have to deal with when you take data from a source that does not indicate an encoding (a BinaryStream, for example). You do this via String.DefineEncoding.
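
As a hedged sketch of that case: the code below reads raw bytes through a BinaryStream and then tells Xojo how to interpret them. It assumes the file really is UTF-8, and the variable names are made up for illustration.

```xojo
Var f As FolderItem = FolderItem.ShowOpenFileDialog("")
If f <> Nil Then
  Try
    Var stream As BinaryStream = BinaryStream.Open(f, False)
    // Read without an encoding argument: the bytes carry no encoding
    Var raw As String = stream.Read(stream.Length)
    stream.Close
    
    // DefineEncoding only labels the existing bytes; it does not convert them
    Var content As String = raw.DefineEncoding(Encodings.UTF8)
  Catch e As IOException
    // The file could not be opened or read
  End Try
End If
```

DefineEncoding is the right call here because the bytes are already in the target encoding and only the label is missing; ConvertEncoding is for changing the bytes of a string whose encoding is already known.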

However, if you are accessing data from a source that already provides an encoding, you should trust that encoding and not disturb it. Most strings provided by the Xojo framework will be in UTF8, for example.

The Dictionary is optimized for this sort of lookup so it will almost always win against If…Then or Select Case. TextEncoding.Code is faster than TextEncoding.InternetName because the dictionary can use the integer value of Code directly as an index instead of having to hash the string value of InternetName.

I doubt it’s a bug. They are very likely the same encoding presented twice for people who are accustomed to each name.

Ah, you could be right. When I was researching the names, I often ended up in Apple’s documentation. I forgot about that.

Luckily for me, I only wish to develop for macOS.

I know. Good or bad, it was a conscious choice on my part to go with the Select Case in this particular use case. For me, I knew it would be a minor trade-off in performance vs. maintainability.

I started the post in hopes that someone could make use of the results more than the implementation. But thanks to your contributions Eric, folks now have that too, so thank you :nerd_face:

I wondered that, but when researching online, there are some contradicting answers.

Regardless, I have set my app to use UTF-8 as the default, of course. But I also want to offer alternate encodings. In the end, I may trim the list down once I find what works the best with Xojo and the Scintilla edit control and Binary/Text Stream functions. Time will tell.

Thanks again for your insight, Eric.

Well, it seems to me that Select Case is both slower AND more trouble to maintain… :wink: