My app parses mail: get ContentTransferEncoding and ContentType, decode the mail, apply the encoding and convert to UTF-8. This mostly works. Now I’ve got a Hebrew HTML mail where ConvertEncoding returns an empty string. Xojo 2014r2 and 2015r2, Mac OS 10.10.5.
dim f as FolderItem = GetOpenFolderItem("")
dim b as BinaryStream = BinaryStream.Open(f)
dim s as String = b.Read(b.Length)
s = DecodeQuotedPrintable(s)
dim theEncoding as TextEncoding = GetInternetTextEncoding("iso-8859-8")
s = DefineEncoding(s, theEncoding) 'string shows okay here as hebrew
s = ConvertEncoding(s, encodings.UTF8) 'string empty
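As a cross-check outside Xojo, the same three steps (undo quoted-printable, interpret the bytes as ISO-8859-8, re-encode as UTF-8) can be sketched in Python. This is only an illustration, not a Xojo fix: `decode_mail_body` is a hypothetical helper name, and the sample bytes are an assumed test string, not the actual mail. Bytes that ISO-8859-8 does not define come out as U+FFFD, which makes it easy to see whether the input contains data the charset cannot represent.

```python
import quopri


def decode_mail_body(raw: bytes, charset: str = "iso-8859-8") -> str:
    """Undo quoted-printable, then interpret the bytes in the given charset.

    Hypothetical helper for illustration; undefined bytes become U+FFFD.
    """
    decoded_bytes = quopri.decodestring(raw)
    return decoded_bytes.decode(charset, errors="replace")


# Assumed sample: the Hebrew word "shalom" as ISO-8859-8 bytes in quoted-printable.
sample = b"=F9=EC=E5=ED"
text = decode_mail_body(sample)

# Re-encode the decoded string as UTF-8, the target of the original code.
utf8_bytes = text.encode("utf-8")

# Count bytes that ISO-8859-8 could not map to a character.
undefined_count = text.count("\ufffd")
```

Running the real mail through a sketch like this would show whether the bytes themselves are valid ISO-8859-8 or whether the problem is specific to ConvertEncoding.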
The first two links in Google give you ISO-8859-8-I and ISO-8859-8. I just quickly read them (so I’m not really certain I understand it correctly), but ISO-8859-8 seems to be in visual order (characters stored the way they are displayed) and ISO-8859-8-I in logical order (characters stored in reading order, with the direction implicit).
I wonder why it should make any difference whether the Unicode code point is encoded as UTF-8, UTF-16, UTF-32 or whatever. These encodings are just different ways to represent the same Unicode character, or am I missing something?
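The point about encoding forms can be checked with a small Python sketch (Python rather than Xojo, purely for illustration; the sample word is an assumption). The same four code points produce different byte sequences in UTF-8, UTF-16 and UTF-32, and each decodes back to the identical string.

```python
# Assumed sample: the Hebrew word "shalom" written as explicit code points.
word = "\u05e9\u05dc\u05d5\u05dd"

# The code points are independent of any byte encoding.
code_points = [f"U+{ord(ch):04X}" for ch in word]

# Encode the same string in three Unicode encoding forms (big-endian, no BOM).
encoded = {name: word.encode(name) for name in ("utf-8", "utf-16-be", "utf-32-be")}

# Decoding each byte sequence with its own encoding gives the same string back.
round_trips = {name: data.decode(name) for name, data in encoded.items()}

for name, data in encoded.items():
    print(name, len(data), "bytes:", data.hex(" "))
```

So the choice among UTF-8, UTF-16 and UTF-32 only changes the bytes, not the characters. It does not say anything about why a conversion from ISO-8859-8 would fail.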
@Greg: will do some testing. The code is part of a large parsing algorithm. I get heartburn when I think about converting this to Text. Not going to happen soon. Also, the data comes in without an encoding and is given one only as the very last step.