POP3 Character Encoding?!?

I’ve converted all strings to UTF-8, which supports these glyphs, but the Subjects I read are not displaying properly in a listbox.

Here are three examples of invalid characters appearing…

— is being shown, when should be shown.

Garbled characters are being shown, when ❤ should be shown.

® is being shown, when should be shown.

So my question is, how do I get the correct characters to show?

The encoding of a mail has nothing to do with the transport (POP3 or IMAP). A subject containing non-ASCII characters is encoded in a special form (the RFC 2047 "encoded-word"), which you simply need to parse. An example:

f=?ISO-8859-1?Q?=FC?=r

where you have =? charset ? encoding ? encoded text ?=.
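As a rough illustration of parsing that form, here is a minimal sketch in Xojo. DecodeEncodedWord is a hypothetical helper name, not part of the framework. It assumes a single well-formed encoded-word, and it assumes GetInternetTextEncoding returns Nil for a charset name it does not recognise, so check that against the documentation for your Xojo version.

```xojo
// Hypothetical helper: decode one encoded-word such as
// "=?ISO-8859-1?Q?=FC?=" and return it as UTF-8.
Function DecodeEncodedWord(word As String) As String
  // Splitting on "?" gives: "=", charset, encoding, encoded text, "="
  Dim charset As String = NthField(word, "?", 2)
  Dim method As String = NthField(word, "?", 3)
  Dim payload As String = NthField(word, "?", 4)

  Dim raw As String
  // Xojo string comparison with "=" is case-insensitive,
  // so "b" and "q" are matched as well.
  If method = "B" Then
    raw = DecodeBase64(payload)
  ElseIf method = "Q" Then
    // In the Q encoding an underscore stands for a space.
    raw = DecodeQuotedPrintable(ReplaceAll(payload, "_", " "))
  Else
    Return word // unknown encoding: leave the text untouched
  End If

  // Look up the encoding named in the header (assumption: Nil if unknown).
  Dim enc As TextEncoding = GetInternetTextEncoding(charset)
  If enc Is Nil Then Return raw

  // Label the decoded bytes with their real encoding, then convert.
  Return ConvertEncoding(DefineEncoding(raw, enc), Encodings.UTF8)
End Function
```

A real subject line can contain several encoded-words mixed with plain text, so in practice this would be called once per encoded-word found in the header.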

That part has already been done, as follows:

[code]
if testSubject = "=?UTF-8?B?" then
  Subject = DecodeBase64(Subject.NthField(testSubject,2).NthField("?=",1))
end if

if testSubject = "=?utf-8?Q?" then
  Subject = DecodeQuotedPrintable(Subject.NthField(testSubject,2).NthField("?=",1))
end if
[/code]

along with KOI8-R (Russian) and all the rest…

The rest of the subjects do not declare a charset; they are plain strings containing special characters (e.g. “I am the subject❤”). When read, these strings contain the incorrect characters shown in the first post, and converting them to UTF-8 does not help: the incorrect characters remain. I could replace each invalid character sequence with the correct one manually, but there should be an easier way, such as ConvertEncoding(string, Encodings.UTF8), which does not work. I already understand how subjects and encodings work; they are just not behaving as expected. I’m wondering whether the class is ‘breaking’ the subject encodings before they are read into a string, in which case no amount of converting will fix the string (which I hope is not the case).
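For header text like this, with no declared charset, one possible approach is to test whether the raw bytes are valid UTF-8 and only fall back to a legacy encoding if they are not. This is a sketch under assumptions, not a confirmed fix: GuessHeaderText is a hypothetical name, and the Windows-1252 fallback is a guess that may not match your mail.

```xojo
// Hypothetical helper for header text that carries no charset information.
Function GuessHeaderText(raw As String) As String
  // If the bytes form valid UTF-8, just label them as UTF-8.
  // No conversion is needed because the bytes are already correct.
  If Encodings.UTF8.IsValidData(raw) Then
    Return DefineEncoding(raw, Encodings.UTF8)
  End If

  // Otherwise fall back to a guess. WindowsANSI (Windows-1252) is an
  // assumption; use whichever legacy encoding your mail usually arrives in.
  Return ConvertEncoding(DefineEncoding(raw, Encodings.WindowsANSI), Encodings.UTF8)
End Function
```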

In the case of the heart issue, the subject is encoded like this:

=?UTF-8?Q?You’ll_be_=E2=9D=A4in’_this!?=

…but when I use DecodeQuotedPrintable and convert to UTF-8, the wrong glyph is shown.

=E2=9D=A4 should decode to ❤ (U+2764), not the garbled characters I get.

so when decoded,

=?UTF-8?Q?You’ll_be_=E2=9D=A4in’_this!?=

should be

You’ll be ?in’ this!?

but instead we get…

You’ll_be_❤in’_this!?

:-/

When I view the data coming into the POP3Socket, it is not the same as the data that comes out of it, so I need to know how to fix that.

:(

You first need to use DefineEncoding to tell Xojo what the actual encoding is. Then you can convert it with ConvertEncoding to UTF8.

... = aString.DefineEncoding(Encodings.ASCII).ConvertEncoding(Encodings.UTF8)

Replace ASCII with whatever encoding the data actually uses.
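To make the difference between the two calls concrete, here is a small sketch using the heart bytes from earlier in the thread. The variable names are only for illustration, and the KOI8-R lines assume GetInternetTextEncoding accepts that charset name.

```xojo
// DecodeQuotedPrintable yields the three bytes E2 9D A4.
Dim raw As String = DecodeQuotedPrintable("=E2=9D=A4")

// The bytes are already UTF-8, so they only need to be labelled.
// DefineEncoding changes the label and leaves the bytes alone.
Dim heart As String = DefineEncoding(raw, Encodings.UTF8)

// For data in another charset, label it with that charset first and
// then convert. ConvertEncoding rewrites the bytes, so it has to know
// the source encoding to do so correctly.
Dim koi As TextEncoding = GetInternetTextEncoding("koi8-r")
Dim russianBytes As String // raw KOI8-R bytes would go here
If Not (koi Is Nil) Then
  Dim russian As String = russianBytes.DefineEncoding(koi).ConvertEncoding(Encodings.UTF8)
End If
```

Calling ConvertEncoding on a string whose encoding was never defined is the likely reason the earlier attempts appeared to do nothing.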

:) I knew it was a simple solution. Thanks, Beatrix and Eli Ott, for the help. Those 12-hour work days really make for some interesting “slap forehead” moments towards the end. Back to the office I go :) Again, thanks to you both.