BUG on TextOutputStream when using Encodings and its workaround

Ok, so we have a solution.

The online documentation now mentions that UTF-8 is the default encoding (not under TextOutputStream.Encoding, just in the examples section). It no longer recommends using ConvertEncoding before writing text to a TextOutputStream and instead advises setting the Encoding property accordingly. The offline docs in Xojo 2021 r1.1, however, still suggest using ConvertEncoding, which is bound to fail for every encoding but UTF-8.
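
For reference, a minimal sketch of the approach the online docs now describe: set the stream's Encoding property and write the string as-is. The file name is a placeholder, and how the stream converts may differ between Xojo versions, so treat this as something to verify rather than a guaranteed fix.

```xojo
// Placeholder file on the desktop; adjust to your own FolderItem.
Var f As FolderItem = SpecialFolder.Desktop.Child("umlauts.txt")

Var output As TextOutputStream = TextOutputStream.Create(f)
// Tell the stream which encoding the file should use,
// instead of calling ConvertEncoding on the string first.
output.Encoding = Encodings.WindowsANSI
output.Write("ä, ö, ü")
output.Close
```

Opening the resulting file in a hex editor is the quickest way to check which bytes were actually written.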

I am also having trouble writing an RTF file with special characters like ä, ö, ü. The quoted code does not work for me in Xojo 2020 r2.1. The file is always UTF-8 encoded, no matter whether I convert the encoding to WindowsANSI, ASCII or anything else.

The interesting part is that UTF-8 is normally capable of representing those special characters. But in this case the output is something like: ä, ö, ü.

Any ideas how to fix this?

Don’t use WindowsANSI; use UTF8?

Those characters cannot work in WindowsANSI
I am always amazed at people who use software that demands ANSI (usually business software still clinging to its COBOL or DOS beginnings), yet complain that they don’t get accented characters.
8 bits is 8 bits.

The thing that throws such software most is the presence of a BOM at the start of the file.
A file that on the surface begins with Hello World actually has a few control characters before the H and messes everything up.

UTF8 works internally in Xojo:

Var s as String = "ä, ö, ü, ?"
Var enc As TextEncoding

s = s.DefineEncoding(Encodings.UTF8)
enc = s.Encoding
system.DebugLog "UTF8 = "+s  //outputs: UTF8 = ä, ö, ü, ?

s = s.DefineEncoding(Encodings.ASCII)
enc = s.Encoding
system.DebugLog "ASCII = "+s  //outputs: ASCII = ä, ö, ü, ?

s = s.DefineEncoding(Encodings.MacRoman)
enc = s.Encoding
system.DebugLog "MacRoman = "+s  //outputs: MacRoman = ä, ö, ü, ?

s = s.DefineEncoding(Encodings.WindowsANSI)
enc = s.Encoding
system.DebugLog "WindowsANSI = "+s  //outputs: WindowsANSI = ä, ö, ü, ?

Now if I write the UTF-8 string to a file, the output is: ä, ö , ü ?
And in ASCII: ä, ö, ü ?
So the same characters as internally with WindowsANSI?!

Changing the RTF header has no effect either:
{\rtf1\ascii\ansicpg1252\cocoartf1561\cocoasubrtf600 or
{\rtf1\ansi\ansicpg1252\cocoartf1561\cocoasubrtf600 or
{\rtf1\utf-8\ansicpg1252\cocoartf1561\cocoasubrtf600

Oh, the problem might be that RTF does not support ASCII or UTF8. The documentation says:

RTF uses the ANSI, PC-8, Macintosh, or IBM PC character set to control the representation and formatting of a document
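
As far as I understand the RTF specification, a non-ASCII character in the body is normally written as an escape: either \'hh (a byte in the document's code page) or \uN followed by a fallback character. Below is a rough sketch of the second form. RTFEscape is a hypothetical helper name, it only covers code points below 32768, and it is not a substitute for a complete RTF writer.

```xojo
// Hypothetical helper: escapes RTF control characters and
// non-ASCII characters so the body stays plain 7-bit text.
Function RTFEscape(source As String) As String
  Var result As String
  For i As Integer = 0 To source.Length - 1
    Var c As String = source.Middle(i, 1)
    Var code As Integer = c.Asc
    If c = "\" Or c = "{" Or c = "}" Then
      // Characters with a special meaning in RTF get a backslash.
      result = result + "\" + c
    ElseIf code < 128 Then
      result = result + c
    ElseIf code < 32768 Then
      // \uN plus a "?" fallback for readers without Unicode support.
      result = result + "\u" + Str(code) + "?"
    Else
      // Outside the range this sketch handles.
      result = result + "?"
    End If
  Next
  Return result
End Function
```

With this, ä (code point 228) would be written as \u228? and the file itself would contain only ASCII bytes, so the encoding used for writing no longer matters.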

You should use “ConvertEncoding” when the string already has an encoding set, not “DefineEncoding”. DefineEncoding should ONLY be used when the encoding of the string is Nil.
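
A small sketch of the difference, assuming the string starts out as UTF-8 (which is how string literals in Xojo source are encoded). EncodeHex is used only to make the bytes visible.

```xojo
Var s As String = "ä"  // UTF-8 literal

// DefineEncoding only relabels the string; the bytes stay the same.
Var relabeled As String = s.DefineEncoding(Encodings.WindowsANSI)
System.DebugLog("DefineEncoding:  " + EncodeHex(relabeled, True))

// ConvertEncoding rewrites the bytes so they represent the same
// characters in the target encoding.
Var converted As String = s.ConvertEncoding(Encodings.WindowsANSI)
System.DebugLog("ConvertEncoding: " + EncodeHex(converted, True))
```

In UTF-8, ä is the two bytes C3 A4; in Windows-1252 it is the single byte E4. So the first log line should still show the two UTF-8 bytes, while the second should show one. UTF-8 bytes that are merely relabeled as Windows-1252 are what appear as two wrong characters per umlaut.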

TextField1.Value

What Xojo version are you using?
Have you read about the Encoding property here:
https://documentation.xojo.com/api/files/textinputstream.html#textinputstream-encoding

This code saves the characters as Windows-1252 (on an M1 Mac):


Var Encoded_Str As String
Encoded_Str = ConvertEncoding(TextField1.Text, Encodings.WindowsANSI)
output.Write(Encoded_Str) '<- The particular encoding (only change)

ASCII characters can have a value from 0 to 127. The characters you are having trouble with are not ASCII. They are something else.

UTF8 values 0 through 127 are the same characters you will find in ASCII.

I am having a hard time reading on screen right now and have to stop.

No need for a BOM in a UTF-8 file.

I agree.
So when they do show up…

Holymoly dear Emile, that worked! Thank you so much!

If this code really surprises you, then you should check out the section on encodings in the docs. If you already did that, do it again.

The next time you run into something strange, do the same: put one operation per line instead of two or three…

I do not know why, but it has often worked for me.

I started doing that to be able to watch what’s in the variable… :wink:

:man_facepalming:t2::rofl:

As I already told you in the other thread: unless you are implementing the WHOLE RTF standard by yourself, adding random nonsense headers to a text file doesn’t magically convert it into RTF.

As I already told you in the other thread, your problem is NOT with the encodings. I gave you the answer in your other thread but you dismissed it :man_facepalming:t2:

It is sad. I gave him the link to the exact class to use, which contains an example that does what he wants, and he did not even read it; he said it was not the answer just from the name of the class :rofl::roll_eyes:

Hi Ivan, but the code works now, and it was an encoding issue. I can tell because I only tweaked the encoding of the string and the write file – trial and error – and with the great support of Emile Schwarz. And that was the cure. So unless I changed something else without noticing, I can say, it was a matter of encoding.

That is sad. If you want to call a badly encoded text file RTF when it has some RTF headers but does not follow the standard, and will have all sorts of problems because of that, suit yourself :man_shrugging:t2: