Trouble with encodings

In my example I want to write a string containing one Swedish character, “ö”, to a binary file and then read it back, but it doesn’t work correctly. When I read it back, instead of “Impössible” I get “Imp??ssible”, which means that the conversion didn’t work.
In the data file the string is correct. I tried it without the encodings but the result is the same. What am I doing wrong?

In the Xojo documentation I read that I don’t need to use any conversion when I create and use the file with Xojo.

Dim F1 As FolderItem
Dim File1 As BinaryStream
Dim s As String

'Create and save data.
F1 = GetFolderItem("Customer.txt")
If F1 <> Nil Then
  File1 = BinaryStream.Create(F1, True) // Overwrite if exists
  File1.Write(ConvertEncoding("Impössible", Encodings.UTF8))
End If

'Now the string "Impössible" is in the Customer.txt file
'and the string is correct.

'Open and read data
F1 = GetFolderItem("Customer.txt")
If F1 <> Nil Then
  File1 = BinaryStream.Open(F1, False)
End If

Try Windows ANSI

When you open and read the file, try:

... S=File1.Read(File1.Length, Encodings.UTF8) ...

You need to use DefineEncoding, not ConvertEncoding:

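[code]
Dim fi As FolderItem = GetFolderItem("Customer.txt")

If fi <> Nil Then
  // Write the literal as-is; it is already UTF-8
  Dim bs As BinaryStream = BinaryStream.Create(fi, True)
  bs.Write("Impössible")
  bs.Close

  // Read the raw bytes back and tell Xojo that they are UTF-8
  bs = BinaryStream.Open(fi, False)
  Dim s As String = bs.Read(fi.Length).DefineEncoding(Encodings.UTF8)
  bs.Close
End If
[/code]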


String literals in Xojo are UTF-8, so there is no need to convert upon writing to the file.
When reading from the file, you first use DefineEncoding and then, if applicable, ConvertEncoding. For example, when reading from a file with ISOLatinGreek encoding, you would do the following to get a Xojo string encoded as UTF-8:

Dim s As String = bs.Read(fi.Length).DefineEncoding(Encodings.ISOLatinGreek).ConvertEncoding(Encodings.UTF8)
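
To see the difference between the two calls in isolation, here is a minimal sketch. It is not from the original posts: the sample byte and the variable names are assumptions chosen for illustration. It starts from a single raw byte, labels it with DefineEncoding, then re-encodes it with ConvertEncoding.

```xojo
// Hypothetical illustration: one raw byte, &hF6, which is "ö" in ISO Latin-1.
Dim raw As String = ChrB(&hF6)   // treated here as raw bytes

// DefineEncoding only attaches a label; the byte itself is untouched.
Dim latin1 As String = raw.DefineEncoding(Encodings.ISOLatin1)

// ConvertEncoding re-encodes the character, so the bytes change.
Dim utf8 As String = latin1.ConvertEncoding(Encodings.UTF8)

// LenB counts bytes rather than characters.
System.DebugLog("Latin-1 bytes: " + Str(LenB(latin1)))   // 1
System.DebugLog("UTF-8 bytes: " + Str(LenB(utf8)))       // 2
```

The byte count stays at 1 after DefineEncoding because it only labels the data. It becomes 2 after ConvertEncoding because “ö” is stored as &hC3 &hB6 in UTF-8.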

That only applies when you use TextInputStream / TextOutputStream. When using BinaryStream, you need to set the encoding yourself.

Any special reason why you use BinaryStream for I/O?

Thanks to all of you for such quick answers. Eli Ott gave me the right answer… maybe the other answers are correct too.


But I don’t understand why I have to use that method.
In the documentation under DefineEncoding it says…
“The encoding of all strings created in your application is UTF-8, so you don’t have to use DefineEncoding on them.”

I created the file in Xojo on OS X and I read it back with the same application on OS X.

You are mistaken. String literals are always UTF-8, regardless of what you do with them.

[quote=288749:@Ossian Malm]But I don’t understand why I have to use that method.
In the documentation under DefineEncoding it says…
“The encoding of all strings created in your application is UTF-8, so you don’t have to use DefineEncoding on them.”

I created the file in Xojo on OS X and I read it back with the same application on OS X.[/quote]
Once you write it out, it is no longer a “string created in your application”. It is now “data that came from someplace else”, namely, from a file on disk. All ties to the string originally created in your application have been severed. It’s a completely new string that Xojo knows nothing about.
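
A quick way to observe this is to check the encoding of the string right after reading it. This is only a sketch under stated assumptions: it assumes Customer.txt already exists and holds UTF-8 bytes, and the variable names are mine.

```xojo
// Assumes Customer.txt already exists and contains UTF-8 bytes.
Dim fi As FolderItem = GetFolderItem("Customer.txt")
If fi <> Nil Then
  If fi.Exists Then
    Dim bs As BinaryStream = BinaryStream.Open(fi, False)
    Dim s As String = bs.Read(bs.Length)   // no encoding argument given
    bs.Close

    If Encoding(s) Is Nil Then
      // Xojo has only bytes at this point, so we supply the label ourselves.
      s = s.DefineEncoding(Encodings.UTF8)
    End If
  End If
End If
```

Encoding(s) returns Nil when no encoding has been defined for a string, which is the state of data read through BinaryStream.Read without an encoding argument.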

This is not correct:

bs.Write("Impössible")   // "Impössible" is already UTF8

[quote=288749:@Ossian Malm]But I don’t understand why I have to use that method.
In the documentation under DefineEncoding it says…
“The encoding of all strings created in your application is UTF-8, so you don’t have to use DefineEncoding on them.”

I created the file in Xojo on OS X and I read it back with the same application on OS X.[/quote]
You misunderstand the docs: “strings created in your application” means strings created within the application itself, such as a literal like Dim s As String = "Impössible". The string you read back is not a string created in Xojo. It is read from a file, so you need to tell Xojo what encoding it has.

Okay. Thanks, Tim and Ott. That explains a lot for me. I believed that the encoding information in a string read from a BinaryStream was the same as in one read from a text stream.


They are the same. Meaning there is no encoding information in the file in either case. You must supply the encoding when you read the string back into your app.
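
For comparison, the same round trip with the text stream classes might look like the sketch below. It is an illustration rather than code from the thread; the variable names are assumptions, and the encoding is passed explicitly to ReadAll to show where it comes from.

```xojo
Dim fi As FolderItem = GetFolderItem("Customer.txt")
If fi <> Nil Then
  // The stream writes the string's bytes; no encoding marker is stored in the file.
  Dim writer As TextOutputStream = TextOutputStream.Create(fi)
  writer.Write("Impössible")
  writer.Close

  // On reading, the encoding is supplied by the caller, not by the file.
  Dim reader As TextInputStream = TextInputStream.Open(fi)
  Dim s As String = reader.ReadAll(Encodings.UTF8)
  reader.Close
End If
```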