Trouble reading files (incorrect encoding)

I am having a weird issue where I cannot read from a text file. The text file has its own file extension (if that could be the problem).

        dim t as TextOutputStream = TextOutputStream.Create(f)
        t.Write(ConvertEncoding(text, Encodings.UTF8))
        t.Close

I wrote something like this for reading the text file:

  dim t as TextInputStream
  try
    t = TextInputStream.Open(f)
    msgbox t.ReadAll
  catch err as IOException
    // the file could not be opened or read
  end try

  if t <> nil then
    t.close
  end if

I have tried it without converting the encoding and just leaving it at the default (which I presume is UTF-8 anyway). When I open the text file in my program, it reads the file very incorrectly, BUT if I open it in NotepadPlus (a text editor) it reads fine and tells me the encoding is ‘UCS-2 LE w/o BOM’ or something like that.

I then used the text editor to convert it to UTF-8 and when I opened it in my program it worked fine.

Thanks

Try:

msgbox t.ReadAll.DefineEncoding(Encodings.UTF8)

Thanks

Nope. Sorry, having the same problem.

I think it is writing the file that goes wrong rather than reading it. As I said, if I set up the encoding externally, it reads fine.

Thanks

You can set the Encoding property of the stream instead of using the DefineEncoding function.

I do not understand your suggestion?

Thanks

http://documentation.xojo.com/index.php/TextInputStream.Encoding
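
To make the suggestion concrete, here is a minimal sketch of setting the stream's Encoding property before reading. It assumes f is a valid FolderItem pointing at the saved file. Encodings.UTF8 is only the value being discussed at this point; it has to match whatever encoding the file was actually written with.

```xojo
// Sketch only: f is assumed to be a valid FolderItem for the saved file.
Dim t As TextInputStream
Try
  t = TextInputStream.Open(f)
  // Tell the stream how the bytes in the file are encoded
  // before reading anything from it.
  t.Encoding = Encodings.UTF8
  MsgBox t.ReadAll
Catch err As IOException
  MsgBox "Could not read the file."
End Try

If t <> Nil Then
  t.Close
End If
```

If the text still looks wrong with this in place, the bytes on disk are probably not in the encoding being declared.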

The docs say that the default is UTF-8, so I want to read and write with UTF-8 encoding, and I am having trouble. I have tried setting the encoding before reading from the text file and it does not work. As I said, I think the problem is in writing the file rather than reading it.

Thanks

Posting the actual and expected text (as the user should see it on the display) together with the hex data of the file may help to find the root cause.

Example: If the text in a file is ASCII encoded and reads ABCDEF on the display, then the hex data in the file will be 41 42 43 44 45 46 (six octets). Please also see this.
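
One way to get at that hex data from inside the program is to read the raw bytes with a BinaryStream and hex-encode them. This is a rough sketch, assuming f is a valid FolderItem for the saved file; the 32-byte limit is an arbitrary choice for illustration.

```xojo
// Sketch only: f is assumed to be a valid FolderItem for the saved file.
Dim b As BinaryStream
Try
  b = BinaryStream.Open(f, False) // False = open read-only
  // Read at most the first 32 bytes as raw data, with no text decoding.
  Dim raw As String = b.Read(32)
  // Passing True to EncodeHex inserts a space between bytes.
  MsgBox EncodeHex(raw, True)
Catch err As IOException
  MsgBox "Could not open the file."
End Try

If b <> Nil Then
  b.Close
End If
```

For comparison, the text ABC is 41 42 43 in UTF-8 and 41 00 42 00 43 00 in little-endian UTF-16, so a 00 after every ASCII letter points to a 16-bit encoding.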

Where does the text file come from?

It comes from a folder in the app directory called ‘Plugins’. I have been advised not to place this folder there, but I am just debugging (so that should not be a problem) and I will change this after releasing my software, so it is not a big priority for me right now.

The text file is saved by my program and opened by my program. So I am opening a text file saved by my program.

Thanks

Maybe the problem is in saving the file; UCS-2 does exist (see "Universal Coded Character Set" on Wikipedia), so you may want to check what happens at save time that gives the file this unusual encoding. Have you been using code very different from the examples at http://documentation.xojo.com/index.php/TextOutputStream?

Nope. My code is very similar to this example ‘Writing to a new text file’:

Dim t As TextOutputStream
Dim f As FolderItem
f = GetSaveFolderItem("", "CreateExample.txt")
If f <> Nil Then
  t = TextOutputStream.Create(f)
  t.WriteLine(TextField1.Text)
  t.Close
End If

Thanks

I tried your code with a regular text file as origin. Not a single problem.

I strongly suspect the string text to be the cause of your problem.

Try

msgbox Encoding(text).internetName
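
A slightly more defensive version of the same check, as a sketch: Encoding() can return Nil when a string has no encoding attached, in which case calling .internetName directly would fail. Here text is assumed to be the String that is about to be written to the file.

```xojo
// Sketch only: text is the String that is about to be written to the file.
Dim enc As TextEncoding = Encoding(text)
If enc Is Nil Then
  // No encoding is attached to the string, so a conversion has no
  // known source encoding to start from.
  MsgBox "No encoding defined"
Else
  MsgBox enc.InternetName
End If
```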

It reports UTF-16. Thanks

Figures…

Thanks. I got it working. I set the Encoding to UTF-16 prior to opening the file. I also understand why the encoding conversion may not have worked. Would you recommend I convert the encoding, or does the encoding not really matter as long as I can save and open text files? I do not know the differences, for example in use, speed, compatibility (with Win, Mac or Linux), file size and whatever else there is to consider.

Thanks a ton
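
For anyone landing here later, the fix described above looks roughly like this. It is a sketch, assuming f is a valid FolderItem and that the file was written from a UTF-16 string without conversion.

```xojo
// Sketch only: f is assumed to be a valid FolderItem for a file
// that was written as UTF-16.
Dim t As TextInputStream
Try
  t = TextInputStream.Open(f)
  // Declare the encoding the file was actually written with.
  t.Encoding = Encodings.UTF16
  MsgBox t.ReadAll
Catch err As IOException
  MsgBox "Could not read the file."
End Try

If t <> Nil Then
  t.Close
End If
```

The main thing is that the encoding declared when reading matches the one used when writing. On file size, UTF-8 stores each ASCII character in one byte while UTF-16 uses at least two bytes per character, so mostly-ASCII text takes about half the space in UTF-8.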