BinaryStream Data Corruption

Denise_Adams · February 12, 2018, 1:58pm

Hi. I have a Windows user in Korea who is having trouble opening a binary project file my app creates. I use the code below to create and open the file. It works fine for everybody else, but for some reason the data seems to be corrupted for him.

I have no trouble opening the data he saves on my Windows machine. I have debugged this various ways, with him testing, to see how it is getting corrupted, but I have no idea.

Could it be a LittleEndian / BigEndian issue?

// Save
FileData = ConvertEncoding(FileData, Encodings.UTF8)

binary = BinaryStream.Create(Document, True)

If binary <> Nil Then
  binary.WriteInt32 LenB(FileData)
  binary.Write FileData
  binary.Close
End If

// Open
binary = BinaryStream.Open(Document, False)

If binary <> Nil Then

  Do Until binary.EOF
    i = binary.ReadInt32
    FileData = FileData + binary.Read(i)
  Loop

  binary.Close

  FileData = DefineEncoding(FileData, Encodings.UTF8)

End If
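
For readers following along outside Xojo, here is a minimal Python sketch of the same length-prefixed layout: a 4-byte length followed by that many bytes of UTF-8. It is not the poster's code. The helper names `save` and `load`, the in-memory buffer, and the big-endian byte order are all assumptions made for illustration. The point it shows is that the prefix has to be the byte count of the encoded data, not the character count.

```python
import io
import struct


def save(stream, text):
    # Encode first, then write the BYTE count of the encoded data as the prefix.
    data = text.encode("utf-8")
    stream.write(struct.pack(">i", len(data)))  # big-endian Int32 (assumed byte order)
    stream.write(data)


def load(stream):
    chunks = []
    while True:
        header = stream.read(4)
        if len(header) < 4:  # end of stream, or a truncated header
            break
        (size,) = struct.unpack(">i", header)
        chunks.append(stream.read(size))
    # Label the bytes as UTF-8 only after all chunks are joined.
    return b"".join(chunks).decode("utf-8")


buffer = io.BytesIO()
save(buffer, "프로젝트 data")  # sample text, chosen only as an example
buffer.seek(0)
restored = load(buffer)
```

If the writer and the reader agree on the byte order and on the encoding, `restored` equals the original text.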

Björn_Eiríksson · February 12, 2018, 2:03pm

Are you sure this here was valid?

FileData = ConvertEncoding(FileData, Encodings.UTF8)

As in, did FileData have an encoding at all before converting? Or was it binary data or text without an encoding? (In either case, that would be the cause right there.)

And no, it could not be a big vs little endian issue, since that happens between Intel and PPC CPUs, and PPC CPUs don't exist any more.

Denise_Adams · February 12, 2018, 2:19pm

Hi Bjorn. The FileData is UTF8 and says so in the Xojo Debugger before the ConvertEncoding call but I just added it before the file save just to be sure. Is it possible that a string could somehow accidentally have mixed encodings, perhaps mostly UTF8 and an unknown nil encoding string part of it but because the majority of it is UTF8 Xojo thinks it’s UTF8? If so, then the ConvertEncoding wouldn’t correctly work or would force the wrong encoding and cause the data corruption issue.

Jeff_Tullin · February 12, 2018, 2:41pm

Pretty sure Xojo always suggests UTF-8 for internal string variables.
In Korea, if the user created the text, it could easily be in UTF-16.

As far as I can tell, UTF-8 requires 3 bytes per Asian character, and UTF-16 only requires 2 bytes per Asian character.
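
The byte counts are easy to check. A small Python sketch, using a three-syllable Hangul sample picked only for illustration (the variable names are not from the thread):

```python
sample = "한국어"  # three Hangul syllables, chosen only as an example

utf8_bytes = sample.encode("utf-8")
utf16_bytes = sample.encode("utf-16-le")  # explicit byte order, so no BOM is prepended

# Character count, UTF-8 byte count, UTF-16 byte count.
print(len(sample), len(utf8_bytes), len(utf16_bytes))  # prints: 3 9 6
```

So the same text has a different byte length in each encoding, which matters whenever a length prefix is written for one encoding and the data is read back as another.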

DerkJ · February 12, 2018, 5:23pm

Did you set BinaryStream.LittleEndian to the right value (True/False)?

Try LittleEndian = True
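
To show what a byte-order mismatch would look like in general, here is a short Python sketch. It says nothing about what Xojo's default is. The value 1000 and the variable names are assumptions for illustration only.

```python
import struct

length = 1000  # hypothetical length prefix

big = struct.pack(">i", length)     # bytes 00 00 03 E8
little = struct.pack("<i", length)  # bytes E8 03 00 00

# Reading the big-endian bytes as if they were little-endian gives a wrong value.
misread = struct.unpack("<i", big)[0]
print(big.hex(), little.hex(), misread)
```

A mismatched length prefix like `misread` would make the reader request the wrong number of bytes, so this kind of error tends to break the whole file rather than a few characters.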

Greg_O_Lone · February 13, 2018, 11:20am

Don't do that without checking if it's already UTF-8. You could actually cause issues for yourself. Wrap that code like this:

If FileData.Encoding <> Encodings.UTF8 Then
  FileData = ConvertEncoding(FileData, Encodings.UTF8)
End If

TomE · February 13, 2018, 12:09pm

[quote=373331:@Greg O’Lone]Don't do that without checking if it's already UTF-8. You could actually cause issues for yourself. Wrap that code like this:

If FileData.Encoding <> Encodings.UTF8 Then
  FileData = ConvertEncoding(FileData, Encodings.UTF8)
End If[/quote]

Huh? I thought converting UTF-8 text to UTF-8 was a no-op.
Can you elaborate?