Unicode Encoding UTF16 (again)

Hello everyone,

A few weeks ago I had the same problem with encodings. Back then, however, I did not know the encoding of those language.lng files.

Now I know the encoding is UTF16.

Please have a look at the following code in the “DropObject” event handler of a listbox:

  // This event handler (DropObject) opens the file,
  // reads it line by line,
  // and places every line in a row.
  // The text to the left of the "=" sign
  // is separated from the text to the right.
  
  Dim f_Temp As FolderItem
  Dim tisTemp As TextInputStream
  Dim strRecord As String
  Dim strContent(1) As String
  Dim i As Integer
  
  If obj.FolderItemAvailable Then
    // FolderItem is available, Proceed
    f_Temp = obj.FolderItem
    // Making sure all rows are removed.
    me.DeleteAllRows
    If f_Temp.Exists Then
      // Now we are sure f_Temp exists
      // Opening the file
      tisTemp = TextInputStream.Open(f_Temp)
      While not tisTemp.EOF
        strRecord = tisTemp.ReadLine
        If Len(Trim(strRecord)) = 0 Then
          // Line is empty, leave the present row empty
          // and create another new one.
        Else
          For i = 0 To 1
            strContent(i) = NthField(strRecord, "=", i + 1)
            strContent(i) = DefineEncoding(strRecord, Encodings.UTF16BE)
          Next i
          me.AddRow strContent(0)
          me.Cell(me.ListIndex, 1) = strContent(1)
          me.Cell(me.ListIndex, 1) = DefineEncoding(strRecord, Encodings.UTF16BE)
        End If
      Wend
      tisTemp.Close
    Else
      MsgBox f_Temp.NativePath + " does not exist!"
    End If
  Else
    MsgBox "There is a problem with this file"
  End if

As you can see, I am using UTF16BE, which comes close but not close enough. All other UTF values fail.

The original file comes from China. UTF16 gives me Chinese characters. UTF32 in any form gives me nothing. UTF8 also fails.

When the rows are added to the listbox, only the headings to the left of the “=” signs are added; the right-hand column stays empty. It also seems that in several places, but not all, the “=” sign is not recognised, because the text from the right side of the “=” sign also shows up in the left column of the listbox.

TopStyle tells me the original file is UTF16 and shows it perfectly.

I do not know why Xojo still does not show the file correctly. I think I have taken every precaution to make sure the UTF16 encoding is respected, haven't I?

Any idea where my thinking is faulty? I am obviously doing something wrong, but I do not know what.

Thank you very much for your time and efforts to help me.

Friendly greetings,

Chris

One last thing I forgot.

“me” in the code is pointing to a listbox.

Sorry that I forgot to mention it.

I am far from an expert on encodings, but looking at your code I can see that you split your data with ReadLine and NthField BEFORE defining the encoding. If I were doing this, I would define the encoding of the data before parsing it in any way.
Also, depending on the size of your data, you may find it faster to “Split” your data into a string array then iterate over each element of the array.
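
As a rough, untested sketch of what I mean: read the whole file with the encoding specified up front, then split it into lines. The names f (an existing FolderItem) and lb (a two-column listbox) are placeholders of mine, and I am assuming the file really is UTF16.

```xojo
// Sketch only: f is assumed to be an existing FolderItem and
// lb a two-column Listbox; both names are placeholders.
Dim tis As TextInputStream = TextInputStream.Open(f)
// Tell the stream how the bytes are encoded BEFORE any parsing.
Dim strAll As String = tis.ReadAll(Encodings.UTF16)
tis.Close

// Normalise the line endings, then split the whole text into lines at once.
strAll = ReplaceLineEndings(strAll, EndOfLine.Unix)
Dim strLines() As String = Split(strAll, EndOfLine.Unix)

For Each strLine As String In strLines
  If Len(Trim(strLine)) > 0 Then
    // Left of the first "=" goes into column 0.
    lb.AddRow NthField(strLine, "=", 1)
    // LastIndex is the row that AddRow just created.
    lb.Cell(lb.LastIndex, 1) = NthField(strLine, "=", 2)
  End If
Next
```

Keep in mind that NthField(strLine, "=", 2) only returns the text up to a second “=” sign, so values that themselves contain “=” would need different handling.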

Thank you Roger for your information.

I found the problem. This is how to read the external file with the correct encoding:

strRecord = tisTemp.ReadLine(Encodings.UTF16)

I just changed that one line and removed all other encoding lines and it just worked out fine. I am back in business now.
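
For anyone who runs into the same thing, here is a sketch of how the reading loop could look with that change. It reuses tisTemp, strRecord and me from the DropObject handler above. Note that the sketch also swaps me.ListIndex for me.LastIndex so that the right-hand text goes into the row that was just added; that is a separate adjustment on top of the one line above, so treat it as an assumption.

```xojo
// Sketch of the reading loop after the change; tisTemp, strRecord and me
// are the stream, the string and the listbox from the DropObject handler.
While Not tisTemp.EOF
  // Passing the encoding to ReadLine decodes the bytes up front,
  // so no DefineEncoding calls are needed afterwards.
  strRecord = tisTemp.ReadLine(Encodings.UTF16)
  If Len(Trim(strRecord)) > 0 Then
    me.AddRow NthField(strRecord, "=", 1)
    // Assumption: LastIndex (the row just added) instead of
    // ListIndex (the currently selected row).
    me.Cell(me.LastIndex, 1) = NthField(strRecord, "=", 2)
  End If
Wend
tisTemp.Close
```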

Thank you very much for your answer, Roger. You put me back on the right track.

Wish you a very nice day and all the best.

Friendly greetings,

Chris