XOJO 2019R3 Encodings Issues

  2. 2 weeks ago

    Sascha S

    Jan 13 Pre-Release Testers, Xojo Pro Germany, Lower Saxony
    Edited 2 weeks ago

    I can't see an issue in your code.

    But I see you are reading data from the database as both ISOLatin1 and UTF8, and then combining them using replace statements.

    What happens if you convert f+lName to UTF8 before you combine them into a UTF8 string?

  3. Aurelian N

    Jan 13 Pre-Release Testers, Xojo Pro

    @Markus R TIStream.Encoding = encodings.windowsLatin1

    Well, I believe that is what the

    TIStream.Encoding = encodings.windowsLatin1

    part was for, if I'm not mistaken. It always worked until R3, so either something was broken before or something is broken now. I'll create a fresh document and try again to see what result I get.

  4. Michael H

    Jan 13 Pre-Release Testers, Xojo Pro Europe (Hamburg, Germany)

    @Aurelian N So far for me it seems that ConvertEncoding is not working at all, no matter what I put there I get always same result, or at least this is how it looks.

    This is usually due to the string’s encoding not being properly defined. ConvertEncoding can only convert from some known encoding into another. The general rules are actually quite simple: Whenever you fetch some text (like from a database) you define its encoding. Whenever you export some text you convert its encoding (if necessary) to whatever encoding is expected. Everything else is taken care of automagically.
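    The define-then-convert rule above can be sketched as follows. This is only an illustration under stated assumptions: the single byte &hE9 ("é" in ISOLatin1) stands in for data fetched from a database or file, and the variable names are made up for the example.

    ```xojo
    // Build one raw byte (&hE9) as a stand-in for fetched data.
    // A string taken from a MemoryBlock this way has no encoding defined yet.
    Var mb As New MemoryBlock(1)
    mb.Byte(0) = &hE9
    Var raw As String = mb.StringValue(0, 1)

    // Fetching: DefineEncoding only labels the bytes; the bytes stay the same.
    Var labeled As String = raw.DefineEncoding(Encodings.ISOLatin1)

    // Exporting: ConvertEncoding rewrites the bytes for the target encoding.
    // It can only do that because the source encoding is now known.
    Var exported As String = labeled.ConvertEncoding(Encodings.UTF8)

    System.DebugLog(labeled.Bytes.ToString)  // 1 byte  (E9)
    System.DebugLog(exported.Bytes.ToString) // 2 bytes (C3 A9)
    ```

    If the define step is skipped, or the wrong encoding is defined, the convert step has nothing reliable to work from.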

  5. Aurelian N

    Jan 13 Pre-Release Testers, Xojo Pro

    @SaschaSchneppmueller I can't see an issue in your code.

    But I see you are reading data from the database as both ISOLatin1 and UTF8, and then combining them using replace statements.

    What happens if you convert f+lName to UTF8 before you combine them into a UTF8 string?

    Well, apparently the data is supposed to be UTF8, and it always was. I only put Latin1 there to see whether that was the issue, since we had this problem in the past when importing from another app. When I define the encoding as Latin1, I don't get the weird characters in the IDE debugger, while if I replace ISOLatin1 with UTF8 I get

    St�hanie

    in the debug window. So it seems that some of the data is Latin1, as I suspected, and only when we get it from another app. Still, ConvertEncoding is supposed to convert it properly once the encoding is defined correctly, or at least the result should look correct. My understanding was: I get the data as Latin1, I know it is Latin1, I convert it to UTF8 and work with it. Maybe I'm doing something wrong in the process. While Define works well, Convert does not seem to have any effect.

  6. Aurelian N

    Jan 13 Pre-Release Testers, Xojo Pro

    @Michael Hßmann This is usually due to the string’s encoding not being properly defined. ConvertEncoding can only convert from some known encoding into another. The general rules are actually quite simple: Whenever you fetch some text (like from a database) you define its encoding. Whenever you export some text you convert its encoding (if necessary) to whatever encoding is expected. Everything else is taken care of automagically.

    Well, that was the purpose of the tests I did earlier. As I mentioned in the previous post, some fields were ISOLatin1, so I defined them that way, and they displayed correctly in the interface and the debugger. Then I converted them to UTF8, which should do the job, but apparently it does not. Honestly, I have no idea where else to look. I checked all the code and all the documentation, and everything seems OK, but it still does not work.

  7. Sascha S

    Jan 13 Pre-Release Testers, Xojo Pro Germany, Lower Saxony

    Can you please try the following?

    lName = row.Column("lName").StringValue.DefineEncoding(Encodings.ISOLatin1)
    lName = lName.ConvertEncoding(Encodings.UTF8)

    fName = row.Column("fName").StringValue.DefineEncoding(Encodings.ISOLatin1)
    fName = fName.ConvertEncoding(Encodings.UTF8)

    fContent=fContent.ReplaceAll("<#PGender#>", row.Column("gender").StringValue.DefineEncoding(Encodings.UTF8))
    fContent=fContent.ReplaceAll("<#PlName#>", lName.ConvertEncoding(Encodings.UTF8))
    fContent=fContent.ReplaceAll("<#PfName#>", fName.ConvertEncoding(Encodings.UTF8))

  8. Sascha S

    Jan 13 Pre-Release Testers, Xojo Pro Germany, Lower Saxony

    @Aurelian N so that seems that some data is Latin1 as I might suspected and only when we get it from another app

    This "other App" is writing ISOLatin1 (or similar) into a UTF8 defined db field. Which is totally possible with SQL.
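    When a column can contain both UTF8 and ISOLatin1 bytes, one possible workaround is to check each value before labeling it. A minimal sketch, assuming a hypothetical helper named NormalizeToUTF8 and assuming ISOLatin1 is the only other encoding that occurs (that fallback is a guess, not something the data can prove):

    ```xojo
    // Hypothetical helper, not part of Xojo: returns the value as UTF8.
    Function NormalizeToUTF8(raw As String) As String
      If Encodings.UTF8.IsValidData(raw) Then
        // The bytes already form valid UTF8: just label them.
        Return raw.DefineEncoding(Encodings.UTF8)
      Else
        // Assumption: anything that is not valid UTF8 is ISOLatin1.
        Return raw.DefineEncoding(Encodings.ISOLatin1).ConvertEncoding(Encodings.UTF8)
      End If
    End Function
    ```

    Pure ASCII values pass the UTF8 check and are unaffected either way. A lone ISOLatin1 "é" followed by an ASCII letter is not valid UTF8, so a value like the one shown earlier in the debugger would take the fallback branch.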

  9. Aurelian N

    Jan 13 Pre-Release Testers, Xojo Pro

    @SaschaSchneppmueller This "other App" is writing ISOLatin1 (or similar) into a UTF8 defined db field. Which is totally possible with SQL.

    Apparently yes, and it was supposed to have been fixed so that everything is UTF8, so I'll have to check that side as well.

  10. Aurelian N

    Jan 13 Pre-Release Testers, Xojo Pro

    @SaschaSchneppmueller fContent=fContent.ReplaceAll("<#PlName#>", lName.ConvertEncoding(Encodings.UTF8))
    fContent=fContent.ReplaceAll("<#PfName#>", fName.ConvertEncoding(Encodings.UTF8))

    When I do that, the whole word disappears: with the double conversion, nothing shows up in the document anymore.

  11. Sascha S

    Jan 13 Pre-Release Testers, Xojo Pro Germany, Lower Saxony

    I guess fContent goes into an unknown encoding state if you combine it with different encodings.

  12. Aurelian N

    Jan 13 Pre-Release Testers, Xojo Pro
    Edited 2 weeks ago

    @SaschaSchneppmueller {\rtf1\adeflang1025\ansi\ansicpg1252\uc1

    Well, I did try

    TIStream.Encoding = encodings.windowsLatin1 ' Imported from some WindowsGeneratedRTF File and always worked this way until now.
    fContent = TIStream.readAll.ConvertEncoding(Encodings.UTF8)

    but with the same result.

  13. Sascha S

    Jan 13 Pre-Release Testers, Xojo Pro Germany, Lower Saxony

    Forget my code. I did not see somehow that you already convert f+lName to UTF8 before you merge it into fContent ...
    I have to leave this conversation for now, but hope someone else can help you. :)

  14. Aurelian N

    Jan 13 Pre-Release Testers, Xojo Pro
    Edited 2 weeks ago

    Well, apparently it helped to keep only the

    fContent = TIStream.readAll.ConvertEncoding(Encodings.UTF8)

    to get all the text as UTF8, and then I wrote a method

    Method Name ToUTF8Text  Extends s As String Return Type String
    
    Var result As String
    
    If s.Encoding = Nil Then
      result = DefineEncoding(s, Encodings.UTF8)
      
    Else
      result = ConvertEncoding(s, Encodings.UTF8)
      
    End If
    
    Return result

    So apparently the Xojo IDE was confusing me, because after all this I still get

    St�hanie

    as the name in the debugger, but in the printed file it shows correctly. I have no idea what to think now; I guess I have to ignore the IDE debug view, as it can be deceiving.

    I also disabled the Encoding setting when reading and writing the file.

  15. Michael H

    Jan 13 Pre-Release Testers, Xojo Pro Europe (Hamburg, Germany)

    That would work only if the original file was UTF8 encoded to begin with, but not in any other case. As I said: define the encoding when you read text and convert to another encoding (if required) when you save it. You may convert everything to UTF8 after you have defined the encoding (if the encoding isn’t UTF8 to begin with) to streamline internal string handling but that isn’t critical; characters come out right internally regardless of the encoding, provided the encoding is known and properly defined.
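    For the file case, the same advice might look like the sketch below. The file names and locations are hypothetical, and WindowsLatin1 is an assumption based on the \ansicpg1252 header quoted earlier in the thread.

    ```xojo
    // Hypothetical template location, for illustration only.
    Var inFile As FolderItem = SpecialFolder.Documents.Child("template.rtf")

    // Reading: tell the stream which encoding the bytes are in.
    Var tis As TextInputStream = TextInputStream.Open(inFile)
    tis.Encoding = Encodings.WindowsLatin1
    Var fContent As String = tis.ReadAll
    tis.Close

    // Optional: use UTF8 internally so all strings share one encoding.
    fContent = fContent.ConvertEncoding(Encodings.UTF8)

    // ... replace the placeholders here ...

    // Saving: convert back to whatever encoding the file format expects.
    Var outFile As FolderItem = SpecialFolder.Documents.Child("output.rtf")
    Var tos As TextOutputStream = TextOutputStream.Create(outFile)
    tos.Write(fContent.ConvertEncoding(Encodings.WindowsLatin1))
    tos.Close
    ```

    Error handling (a missing file, an IOException) is left out to keep the sketch short.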

  16. Aurelian N

    Jan 13 Pre-Release Testers, Xojo Pro

    @Michael Hßmann That would work only if the original file was UTF8 encoded to begin with, but not in any other case. As I said: define the encoding when you read text and convert to another encoding (if required) when you save it. You may convert everything to UTF8 after you have defined the encoding (if the encoding isn’t UTF8 to begin with) to streamline internal string handling but that isn’t critical; characters come out right internally regardless of the encoding, provided the encoding is known and properly defined.

    Well, that was the point. I dropped the text input stream encoding because I have no idea what it is or was; people create those templates on both Windows and Mac. When I read the raw RTF data, I convert all of it to UTF8 and work from there, and apparently it works. It's not a perfect fix, but it works for the moment. Hopefully I'll soon drop the RTF part, work with docx directly, and be done.

  17. Hi there,
    encoding the German special characters ÄÖÜ as UTF8 causes a length problem in storage, because each special character is saved as 2 bytes. The display in the text is correct.
    I now use a MemoryBlock with LeftB() so as not to overrun my fixed memory size.
    Example: "Test ÄÖÜ" (8 characters in the display) has a length of 11 bytes.
    String.Left(8) does not help; 11 bytes remain when saving.

  18. Emile S

    Jan 13 Europe (France, Strasbourg)
    Edited 2 weeks ago

    @Rudolf Jackel string.left (

    Rudolf: That is why String.LeftBytes exists. Try with that.

    String.LeftBytes

    Explanation:
    String.Left counts in characters (what you see on screen).
    String.LeftBytes counts in bytes (what the characters occupy in memory).
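    A short sketch of the difference, using the "Test ÄÖÜ" example from the post above. It assumes the umlauts are stored as single precomposed characters; the ISOLatin1 step at the end is only a suggestion for the case where the storage format allows that encoding.

    ```xojo
    Var s As String = "Test ÄÖÜ" // Xojo string literals are UTF8

    System.DebugLog(s.Length.ToString) // 8 characters
    System.DebugLog(s.Bytes.ToString)  // 11 bytes: 5 one-byte characters + 3 two-byte umlauts

    // Left counts characters, so all 8 characters (11 bytes) are kept.
    Var byChars As String = s.Left(8)

    // LeftBytes counts bytes; a cut at 8 bytes lands in the middle of "Ö".
    Var byBytes As String = s.LeftBytes(8)

    // Assumption: the fixed-size field may hold ISOLatin1,
    // where each of Ä, Ö and Ü takes one byte.
    Var latin1 As String = s.ConvertEncoding(Encodings.ISOLatin1)
    System.DebugLog(latin1.Bytes.ToString) // 8 bytes
    ```

    Because a byte-based cut can split a multi-byte character, it is worth checking where the cut falls before writing a truncated UTF8 string to storage.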
