Var title_value As String = rs.Column("title").StringValue // break here, what's the encoding?
title_value = title_value.DefineEncoding(Encodings.ISOLatin1)
Self.Title = title_value // Break this line check the debugger encodings
Var title_value As String = rs.Column("title").StringValue // break here, what's the encoding?
// Break: Encoding is NIL, because it's not yet defined by my code
title_value = title_value.DefineEncoding(Encodings.ISOLatin1)
// Break: Still has no encoding here
Self.Title = title_value // Break this line check the debugger encodings
// Break: Also no encoding... strange!
I think I've found the issue: if the encoding is wrong (e.g. Latin1 instead of UTF8), it does not define the desired encoding but an unknown/broken encoding?
DefineEncoding is of course in no way a conversion of any kind. It's only a promise; in this case you are promising the encoding is ISOLatin1. If the promise is false, then you can expect a crash or basically any undefined behavior.
Presumably it just sets a value indicating the encoding somewhere in the String object. It doesn't do anything to the string's data bytes. ConvertEncoding OTOH will assume your string has the given encoding and then converts it (and also changes that encoding value to the new one).
So it's up to us to ensure our text actually has the defined encoding.
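To make the label-versus-convert distinction concrete, here is a minimal sketch in Python rather than Xojo, so it can be run standalone. It is only an analogy and not a model of Xojo's String internals: decoding the same bytes under a different name stands in for DefineEncoding, and re-encoding stands in for ConvertEncoding. The sample word is made up.

```python
# Bytes as they might sit in the database: the word "Grüße" stored as UTF-8.
raw = "Grüße".encode("utf-8")

# Label only (the DefineEncoding analogy): the bytes are untouched,
# we merely state how they should be interpreted.
as_utf8 = raw.decode("utf-8")        # true promise
as_latin1 = raw.decode("latin-1")    # false promise, yields mojibake

# Conversion (the ConvertEncoding analogy): new bytes are produced
# in the target encoding.
converted = as_utf8.encode("latin-1")

print(repr(as_utf8), repr(as_latin1))
print(raw.hex(), converted.hex())
```

The false promise does not fail here, because every byte is a valid Latin-1 character; it just yields the wrong text, which is why the mistake is easy to miss.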
The really "bad" thing in my case seems to be that the DB, table and column are defined as latin1 names + charset, but a PHP script (over which I have no control) saves the data as UTF8-encoded data, while my app uses the correct encoding (latin1) to save the data into the DB.
When my app reads data from the DB, it can't "know" whether it's latin1 or UTF8 encoded.
Unfortunately I can't alter the DB schema. And it's a DB that many different systems have read/write access to.
I don't want to "break" something I can't handle afterwards.
I am not sure you can actually fix it when it's been done fundamentally wrong under the hood. You might be able to get it somewhat OK, but the heart of the problem would always loom over you: what if you get a different combination of letters or a different symbol which you had not tested for? You would always have the problem that UTF8 was forced into ISOLatin1, and you're at the mercy of the database and how badly or well it handles that.
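One concrete instance of the "combination you had not tested for" is Latin-1 text whose bytes happen to form valid UTF-8. The Python sketch below uses made-up sample strings and assumes detection is done by checking whether the bytes decode as UTF-8.

```python
# Typical Latin-1 text: the lone byte 0xFC (ü) is not valid UTF-8.
typical = "Müller".encode("latin-1")

# Unlucky Latin-1 text: "Ã©" is the byte pair 0xC3 0xA9,
# which is also the UTF-8 encoding of "é".
unlucky = "Ã©".encode("latin-1")

try:
    typical.decode("utf-8")
    typical_passes = True
except UnicodeDecodeError:
    typical_passes = False

# No error is raised here, so the text is silently misread.
misread = unlucky.decode("utf-8")

print(typical_passes, repr(misread))
```

Such sequences are rare in ordinary text, but a validity check alone cannot rule them out.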
Exactly. ATM I'm pulling those VarChars as UTF8-encoded, and if my code "detects" strange/alien chars, it pulls those chars again as latin1. Then it moves on to the next DB column.
Not perfect, but good enough for our use case.
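A rough Python sketch of that per-column fallback follows. It is not the poster's actual code: `decode_column` and the sample rows are hypothetical, and a strict UTF-8 decode failure is swapped in as the "strange chars" detector.

```python
from typing import List, Tuple


def decode_column(data: bytes) -> Tuple[str, str]:
    """Try strict UTF-8 first; fall back to Latin-1 if the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this branch cannot fail.
        return data.decode("latin-1"), "latin-1"


# Hypothetical column values: one written as UTF-8, one as Latin-1.
rows: List[bytes] = ["Störung".encode("utf-8"), "Störung".encode("latin-1")]
decoded = [decode_column(r) for r in rows]

for text, encoding in decoded:
    print(text, encoding)
```

Trying UTF-8 first matters: Latin-1 accepts any byte sequence, so testing it first would never trigger the fallback.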
MySQL (I think v8.x). It's an outdated OTRS ticket system DB.
I wrote software for our company which combines various systems like OTRS, selectLine, Zabbix and our own Wimax/Ubiquity network solutions with our fiber and copper DSL switches and servers. And soooo much more...