We have a piece of software that does not recognize the " text delimiters when importing CSV into a database and simply imports them as data, so the resulting strings begin and end with ". Also, the source file uses “,” as the field delimiter instead of “;”. So I started writing a more flexible CSV-to-CSV converter that changes the field delimiter and removes the text delimiters from the fields (if a field value contains a field delimiter, it is replaced so that the number of fields stays correct).
The problem now is that I have to know the encoding of the input file before I can analyze it, so I included Kem’s wonderful M_String modules and tried it the following way (this is the method that loads the source file and splits its lines into an array for further analysis):
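A minimal sketch of the per-line conversion described above, for illustration only. `ConvertLine` is a hypothetical name, and replacing an embedded “;” with a space is an assumption, since the replacement character is not stated here. The sketch also assumes that a doubled "" inside a quoted field stands for one literal quote.

```xojo
// Hypothetical helper: converts one CSV line from "," to ";" as the
// field delimiter and strips the " text delimiters.
Function ConvertLine(line As String) As String
  Const kQuote = """"
  Dim result As String
  Dim inQuotes As Boolean = False
  Dim n As Integer = Len(line)
  Dim i As Integer = 1
  While i <= n
    Dim c As String = Mid(line, i, 1)
    If c = kQuote Then
      If inQuotes And Mid(line, i + 1, 1) = kQuote Then
        // Doubled quote inside a quoted field: keep one literal quote
        result = result + kQuote
        i = i + 1
      Else
        // Opening or closing text delimiter: drop it
        inQuotes = Not inQuotes
      End If
    ElseIf c = "," And Not inQuotes Then
      // Source field delimiter outside quotes becomes the target delimiter
      result = result + ";"
    ElseIf c = ";" Then
      // Target delimiter inside a value would add a field: replace it
      // (a space is an assumed replacement)
      result = result + " "
    Else
      result = result + c
    End If
    i = i + 1
  Wend
  Return result
End Function
```

With these assumptions, the line `"Smith, John","a;b",42` becomes `Smith, John;a b;42`: the comma inside the quoted field is kept, because it is no longer a delimiter in the output.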
Dim Quellfile As TextInputStream
ReDim QuellZeilen(-1) // clear the array of source lines
Dim s As String
If f <> Nil And f.Exists Then
  Quellfile = TextInputStream.Open(f)
  s = Quellfile.ReadAll
  // Detect the encoding with M_Encoding.ByAnalysis, convert to UTF-8, normalize line endings
  s = ReplaceLineEndings(ConvertEncoding(DefineEncoding(s, M_Encoding.ByAnalysis(s, False)), Encodings.UTF8), EndOfLine)
  QuellZeilen = Split(s, EndOfLine)
  Quellfile.Close
  MsgBox(Str(QuellZeilen.Ubound + 1) + " line(s) read")
Else
  MsgBox("Error reading the source")
End If
This works with a CSV file that I got from an iPhone app and that has a BOM.
But the original file I got from our customer comes from a Windows machine and has no BOM, and when I try to load that one, the app crashes. In the debugger I tracked it down to the point where the ByAnalysis method calls “Encodings.UTF32BE.IsValidData(src)”.
I then inserted a block
If Encodings.UTF32BE.IsValidData(s) Then
  MsgBox "OK"
Else
  MsgBox "KO"
End If
after the Quellfile.ReadAll, and now it crashes at the first “If…” line. I tried it with different files, and it seems that UTF32BE.IsValidData crashes on every single file that is not valid instead of just returning False. I’m developing under Windows 7 Pro 64-bit and have tried it with Xojo 2014 r2.1 and r3.2.
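One possible way to narrow this down is to check the bytes by hand instead of calling IsValidData. The following is only a sketch: `LooksLikeUTF32BE` is a hypothetical helper, not part of Xojo or M_String, and it makes no claim about why IsValidData crashes. It relies on two facts about UTF-32: every code point takes exactly 4 bytes, and valid code points are at most U+10FFFF and exclude the surrogates U+D800 to U+DFFF.

```xojo
// Hypothetical helper, not part of Xojo or M_String.
Function LooksLikeUTF32BE(s As String) As Boolean
  Dim byteCount As Integer = LenB(s)
  // UTF-32 stores every code point in exactly 4 bytes;
  // an empty string is treated as "not UTF-32" here (a design choice)
  If byteCount = 0 Or byteCount Mod 4 <> 0 Then Return False

  Dim mb As MemoryBlock = s
  For i As Integer = 0 To byteCount - 4 Step 4
    // Big-endian: highest byte first. Code points above U+10FFFF are invalid,
    // so byte 0 must be 0 and byte 1 at most &h10
    If mb.Byte(i) <> 0 Or mb.Byte(i + 1) > &h10 Then Return False
    // Surrogates U+D800..U+DFFF are not valid code points in UTF-32
    If mb.Byte(i + 1) = 0 And mb.Byte(i + 2) >= &hD8 And mb.Byte(i + 2) <= &hDF Then Return False
  Next
  Return True
End Function
```

Calling `LooksLikeUTF32BE(s)` right after Quellfile.ReadAll in place of the IsValidData test would show whether the crash is tied to that one framework call. Since the crash happens inside ByAnalysis, the same substitution would have to be made there for the loader itself to benefit.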