I have a string that contains byte values from 0x00 to 0xFF … so some characters fall outside the ASCII range of 0x00 to 0x7F
However I need to get the value of the character regardless…
I had always used
but currently, if the value exceeds 0x7F, it returns zero
I’d prefer NOT to make a MemoryBlock, as the incoming data can be several megabytes in size, and I have thousands of these to analyze as fast as possible
Basically, some of these values indicate the length of the following data field, and others indicate the encoding style etc.
ASCB and ASC return the same result
ACTUALLY … ASC may be working… BUT using TextInputStream may be the issue
If I look at the file with a hex editor at one place I see this sequence
54 54 32 00 00 83 00 52 69
but when I read the file in, it only had
54 54 32 00 00 00 52 69
the 83 vanished!
ASCB should simply give you the value of the first byte.
does Xojo remove invalid UTF-8 combinations?
Did you set text encoding for reading the file? Default is UTF-8.
the file(s) all contain a mix of ASCII, UTF-8, UTF-16, and UTF-16BE
and each file can contain a mix of any or all of those
So I guess I have to convert to a binary stream
BinaryStream may be better.
You may need to set the encoding to nil or to some 8-bit encoding like ISO Latin 1.
Then you can read all the bytes and later use DefineEncoding on substrings to switch to whatever encoding each one actually is.
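A minimal sketch of that approach, using classic Xojo calls. The file name, the byte positions, and the choice of UTF-16 for the substring are placeholders for illustration, not details taken from the files discussed in this thread:

```xojo
// Hypothetical file name; replace with the real FolderItem.
Dim f As FolderItem = GetFolderItem("sample.bin")

If f <> Nil And f.Exists Then
  // Open read-only and read every byte. No encoding is passed to Read,
  // so the returned String has a Nil encoding and the bytes are untouched.
  Dim bs As BinaryStream = BinaryStream.Open(f, False)
  Dim raw As String = bs.Read(bs.Length)
  bs.Close

  // AscB and MidB work on bytes, not characters, and use 1-based positions.
  // Position 6 is an assumed location of a length byte.
  Dim lengthByte As Integer = AscB(MidB(raw, 6, 1))

  // Assumed field: 4 bytes starting at byte 8, known to hold UTF-16 text.
  // DefineEncoding only tags the bytes; it does not convert them.
  Dim field As String = DefineEncoding(MidB(raw, 8, 4), Encodings.UTF16)
End If
```

Because the whole buffer keeps a Nil encoding, the byte offsets stay aligned with what a hex editor shows, and only the substrings that are actually text get an encoding attached.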
The strange thing is… 90% of the data that is working contains characters >0x7F … including some with FF FE (UTF-16), and it is decoding them correctly…
But the 0x83 vanishes, and screws the pointer up… since it is all based on offsets
That was easy… I did not realize you could interchange MemoryBlock and String
Where I was reading FILEBUFFER as STRING=TEXTINPUTSTREAM
I changed it to
FILEBUFFER AS MEMORYBLOCK = BINARYSTREAM
and the rest of the code remained the same. and now it works!
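For anyone landing here later, the change probably looks something like the sketch below. The file name and offset are made up, and the original code was not posted, so treat this as one possible reading of the fix and not the exact code:

```xojo
// Hypothetical file name; replace with the real FolderItem.
Dim f As FolderItem = GetFolderItem("sample.bin")

If f <> Nil And f.Exists Then
  Dim bs As BinaryStream = BinaryStream.Open(f, False) // read-only
  // BinaryStream.Read returns a String; assigning it to a MemoryBlock
  // converts implicitly, so no text decoding ever touches the bytes.
  Dim fileBuffer As MemoryBlock = bs.Read(bs.Length)
  bs.Close

  // MemoryBlock offsets are zero-based. Offset 5 is an assumed position.
  Dim offset As Integer = 5
  If offset < fileBuffer.Size Then
    // Byte returns 0 to 255, so values above 0x7F such as 0x83 survive.
    Dim value As Integer = fileBuffer.Byte(offset)
  End If
End If
```

Since a MemoryBlock also converts back to a String implicitly, existing calls that expect a String can usually keep working, which would match "the rest of the code remained the same".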
Thanks Christian for pointing me in the right direction at least