Value of a character byte

DaveS · October 18, 2015, 6:38pm

I have a string that contains data from 0x00 to 0xFF … so some characters exceed the ASCII range of 0x00 to 0x7F

However I need to get the value of the character regardless…

I had always used

x=ASCB(MID(theString,ptr,1))

but currently if the value exceeds 0x7F it returns ZERO

I’d prefer to NOT make a MemoryBlock, as the incoming data can be several megabytes in size, and I have thousands of these to analyze as fast as possible

Basically, some of these values indicate the length of the following data field, and others indicate encoding style etc.

ASCB and ASC return the same result

DaveS · October 18, 2015, 6:44pm

ACTUALLY … ASC may be working… BUT using TextInputStream may be the issue

If I look at the file with a hex editor at one place I see this sequence

54 54 32 00 00 83 00 52 69

but when I read the file in it only had

54 54 32 00 00 00 52 69

the 83 vanished!

Christian_Schmitz · October 18, 2015, 6:45pm

ASCB should simply give you value of first byte.

Christian_Schmitz · October 18, 2015, 6:45pm

Does Xojo remove invalid UTF-8 combinations?
Did you set the text encoding for reading the file? The default is UTF-8.

DaveS · October 18, 2015, 6:47pm

The file(s) all contain a mix of ASCII, UTF-8, UTF-16, and UTF-16BE,
and each file can contain a mix of any or all of those

So I guess I have to convert to a binary stream

Christian_Schmitz · October 18, 2015, 6:49pm

BinaryStream may be better.

You may need to set the encoding to nil or to some 8-bit encoding like ISO Latin 1.
Then you can read all the bytes and later use DefineEncoding on substrings to switch to whatever it is.
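
Something along these lines might work for the substring step. This is only a rough sketch: FieldAsText and its parameter names are made up for illustration, and it assumes the whole file is already in a String that was read as raw bytes.

```xojo
// Hypothetical helper (not from the original code): pull one field out of the
// raw data and label it with the encoding that field is known to use.
Function FieldAsText(rawData As String, fieldStart As Integer, fieldLength As Integer, enc As TextEncoding) As String
  // MidB counts bytes, not characters; fieldStart is 1-based
  Dim field As String = MidB(rawData, fieldStart, fieldLength)
  // DefineEncoding only tags the bytes with an encoding; it does not convert them
  Return DefineEncoding(field, enc)
End Function
```

A UTF-16 field could then be fetched with something like FieldAsText(rawData, ptr, fieldLen, Encodings.UTF16), where ptr and fieldLen come from the length bytes in the data.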

DaveS · October 18, 2015, 6:56pm

Strange thing is… 90% of the data that is working contains characters >0x7F … including some with FF FE (UTF-16), and it is decoding them correctly…

But the 0x83 vanishes, and that screws the pointer up… since it is all based on offsets

DaveS · October 18, 2015, 7:06pm

That was easy… I did not realize you could interchange MemoryBlock and String

Where I was reading FILEBUFFER AS STRING = TEXTINPUTSTREAM
I changed it to
FILEBUFFER AS MEMORYBLOCK = BINARYSTREAM

and the rest of the code remained the same, and now it works!
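
For anyone who lands here later, the change looks roughly like the sketch below. The variable names and the offset are placeholders, not my actual code, and there is no error handling (BinaryStream.Open can raise an IOException).

```xojo
// Rough sketch: read the whole file as raw bytes instead of as text
Dim f As FolderItem = GetOpenFolderItem("")
If f <> Nil Then
  Dim bs As BinaryStream = BinaryStream.Open(f, False) // False = read-only
  // Read returns a String; assigning it to a MemoryBlock converts it implicitly
  Dim fileBuffer As MemoryBlock = bs.Read(bs.Length)
  bs.Close

  Dim offset As Integer = 0 // placeholder; MemoryBlock offsets are 0-based
  If offset < fileBuffer.Size Then
    // Byte returns 0..255, so values above 0x7F come through untouched
    Dim value As Integer = fileBuffer.Byte(offset)
  End If
End If
```

Existing AscB/MidB calls keep working because the MemoryBlock converts back to a String wherever a String is expected, which is why nothing else had to change.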

Thanks, Christian, for pointing me in the right direction at least