Convert Xojo.Core.MemoryBlock to classic String

I need to mix new Xojo.Core classes and older code, and I am having trouble converting a Xojo.Core.MemoryBlock to a classic String.

My specific example is that I have a new Xojo.Net.HTTPSocket object. In this object’s PageReceived handler, I am handed “content”, which is a Xojo.Core.MemoryBlock. I need to call a subroutine that requires the content of the page to be passed as a classic String object. To make matters slightly worse, the page data I am receiving contains binary data (unprintable bytes, ranging from &h00 thru &hFF).

I have tried converting the Xojo.Core.MemoryBlock to Text first, and then to a String like this:

dim t as Text = Xojo.Core.TextEncoding.UTF8.ConvertDataToText(content)
dim s as String = CType(t, String)

But this causes a runtime exception in ConvertDataToText(), because content contains bytes that aren’t printable characters.

Does anyone have a better way to get a String from a Xojo.Core.MemoryBlock?

Although String was meant to represent text, it’s actually a bucket of bytes with an encoding telling the system how to interpret those bytes. The way you’re using it, it’s actually closer to what a MemoryBlock is.

Since your bytes represent data, not text, do not go through Text but copy the bytes directly to a classic MemoryBlock. From there, it’s an easy conversion to String if you still need it.

Thanks Kem. Given your response, I ended up with this code:

dim mb as New MemoryBlock(content.size)
mb.StringValue(0, content.size) = content.CStringValue(0)
dim s as String = mb

This does appear to do what I need.

Does anyone have a more direct or efficient way of making this conversion?

This won’t work if your data has any null bytes. You should go byte by byte to copy it.

dim mb as new MemoryBlock( content.Size )
dim lastByteIndex as integer = content.Size - 1
for i as integer = 0 to lastByteIndex
  mb.Byte( i ) = content.Data.Byte( i )
next

Kem, you’re right, my solution above does not work. When a null byte is encountered in the source data, the rest of the copy will be padded with nulls. Or, worse yet, if the source data doesn’t happen to have any null bytes, then an exception is thrown by CStringValue. Either way, it’s bad code.

I have confirmed that your code that copies the data one byte at a time does work. However, it is obviously super-inefficient.

As I have continued to search for solutions, it appears that the Ptr data type has a documented method String() that seems to imply that it would return a String version of the data. However, although the docs and auto-complete in the IDE seem to know that Ptr has a String method, the compiler does not.

This seems to reveal a problem with the API for Xojo.Core.MemoryBlock. As of Xojo 2015r2.2, there doesn’t appear to be any native way to get a String (or classic MemoryBlock) from a Xojo.Core.MemoryBlock without copying every single byte, one byte at a time. With large data sets, this will be painful! Ouch!

Why not:

mb.StringValue(0, content.size) = content.StringValue(0, content.Size)

?

[quote=193585:@Andrew Lambert]Why not:

mb.StringValue(0, content.size) = content.StringValue(0, content.Size)

?[/quote]

Because Xojo.Core.MemoryBlock does not have StringValue…
http://developer.xojo.com/xojo-core-memoryblock

I don’t think it’s that bad, and you can certainly speed it up more by using a Ptr to the classic MemoryBlock too.

Another idea: Calculate the number of blocks of 8 bytes and copy UInt64 values that many times, then copy the last 1-7 bytes manually. You have to make sure the endianness of the two match.

Need that code?
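
A rough sketch of that chunked copy, for anyone who wants it. The function name CopyToClassic is made up for illustration and is not part of either framework. It assumes both blocks are accessed through Ptr, which as far as I know reads and writes in the machine's native byte order, so the endianness matches on both sides.

```xojo
// Hypothetical helper (name invented for this sketch): copies a
// Xojo.Core.MemoryBlock into a new classic MemoryBlock, 8 bytes at a time.
Function CopyToClassic(src As Xojo.Core.MemoryBlock) As MemoryBlock
  Dim size As Integer = src.Size
  Dim dst As New MemoryBlock(size)

  // Access both blocks through Ptr so no bounds or endian logic gets in the way.
  Dim srcPtr As Ptr = src.Data
  Dim dstPtr As Ptr = dst

  // Number of bytes covered by whole 8-byte chunks.
  Dim chunkedBytes As Integer = (size \ 8) * 8

  Dim offset As Integer = 0
  While offset < chunkedBytes
    dstPtr.UInt64(offset) = srcPtr.UInt64(offset)
    offset = offset + 8
  Wend

  // Copy whatever is left (0 to 7 bytes) one byte at a time.
  While offset < size
    dstPtr.Byte(offset) = srcPtr.Byte(offset)
    offset = offset + 1
  Wend

  Return dst
End Function
```

Since Ptr access is unchecked, the size used for the loop bounds has to come from src.Size and nowhere else.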

Ug, I’m an idiot.

Here you go:

dim mb as MemoryBlock = content.Data

This does not clone it, but gives you access to the data through the classic MemoryBlock.

Kem,

Sorry, that doesn’t work. It’s close, but not functional.

dim mb as MemoryBlock = content.Data

When this is executed, it does create a MemoryBlock and when inspected with the debugger, it does point to the correct data, but its “Size” is set to -1, which means Unknown…

Then, when it is cast to a String, the resulting String is empty.

If I try to manually set the size like this:

dim mb as MemoryBlock = content.Data
mb.Size = content.Size

The setting of the Size property causes the MemoryBlock to pad the new size with nulls, which kills the data.

Then try this:

dim mb as MemoryBlock = content.Data
dim s as string = mb.StringValue( 0, content.Size )

Or…

dim temp as MemoryBlock = content.Data
dim mb as new MemoryBlock( content.Size )
mb.StringValue( 0, content.Size ) = temp.StringValue( 0, content.Size )
temp = nil

When an old style memoryblock is crafted from a pointer (which this basically is) the size is set to -1 meaning “unknown”
But you can still read the data

dim mb as MemoryBlock = content.Data
dim s as string = mb.StringValue(0, content.Size)

ought to do

Aha! Now we have it! Cast to MemoryBlock first, then use StringValue with the known size to read the data out of the (size-undefined) MemoryBlock.

Here’s what I ended up with:

Function ToString(extends dst as Xojo.Core.MemoryBlock) As String
  return(CType(dst.Data, MemoryBlock).StringValue(0, dst.Size))
End Function

Thanks Kem and Norman!

[quote=193590:@Michel Bujardet]Because Xojo.Core.MemoryBlock does not have StringValue…
http://developer.xojo.com/xojo-core-memoryblock[/quote]
Really? I’m still plugging along in Realstudio so I haven’t kept up with the new framework, but that seems like a significant omission without any obvious benefit.

MemoryBlocks are runs of bytes
Bytes are not characters & characters are not bytes

To convert Text to a MemoryBlock, use TextEncoding.ConvertTextToData
This way the data conversion is encoding savvy and you get the correct bytes when you convert text to bytes using a given encoding.

To convert a MemoryBlock to Text, use TextEncoding.ConvertDataToText
And this will interpret the bytes in a memoryblock using the encoding to create the text
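
A minimal sketch of that round trip, assuming UTF-8 is the encoding you want (the variable names are only for illustration):

```xojo
// Text -> bytes: the encoding decides which bytes are produced.
Dim original As Text = "héllo"
Dim data As Xojo.Core.MemoryBlock = Xojo.Core.TextEncoding.UTF8.ConvertTextToData(original)

// data.Size is a byte count, which need not equal the number of characters.

// Bytes -> Text: the same encoding is used to interpret the bytes.
// This raises an exception if the bytes are not valid UTF-8, which is
// what happens with arbitrary binary data.
Dim roundTripped As Text = Xojo.Core.TextEncoding.UTF8.ConvertDataToText(data)
```

This only makes sense when the bytes really are encoded text. For binary content, like the page data in the original question, skip Text entirely and copy the bytes.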

It’s not an omission, it’s a change of approach.

As we have just now yet another question in the forum about encoding in another thread, it is fitting we talk about memory blocks and strings.

Encoding has been a thorn in the foot of RB/Xojo programmers for decades, ever since strings became something other than blocks of bytes.

The new framework replaces strings with the new Text type, which has no implicit conversion to or from byte blocks, so the pesky question-mark lozenges no longer show up at every turn. Memory blocks were one of the major offenders in that respect.

I tend to regard the new framework a bit like an automatic car versus shifting gears. Much less sporty, but superb comfort. The new memory block is not intended so much for performance benefits, but to prevent, (as Xojo puts it) users from shooting themselves in the foot :wink: