CString text encoding issue

I am calling a function in an external DLL. I need to send a parameter and a value as CStrings. The DLL reports errors to a logging tool that I can monitor. When I call the function, I keep getting an error saying that the parameter doesn’t exist (that’s the DLL’s internal error checking reporting back to me).

The external logging tool implies that the problem is that the cString isn’t properly encoded, which would explain the DLL error code I’m getting:

<Cannot decode byte '\xe7': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream>

How can I ensure a cString I am sending this DLL is UTF-8 encoded?

What I’m doing now is:

var cParam as CString = param //param is a string passed to this xojo method

What’s the text in param and how is it encoded?

The text is set inside the application. In this case it’s:

var param as string = "ExposureTime"
var value as string = "250000"

param and value are sent to a method that calls the external function. I’m not doing anything to explicitly set the text encoding. Which is why I’m asking how/where I do that.

There’s no byte e7 anywhere here. Something seems broken elsewhere.

E7 is “ç” in Latin1.

You have no code above 7F.

Hmm, that’s weird. So it could be something in the DLL? I do have the source code for it but can’t share it publicly. I don’t really know enough about C to know where that might creep in, though.

Some further observations:

  1. If I call the function repeatedly, the invalid character changes with each call, even though I’m sending the same text

  2. In this particular case I’m calling a “Set” parameter function. There are also “Get” parameter value functions. I tested with that and get the same thing (and again, different invalid characters with each call even though the call is identical each time)

Is it perhaps some random text from further along in memory? C strings are supposed to end in ASCII 0. Assignment to a CString should do that for you, but if it didn’t for some reason, the DLL may be reading past the end of your data.

Have a look at the data you are about to pass (i.e. the CString variables) in the debugger just as you are about to make the call. Do they have ASCII 0 termination?

When I put a break in the debugger right before calling the function in the external DLL, the text encoding for the param CString shows as “nil” in the debugger.

I’m not sure how to see what the termination character is - how do I do that?

A few thoughts…

Internally, Xojo encodes strings as UTF-8, so your string literal is automatically encoded.

In UTF-8, any character whose code is 127 or lower (the ASCII range) is just a single byte that represents its code. For your string, there is no difference between encoding it as UTF-8, ASCII, or even Nil.
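To make that concrete, here is a minimal C sketch (not taken from the DLL; `is_pure_ascii` is a hypothetical helper name) that checks whether every byte of a C string is below 0x80. Any string that passes is byte-for-byte identical in ASCII and UTF-8, so a decoder complaining about '\xe7' cannot be looking at "ExposureTime" itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte before the terminating
   NUL is below 0x80, i.e. the text is plain ASCII and therefore also
   valid UTF-8. Returns 0 as soon as a byte of 0x80 or above is found. */
int is_pure_ascii(const char *s) {
    assert(s != NULL);
    for (const unsigned char *p = (const unsigned char *)s; *p != 0; ++p) {
        if (*p >= 0x80) {
            return 0;
        }
    }
    return 1;
}
```

Both literals from this thread, "ExposureTime" and "250000", pass this check, while a lone 0xE7 byte does not.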

As Ian alluded, it looks like you are experiencing some kind of memory overrun. That is, the DLL is reading past the end of your string, which makes me think that perhaps it’s expecting a Pascal string rather than a C string.

A Pascal string stores its length in the first byte, whereas a C string marks its end with a NULL byte (ASCII 0, as Ian mentioned).
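The two layouts can be compared side by side in C. This is only an illustrative sketch: `to_pascal_string` and `c_string_size` are made-up helper names, and nothing here says which layout the frame grabber DLL actually expects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes text as a Pascal-style string, one length
   byte followed by the characters, with no terminator. Returns the
   number of bytes written, or 0 if the text is longer than 255 bytes
   or does not fit in the output buffer. */
size_t to_pascal_string(const char *text, unsigned char *out, size_t cap) {
    assert(text != NULL && out != NULL);
    size_t len = strlen(text);
    if (len > 255 || cap < len + 1) {
        return 0;
    }
    out[0] = (unsigned char)len;   /* length prefix */
    memcpy(out + 1, text, len);    /* characters, no trailing NUL */
    return len + 1;
}

/* Hypothetical helper: bytes a C string occupies in memory,
   counting the terminating NUL. */
size_t c_string_size(const char *text) {
    assert(text != NULL);
    return strlen(text) + 1;
}
```

For "ExposureTime" both forms take 13 bytes: the Pascal form starts with the byte 12, and the C form ends with the byte 0.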

To create a Pascal string, you’d do:

var cParam as CString = ChrB( param.Length ) + param

You can also manually put a Null at the end of your string:

var cParam as CString = param + ChrB( 0 )

Let us know if either of those work.

Double-click the string and use the tabs to see the different representations; one of them is binary. You should see the zero at the end.

Look at the string in the debugger and see how long it is, and select Binary so you can see the bytes and tell whether the last one is &h00.

I just checked, the debugger doesn’t show the trailing Null, even if I add it manually to the CString.

Nope. And I can confirm that the debugger shows no trailing null, with or without adding it manually.

The possibility that there’s something weird happening with memory is there. Most of the C functions are done byref. I just set it all up so that instead of passing a variable within xojo from one method to another and then to the C function, I am setting two properties in the class instance of type cString with the parameter and value text in them - fgParam and fgValue. Then I am passing those properties to the external function. I get the same result though.

This is in a button that should set the exposure time to 250000:

var result as boolean = false

var cParam as CString = "ExposureTime"
var cValue as CString = "250000"

FrameGrabber.fgParam = cParam
FrameGrabber.fgValue = cValue

result = FrameGrabber.CXPSetCameraParam(FrameGrabber.fgParam,FrameGrabber.fgValue)

CXPSetCameraParam() - the internal method in my Xojo class. (not shown is the code that returns true or false based on the result)

//fgHandle = a property; the handle to the frame grabber that all calls must reference
//fgParam = a property; verified it contains the value set above
//fgValue = a property; verified it contains the value set above

var cameraParam as int32 = SetCameraParam(fgHandle, fgParam, fgValue)

SetCameraParam (The external method from the DLL)

Private Function SetCameraParam(byref fgHandle as int32, byref fgParam as cstring, byref fgValue as cstring) As Integer

And the actual C code from the DLL’s .h file:

// Change Camera Parameter Value
COAX_GRABBER_API int32_t SetCameraParam(int32_t * handle, char * param, char * value);

The difference now is that the error in the frame grabber’s logger is:

<Cannot decode byte '\x95': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream>

and it’s always x95, it doesn’t seem to change anymore.

Never mind that. It happened three times in a row, and now I’m getting a different result each time again. I tried it 10 times and got one repeat; otherwise it was a different offending character every time.

AHA! The problem is using byref on the param and value cstrings. I took that out and it now works.
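A likely explanation, sketched in C under one assumption: that a ByRef CString reaches the DLL as the address of the string pointer (a `char **`) rather than the pointer itself. The header declares `char * param`, so the DLL would then decode the raw bytes of a memory address as text, and addresses typically change from run to run, which would match the shifting "invalid byte". The functions below are hypothetical stand-ins, not the DLL's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the DLL: it trusts that arg points at text and returns
   the first byte it would try to decode. */
unsigned char first_byte_read(const void *arg) {
    assert(arg != NULL);
    return *(const unsigned char *)arg;
}

/* Returns 1 when the bytes found at arg are exactly the pointer value p,
   i.e. the callee is looking at an address instead of characters. */
int reads_pointer_bytes(const void *arg, const char *p) {
    assert(arg != NULL);
    return memcmp(arg, &p, sizeof p) == 0;
}
```

With `const char *param = "ExposureTime";`, calling `first_byte_read(param)` yields 'E', while `first_byte_read(&param)` yields whatever the first byte of the address happens to be. The `int32_t * handle` argument is different: the header asks for a pointer there, so ByRef matches it.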

Glad you sorted it. The part where you copy cParam and cValue into FrameGrabber.fgParam and FrameGrabber.fgValue doesn’t seem to add anything to your code. You could just pass cParam and cValue into the function.

Yeah, I just cleaned all that up. It was just there for testing. I set up a generic “SetParameter” method that takes the type of setting (frame grabber board, camera, output stream) and the key/value pairs. On the back end for the frame grabber that’s all it is: key/value text for everything.

As of right now, it’s all running nicely (ignore the blank text fields; I’m not populating those with the stored values from the camera just yet). The image looks weird because the 3D printed lamphouse has to be redone; there are some weird reflections. The image is of the northern lights in Norway, shot on 10 perf 70mm film (you can see the perfs at the bottom). The gate in the film scanner is a little big, so we can accommodate up to 15 perf IMAX film.

Fair enough. It kinda looked like a structure, which could have been making a difference to your problem.

Looks interesting.
