I am using a command line tool via a Shell instance to return (as XML) the metadata of a medical imaging file. Pseudo-code:
[code]dim s as new Shell
dim result as String
' ToolFolderItem and myFile are defined elsewhere and represent the location of the
' enclosing directory of the command line tool and the file that the tool takes as input
' The tool returns an XML dump
At this point, the String result correctly contains the XML I want. However, I am trying to convert result to the Text datatype with the following:
dim resultText as Text
resultText = result.ToText
but this throws a RuntimeException saying that “The data could not be converted to text with this encoding”. However, the debugger is saying that the Encoding type of result is UTF-8.
I could just manipulate result as a String, but I’m trying to use the new Text datatype wherever possible because it has nice convenience methods. On a related note, it’ll be nice when Xojo adds the Shell class to the new Xojo framework…
What’s annoying me is I can display result (which is a String) in a TextField fine with seemingly no funny characters but I just can’t convert it into the Text datatype.
I had similar problems, but only in 64-bit builds and not in debug mode.
Because my shell output is simple, I was able to solve it by calling DefineEncoding with ASCII and then DefineEncoding with UTF-8 right at the start.
When I kept it as a String, I had issues with the Split command because Split somehow doesn’t work well (or is at least very unreliable) with strings in 64-bit. It seems OK at first, but I get weird results when I try to make it Text afterwards.
I’m having more issues with encoding, but only in 64-bit.
[quote=230962:@Kem Tekinay]False means, no matter what else the debugger says, it’s not really valid UTF-8 and that’s why it can’t be converted to Text.
Try:
result = result.DefineEncoding( Encodings.SystemDefault )
t = result.ToText
[/quote]
… meaning to use DefineEncoding.
But it seems you tried it with ConvertEncoding, hence the error message mentioning “conversion”.
So maybe this would work:
Dim t As Text = result.DefineEncoding(Encodings.UTF8).ToText
if s.Encoding = nil then
  if Encodings.UTF8.IsValidData(s) then
    s = DefineEncoding(s, Encodings.UTF8)
  else
    // some fallback
    s = DefineEncoding(s, Encodings.WindowsLatin1)
  end if
end if [/code]
I would do it like this and use a fallback encoding for 8-bit data. It could be WindowsLatin1, MacRoman, ISOLatin1 or something else.
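Since the original problem was a String that the debugger labelled as UTF-8 while the bytes were not actually valid UTF-8, it may help to validate the bytes regardless of the current encoding label before calling ToText. Here is a minimal sketch of that idea. The helper name StringToText and its fallback parameter are made up for illustration and are not part of Xojo; only IsValidData, DefineEncoding and ToText are framework calls.

[code]// Hypothetical helper: the name and the fallback parameter are assumptions.
// Labels s as UTF-8 when its bytes are valid UTF-8, otherwise as the
// supplied 8-bit fallback encoding, then converts it to Text.
Function StringToText(s As String, fallback As TextEncoding) As Text
  // Check the bytes themselves rather than trusting s.Encoding,
  // which only reports how the String is currently labelled.
  If Encodings.UTF8.IsValidData(s) Then
    s = DefineEncoding(s, Encodings.UTF8)
  Else
    s = DefineEncoding(s, fallback)
  End If
  Return s.ToText
End Function[/code]

It would be called as resultText = StringToText(result, Encodings.WindowsLatin1). Note that the fallback only relabels the bytes. If the output contains raw binary data, that part will come out as meaningless characters, and ToText may still fail if the fallback encoding does not define some of the byte values.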
It turns out that the contents of some of the XML tags were raw data bytes (e.g. image data) that had no encoding at all. I guess that’s why the conversion to the Text datatype was failing. The workaround was to use the command line tool to exclude any tags that are non-text.
I’d think that the tags with binary data would be easily identifiable and could just be skipped when parsing the XML in Xojo? I don’t know your particulars, so I’m not sure whether it would streamline your process or not. I figured it was at least worth mentioning.
ps - These aren’t DICOM files by any chance, are they? (shudders with bad memories)
You’re right about DICOM files, Anthony. After 16 years of medical training, I have never come across something so painful to deal with as that standard. The phrase “A camel is a horse designed by committee” comes to mind…