ConvertEncoding…?

Paul_Bouvier · April 12, 2022, 8:47pm

I have developed a small experimental program as a demo of using PowerShell with Xojo. My problem is that I cannot convert the PowerShell encoding to Xojo's, so words with an accent (in Spanish) appear with the wrong encoding in Xojo's text area. I tried TextEncoding, but with wrong results. Any suggestions? Thanks in advance for your help!

Code in the OK button's Action event:

Dim sh As New Shell
Dim f As FolderItem = GetTemporaryFolderItem

' Run wmic and show its HTML table output in both controls
sh.Execute("wmic csproduct get /all /format:htable")
TextArea1.Text = sh.Result
HTMLViewer1.LoadPage(TextArea1.Text, f)

Rick_Araujo · April 12, 2022, 8:57pm

Try

<html>
<head>
 <meta charset="UTF-8">
</head>
<h3>
...

sh.Execute("wmic csproduct get /all /format:htable")
Var s As String = sh.Result.Replace("<html>", "<html><head><meta charset=""UTF-8""></head>")
HTMLViewer1.LoadPage(s, f)

Ian_Kennedy · April 12, 2022, 9:35pm

Even if you added the HTML5 “<!DOCTYPE html>” declaration at the start, there’s a problem: HTMLViewer doesn’t conform to the HTML5 standard and doesn’t default to UTF-8. The only solution is to add the meta tag, as Rick A suggests.

<https://xojo.com/issue/64259>

Tim_Hare · April 12, 2022, 10:44pm

Should you be using DefineEncoding?

Sascha_S · April 13, 2022, 6:56am

sh.Result.DefineEncoding(Encodings. …)

Michel_Bujardet · April 13, 2022, 11:19am

PowerShell and the command prompt should be using Encodings.WindowsANSI (CP-1252).

DefineEncoding should suffice.

Greg_O · April 13, 2022, 12:46pm

Just to be clear, the fact that you are putting it into a textarea first is part of your problem. The TextArea is going to try to coerce the text to be UTF8 if it has no encoding. You’d be better off just casting it first:

Dim data As String = DefineEncoding(sh.Result, Encodings.WindowsLatin1)
data = ConvertEncoding(data, Encodings.UTF8)
TextArea1.Text = data
HTMLViewer1.LoadPage(data, f)

Someone correct me if I’m wrong on the WindowsLatin1 thing. I’m typing this from memory.

Rick_Araujo · April 13, 2022, 1:48pm

There are a few things to observe in the OP’s sample:

1. He gets content formatted as HTML output, which hints at the possibility of standard UTF-8 output.
2. He takes that stream and moves it, unchanged, to a text box set to present UTF-8 content, and it shows the accented content correctly. So it seems to be native UTF-8, which corroborates the suspicion in #1.
3. The bad presentation in HTMLViewer shows a 2-byte coding of the accented character, compatible with UTF-8 coding, rather than a 1-byte extended charset such as Latin1 or CP-1252.

So I assumed the encoding was OK, but that the HTMLViewer engine needed to be told the encoding explicitly via the meta charset. That way, it would be presented correctly.
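The symptom in point 3 can be reproduced outside Xojo. The following is a minimal sketch in Python, used here only to illustrate the bytes involved; the sample word is an assumption, not taken from the actual wmic output. It shows that UTF-8 bytes misread as CP-1252 produce two wrong characters per accented letter, while CP-1252 bytes misread as UTF-8 produce a single replacement character, which many viewers draw as a question mark.

```python
# Illustration only (Python, not Xojo): how one accented character looks
# when its bytes are read with the wrong encoding.
text = "información"  # hypothetical sample word

utf8_bytes = text.encode("utf-8")      # 'ó' becomes two bytes: C3 B3
cp1252_bytes = text.encode("cp1252")   # 'ó' becomes one byte: F3

# UTF-8 bytes misread as CP-1252: two wrong characters per accented letter.
utf8_read_as_cp1252 = utf8_bytes.decode("cp1252")

# CP-1252 bytes misread as UTF-8: the lone F3 byte is invalid UTF-8, so a
# replacement character (U+FFFD) is substituted for it.
cp1252_read_as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")

print(utf8_read_as_cp1252)
print(cp1252_read_as_utf8)
```

Which of the two symptoms appears on screen is a quick way to guess the direction of the mismatch before choosing an encoding.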

Paul_Bouvier · April 13, 2022, 2:39pm

Thank you very much, Rick. This seems to be the way, but I keep getting an unknown character symbol (question mark). I’m afraid that PowerShell in Spanish is not very Xojo friendly, because even when I format the output in PowerShell with “format:list”, it doesn’t appear as a list in the text area either; everything runs together.

Paul_Bouvier · April 13, 2022, 2:40pm

Thank you very much, Ian!

Paul_Bouvier · April 13, 2022, 2:43pm

Thank you very much, Tim for the suggestion! Will try it as with Greg’s sample.

Rick_Araujo · April 13, 2022, 2:47pm

That ONE would output a Windows charset as people mentioned. I would define it as Latin1 as Greg said.

Paul_Bouvier · April 13, 2022, 2:49pm

Thank you very much, Greg. I’ll try your solution. I tried to do it with ConvertEncoding, but did not understand which encodings to use.

Paul_Bouvier · April 13, 2022, 2:51pm

Thanks a lot, Rick. Yes, Greg’s sample seems to be the solution.

TimStreater · April 13, 2022, 2:53pm

PowerShell will generate it in Latin1, but you need to tell your Xojo program that this is the case; it has no way of knowing otherwise. Hence the DefineEncoding. Then use ConvertEncoding to convert that to UTF8 - this will actually alter the bytes you have. Then you can put that into a TextArea.
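The difference between the two steps can be seen at the byte level. This is a rough sketch in Python rather than Xojo, and it assumes the shell output really is CP-1252; the sample bytes are hypothetical. Decoding plays the role of DefineEncoding (it only states how the bytes should be read), and re-encoding plays the role of ConvertEncoding (it produces different bytes).

```python
# Illustration only (Python, not Xojo). Assumes the shell output is CP-1252.
raw = b"Descripci\xf3n"  # hypothetical bytes as received from the shell

# Analogue of DefineEncoding: state how the bytes should be read.
# The original bytes in `raw` are not modified.
labelled = raw.decode("cp1252")

# Analogue of ConvertEncoding: produce new bytes in the target encoding.
utf8 = labelled.encode("utf-8")

print(raw.hex(" "))   # the accented letter is the single byte f3
print(utf8.hex(" "))  # the accented letter is now the two bytes c3 b3
```

If the label is wrong (for example, the output is actually in an OEM code page), the conversion still runs but yields the wrong characters, so the first step is the one to check.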

Paul_Bouvier · April 13, 2022, 2:57pm

Thanks a lot, Greg!

Paul_Bouvier · April 13, 2022, 2:58pm

Thank you very much, Tim!