When I run ls -a in the Terminal, accented characters (if any) are displayed correctly.
When I run the same command with the Xojo Shell, I get question marks in place of the accented characters.
How can I set the Xojo Shell to output accented characters? I don’t see any place where I can set that.
[quote=102723:@Stanley Roche Busk]When I run ls -a in the Terminal, accented characters (if any) are displayed correctly.
When I run the same command with the Xojo Shell, I get question marks in place of the accented characters.
How can I set the Xojo Shell to output accented characters? I don’t see any place where I can set that.[/quote]
I could not reproduce the problem with the code below. How do you display the shell output? The encoding must be wrong.
Sub Action()
dim s as new Shell
s.execute("ls -a /Users/Mitch/Desktop")
TextArea1.Text = s.result
End Sub
That would work if s.ReadAll returned UTF-8, but it does not: I don’t get those weird mis-decoded UTF-8 characters, just plain question marks. I thought you could set the output encoding of the shell or of sh; as it stands, the shell output is simply broken.
The ? may appear when you define the encoding of a string wrongly. I do nothing of the sort here: I am looking at the raw output, and the raw ASCII output already contains ? before any encoding is defined or set. For example, if the text were ‘Phénomène’ and the raw text were UTF-8 with no encoding defined, I would get ‘PhÃ©nomÃ¨ne’, and I could simply define the encoding as UTF-8 to restore the text properly. This is not the case: I don’t get ‘PhÃ©nomÃ¨ne’ but ‘Ph?nom?ne’. You get a useless string, and there is no magic for converting ? back into the right accented characters. Just my opinion…
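To make the distinction above concrete, here is a minimal sketch in Python (not Xojo, and not the code anyone in this thread ran) contrasting the two failure modes: bytes that are intact but labelled with the wrong encoding, versus bytes that were already replaced by ? before you received them. The Latin-1 mislabelling is only an assumed example of a wrong encoding.

```python
original = "Phénomène"
utf8_bytes = original.encode("utf-8")

# Case 1: the bytes are intact, only the encoding label is wrong.
# Decoding UTF-8 bytes as Latin-1 yields mojibake, but no information is lost.
mislabeled = utf8_bytes.decode("latin-1")
# Re-labelling the same bytes as UTF-8 restores the text.
recovered = mislabeled.encode("latin-1").decode("utf-8")

# Case 2: the accented characters were replaced by "?" upstream.
# The original bytes are gone, so no encoding change can bring them back.
lossy = original.encode("ascii", errors="replace").decode("ascii")

print(mislabeled)
print(recovered)
print(lossy)
```

Case 1 is what defining the encoding can repair; case 2 is the situation described in this post.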
@Stanley Roche Busk - you are correct that the mode makes a difference. I’ve just examined what’s going on and it is true going all the way back to Real Studio 2010r1.
I have tested further using my own code and DataAvailable, with both s.Result and s.ReadAll. Indeed, each accented character appears as three bytes: one with the base character, followed by two question marks.
Then I did a bit of research on the Internet and found several contributions about the phenomenon, reported for several different platforms such as Python and Perl. No solution, though.
Finally, I sent the result of the shell command to a text file with
s.execute( "cd /Users/Mitch/Desktop ; ls -al > result.txt" )
Now comes the interesting part: a text-only editor such as TextWrangler shows the same thing as Xojo, a letter followed by question marks. That seems logical, but the same file opened in TextEdit shows the correct accented characters. Remember that the file was generated by the Unix command line interface and Xojo had no part in it.
I tried applying various ConvertEncoding calls to it, without success.
Conclusion: in the shell output, accented characters are coded as three bytes, and the output does contain the correct information. The question marks shown by the editor actually stand for different byte values. Looking at the text output in a hex editor shows that, for instance, é is represented by 65 CC 81 and ç by 63 CC A7.
So an output to file contains the proper accents. But not so for shell.result or shell.ReadAll, where the question marks are real question marks.
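For what it is worth, those byte sequences are valid UTF-8 for a base letter followed by a combining accent (the decomposed form). A small Python sketch, using only the two sequences quoted above, shows what they decode to; the `describe` helper is a hypothetical name introduced for illustration.

```python
import unicodedata

# The two byte sequences reported from the hex editor.
samples = {
    "e-acute": bytes([0x65, 0xCC, 0x81]),
    "c-cedilla": bytes([0x63, 0xCC, 0xA7]),
}

def describe(raw: bytes) -> list:
    """Decode raw bytes as UTF-8 and name each resulting code point."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in raw.decode("utf-8")]

for label, raw in samples.items():
    print(label, describe(raw))
```

Each sample decodes to two code points, a plain letter plus a combining mark, which would explain why a display that cannot render the combining mark shows a letter followed by placeholders.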
The workaround is therefore simple:
Send the output to a text file, as in the command I posted above.
Open the text file and load it with a series of replacements such as:
t = t.ReplaceAll(chr(&h65)+chr(&hCC)+chr(&h81),"é")
You will need to identify the encoding of each accented character in a hex editor, but this fixes the problem.
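As an alternative to building a replacement table by hand, the general technique is Unicode normalization to the composed form (NFC). The sketch below is in Python and only illustrates the idea; the listing content and the `compose_accents` name are made up for the example, and whether Xojo offers an equivalent normalization call is not something this thread establishes.

```python
import unicodedata

def compose_accents(raw: bytes) -> str:
    """Decode UTF-8 bytes and merge base letter + combining mark pairs (NFC)."""
    text = raw.decode("utf-8")
    return unicodedata.normalize("NFC", text)

# Hypothetical listing with decomposed accents, as a shell might write it to a file.
listing = b"Phe\xcc\x81nome\xcc\x80ne.txt\nFranc\xcc\xa7ais\n"
fixed = compose_accents(listing)

print(fixed)
# The composed text is shorter: each letter + accent pair became one code point.
print(len(listing.decode("utf-8")), "->", len(fixed))
```

This handles every accented letter in one pass, so there is no need to look up each sequence in a hex editor.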
[quote=103659:@Tim Jones]What if you use a TextInputStream to read the written file and apply the encoding there?
// knowing FolderItem "f"
Dim TIS As TextInputStream
TIS = TextInputStream.Open(f)
// assume it's ok
Dim theString As String = ConvertEncoding(TIS.ReadAll, Encodings.UTF16)
TIS.Close
TextArea1.SelText = theString[/quote]
It is a lot more vicious than that.
Not only does ConvertEncoding not work when loading the file content, but &hCC and &h81 get converted to &u00C3 and &u00C5, so the ReplaceAll I posted above does not work. On top of that, Xojo does not understand “é” correctly and turns it into eÍ. So much for generalized internal UTF-8. The proper code is:
te = te.ReplaceAll("e"+&u00C3+&u00C5,&u00E9)
I will continue with a binary stream, but for the time being this approach works, if a bit cumbersome.
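One possible explanation for those two code points, offered as an assumption rather than something confirmed in this thread: if the file's bytes were interpreted as MacRoman instead of UTF-8, the bytes CC and 81 map exactly to U+00C3 and U+00C5. A short Python sketch of that hypothesis:

```python
# Bytes for a decomposed "é" as found in the file.
raw = bytes([0x65, 0xCC, 0x81])

# Assumption: the reader treated the file as MacRoman rather than UTF-8.
as_macroman = raw.decode("mac_roman")
code_points = [f"U+{ord(c):04X}" for c in as_macroman]
print(code_points)

# Because MacRoman maps every byte to a character, the misreading is reversible:
# re-encode to get the original bytes back, then decode them as UTF-8.
repaired = as_macroman.encode("mac_roman").decode("utf-8")
print([f"U+{ord(c):04X}" for c in repaired])
```

If that is what happened, reading the file as raw bytes and defining them as UTF-8 avoids the detour entirely, which fits with the binary stream approach mentioned above.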
I regret the OP did not post his code, but I have worked with DataAvailable, and used a shell file as outlined just above.
Here is what to do: call the shell twice, first to create the file, then to trigger DataAvailable. Then use this:
[code]Sub DataAvailable()
  // Read and discard the shell's own output, just to empty the buffer
  Dim rien As String
  rien = Me.Result

  // Load the file that the first shell call wrote
  Dim readFile As FolderItem = GetFolderItem("/Users/Mitch/Desktop/text.txt", FolderItem.PathTypeShell)
  If readFile <> Nil And readFile.Exists Then
    Dim readStream As BinaryStream = BinaryStream.Open(readFile, False)
    readStream.LittleEndian = True
    // Read the whole file as raw bytes and define them as UTF-8
    TextArea1.Text = readStream.Read(readStream.Length, Encodings.UTF8)
    readStream.Close
  End If
End Sub
[/code]