returning non-Ascii char from a shell

  1. Robert L

    Jun 9 Federal Way, WA (Seattle Area)

    I am trying to return a non-ASCII character (for example é) from a Python shell script.

    Dim s As String
    s = myShell.Result
    MsgBox(s)

    The Python script itself for this simple example is just

    someChar = 'e'
    print(someChar)

    The print command in Python should put the string variable someChar into the output buffer.
    The command s = myShell.Result in Xojo should put the output buffer into s.
    This works fine.
    s gets assigned "e"

    But if someChar is not in the ASCII range I get an error.

    someChar = 'é'
    print(someChar)

    for s I get an error code.
    -- UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 0: ordinal not in range(128)--

    The error message shows that it has "gotten" the é in some sense because
    u'\xe9' is a Unicode string that contains the unicode character U+00E9 (LATIN SMALL LETTER E WITH ACUTE).
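
    A minimal Python 3 sketch of where this message comes from, assuming that stdout in the launched process has fallen back to the 'ascii' codec. The helper name try_encode is made up for illustration. It shows that é is fine as a string inside Python and that the failure happens only when the string is encoded for output:

```python
# Sketch: reproduce the encoding failure without any shell involved.
# Assumption: the child process's stdout is using the 'ascii' codec.
def try_encode(text, codec):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as err:
        return str(err)

print(try_encode("e", "ascii"))   # plain ASCII encodes without trouble
print(try_encode("é", "ascii"))   # yields the "ordinal not in range(128)" message
print(try_encode("é", "utf-8"))   # UTF-8 can represent é as two bytes
```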

    I have tried in vain (I still get the error)

    s = myShell.Result.ConvertEncoding(Encodings.UTF8)

    Is there a way to return a character like é from a Shell script that is called from Xojo?

  2. James M

    Jun 9 Pre-Release Testers, Xojo Pro

    What happens if you DefineEncoding instead of ConvertEncoding?

  3. Robert L

    Jun 9 Federal Way, WA (Seattle Area)

    What happens if you DefineEncoding instead of ConvertEncoding?

    Same error message

  4. jim m

    Jun 9 Pre-Release Testers, Xojo Pro Phoenix, Arizona

    (I'm not a Python expert)
    Is Python using ascii encoding to read the script?

    Try adding an encoding declaration to the script?

    #!/usr/bin/python
    # -*- coding: utf-8 -*-

  5. Robert L

    Jun 9 Federal Way, WA (Seattle Area)

    Python is not reading a script in the scenario that I was trying to describe.
    Python uses utf-8 as the default and is happy to deal with strings such as cliché internally.

    If you use Terminal and fire up python3 within it, then commands like
    print("cliché")
    work perfectly (i.e. you see cliché displayed appropriately in the Terminal window).

    As I understand it, when you launch a Python script in a Shell in Xojo, anytime the print command is encountered in that Python script, what was "printed" is added to the output buffer. When the Python script completes, whatever ended up in that buffer is accessible to Xojo as

    myShell.Result -- takes whatever is in the output buffer.

    This allows Xojo to grab anything that was "printed" in the course of the Python script.

    But Xojo is not happy when that buffer contains characters like é.

    I am not a Python or a Xojo expert so I am always operating somewhat in the fog. :)

  6. Beatrix W

    Jun 9 Pre-Release Testers Europe (Germany)

    I remember Python being very picky about encodings. Have you tried to set the encoding in Python?

    error_string = 'error: delete_mail exception ' + str(current_mail - 1) + ' ' + str(result)
    error_string = error_string.encode('utf8')
    print(error_string)

  7. Robert L

    Jun 10 Federal Way, WA (Seattle Area)

    Beatrix: Your suggestion moves the needle.

    In Python in the test script, I had

    sDecoded = 'é'
    print(sDecoded)

    When I was back in Xojo and I tried to recover the output buffer (myShell.Result ), I would get an error.

    But if I do as you suggest and add a line in Python

    sDecoded = 'é'
    sDecoded = sDecoded.encode('utf8')       #Beatrix suggestion
    print(sDecoded)

    Then I no longer get an error. Rather myShell.Result returns

    b'\xc3\xa9'

    And you can capture this in a string variable without Xojo complaining.
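
    A short Python 3 sketch of why the result looks like this. Once the string is encoded it is a bytes object, and print() writes the text form of that object rather than the two bytes themselves. The variable names are only for illustration:

```python
# Sketch: why print() on an encoded value shows b'\xc3\xa9' in Python 3.
encoded = "é".encode("utf-8")   # a bytes object holding two bytes

# print() calls str() on its argument; for bytes that is the repr text,
# i.e. the literal characters b, ', \, x, c, 3 and so on.
shown = str(encoded)
print(shown)
print(len(encoded), len(shown))  # two bytes versus eleven characters of text
```

    So what reaches the output buffer is plain ASCII text describing the bytes, which is why no error is raised. Writing the bytes with sys.stdout.buffer.write(encoded) would send the two raw bytes instead.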

    Now the question is, how, in Xojo, can you get this back to 'é'

    I cannot find Xojo documentation on these peculiar strings that start with b'\x...'
    (somehow they are bytes and not strings; I can say these words without really understanding them :) )

    In Python, such things can be converted to regular strings

    desiredString=b'\xc3\xa9'.decode('utf-8')

    If you have fired up Terminal and are using it as a place to write Python code, type

    >>> b'\xc3\xa9'.decode('utf-8')
    and you will in fact get
    'é'

    But I do not know what command in Xojo takes
    b'\xc3\xa9'
    and pulls out
    é

  8. Beatrix W

    Jun 10 Pre-Release Testers Europe (Germany)

    Would brute force work? Discard the b and the ', do a split on "\x", do a join to a memoryblock, get the string of the memoryblock.
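
    The steps above, sketched in Python only to make the logic concrete (in Xojo the bytes would go into a MemoryBlock instead). The function name decode_bytes_repr is hypothetical, and the sketch assumes every byte shows up as a \xNN escape; plain ASCII characters in the text would appear unescaped and would need extra handling:

```python
def decode_bytes_repr(text):
    """Turn the printed form of a bytes value back into the string it encodes.

    Assumption: the input consists only of two-digit hex escapes.
    """
    inner = text.strip()[2:-1]                        # discard the leading b' and the trailing '
    hex_pairs = [p for p in inner.split("\\x") if p]  # split on the \x markers
    raw = bytes(int(p, 16) for p in hex_pairs)        # join the values into raw bytes
    return raw.decode("utf-8")                        # read the bytes as UTF-8 text

print(decode_bytes_repr(r"b'\xc3\xa9'") == "é")
```

    Within Python itself, ast.literal_eval would parse the same text more generally, but that does not help on the Xojo side.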

  9. Michel B

    Jun 10 Pre-Release Testers RubberViews.com

    Which platform is it ? Mac ?

  10. Robert L

    Jun 10 Federal Way, WA (Seattle Area)

    Which platform is it ? Mac ?

    Should have specified
    Mac OS 10.13.5
    Xojo 2018 r1.1

  11. Jason P

    Jun 11 Xojo Inc http://xojo.com/

    @RobertLivingston for s I get an error code.
    -- UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 0: ordinal not in range(128)--

    This appears to be an error from Python. It's definitely not a Xojo or shell error message. Remember that a shell is *not* the equivalent of a Terminal window. Terminal runs a number of scripts when you open a new one - a shell does not so the runtime environment is different.
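
    One way to see the environment difference from the Python side is to ask a freshly started interpreter which encoding it chose for stdout. This is a sketch, not something tested against Xojo's Shell class. The helper name child_stdout_encoding is made up; PYTHONIOENCODING is a standard CPython environment variable that overrides the stdout encoding:

```python
import os
import subprocess
import sys

def child_stdout_encoding(overrides):
    """Start a fresh interpreter with extra environment variables and
    report which encoding it picked for stdout."""
    env = dict(os.environ)   # start from the current environment
    env.update(overrides)    # then apply the overrides being tested
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env, stdout=subprocess.PIPE, check=True,
    )
    return result.stdout.decode("ascii").strip()

print(child_stdout_encoding({}))                              # depends on the environment
print(child_stdout_encoding({"PYTHONIOENCODING": "utf-8"}))   # forced to UTF-8
```

    If the first line reports an ASCII-based encoding when run from the shell but UTF-8 when run from Terminal, that would account for the different behaviour.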

  12. Robert L

    Jun 11 Federal Way, WA (Seattle Area)

    This appears to be an error from Python. It's definitely not a Xojo or shell error message.

    Beatrix got me past that error by adding a line in Python which prevented this problem.

    I can see what you are saying. Rather than what Xojo expected to find in the output buffer, there is this text explaining there have been problems. So it is not a bug in Xojo or even an error in my Xojo code. Xojo is just telling me that what I had expected/hoped for is not in the output buffer. Rather it is just sad news from Python. And that in turn is related to the fact that a shell is not Terminal, and what works in the latter need not be expected to work in the former.

    ****

    For now, I am stuck at this point. Xojo has grabbed b'\xc3\xa9' from the output buffer and there are no complaints of problems. I just do not see the easy way to go from b'\xc3\xa9' to é

    Beatrix suggests that I brute force it which seems sensible and I will try that if nobody can tell me a "simpler" way. I have paused because dealing with memory blocks is not something that I have done before and that is the brute force route apparently.

  13. Jason P

    Jun 11 Xojo Inc http://xojo.com/

    Here’s an example that works on macOS with no special handling of the result from python:

    https://blog.xojo.com/wp-content/uploads/2018/06/PythonShell.zip

    Note that the python script that's run just declares the output is utf-8.

  14. Robert L

    Jun 11 Federal Way, WA (Seattle Area)

    Note that the python script that's run just declares the output is utf-8.

    This is the Python script that I am calling something.py. I am using Python 3 rather than 2 but I assume that makes no difference

    sAccentedE = 'é'
    sAccentedE = sAccentedE.encode('utf8') #Assume that this is what is meant by declaring the output as utf-8.
    print(sAccentedE)

    I assume that when Jason says

    just declares the output as utf-8

    sAccentedE = sAccentedE.encode('utf8')

    is what is meant.

    This is the Xojo code that Jason provided

    Dim myshell As New Shell
    
    Dim f As folderitem = GetFolderitem("something.py")
    
    Dim s As String
    //myshell.Execute("/usr/bin/python " + f.ShellPath) // Jason's machine's path to Python - I would assume Python2
    myshell.Execute("/Users/owl/anaconda3/bin/python " + f.ShellPath) // my machine's path to Python3
    s = myShell.Result
    MsgBox(s)

    When I run this,

    b'\xc3\xa9'

    shows up in the MsgBox

  15. Tim H

    Jun 11 Pre-Release Testers Answer Portland, OR USA

    I think Jason means

    sys.stdout = codecs.getwriter("utf-8")(sys.stdout)

    or, in python 3.1 and later

    sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
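
    A small Python 3 sketch of what that wrapper does, using an in-memory byte stream in place of the real stdout so the result can be inspected. The function name write_utf8 is only illustrative:

```python
import codecs
import io

def write_utf8(text):
    """Write text through a UTF-8 stream writer and return the bytes produced."""
    raw = io.BytesIO()                        # stands in for sys.stdout.detach()
    writer = codecs.getwriter("utf-8")(raw)   # same wrapper as in the line above
    writer.write(text + "\n")                 # encoded to UTF-8 on the way out
    return raw.getvalue()

print(write_utf8("é"))   # the bytes that would land in the output buffer
```

    The string is encoded explicitly as UTF-8 regardless of what the environment says about stdout. On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") should have the same effect.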
  16. jim m

    Jun 11 Pre-Release Testers, Xojo Pro Phoenix, Arizona

    It doesn't work if your script looks like this?

    # -*- coding: utf-8 -*-
    sAccentedE = 'é'
    print(sAccentedE)

    That's what Jason is talking about I think. (See my previous post. I was thinking it was the encoding-in rather than out.)

  17. Robert L

    Jun 11 Federal Way, WA (Seattle Area)

    Thanks for chiming in Tim.

    Jason just made the mistake of thinking I was more savvy with Python than I am. :)

    To summarize what works in my test case.
    Here is the test python code. (Python3)
    In a file called something.py

    import sys
    import codecs
    sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach()) # Slightly different formulation for Python2
    sAccentedE = 'é'
    print(sAccentedE)

    Here is the Xojo code that wants to get the non-Ascii character (é) from Python.

    Const PATH_PYTHON3 = "/Users/owl/anaconda3/bin/python"
    Const SPACE = " "
    Dim myshell As New Shell
    Dim f As folderitem = GetFolderitem("something.py") // Needs to be in same folder with Xojo code
    Dim s As String
     // myshell.Execute("/usr/bin/python " + f.ShellPath) -  Common location of Python2 on Mac
    myshell.Execute(PATH_PYTHON3 + SPACE + f.ShellPath)
    s = myShell.Result
    MsgBox(s)

    é appears in the MsgBox which was my original desire in the original post.

    So Jason and Hare get the green check from me :)

  18. Michel B

    Jun 12 Pre-Release Testers RubberViews.com

    Seems you are far from being alone facing this issue :
    https://duckduckgo.com/?q=%5Cxc3%5Cxa9&ia=qa
