Hebrew in TextArea not rendered correctly on Windows

Gert_Van_Assche · October 3, 2018, 3:49pm

Hi All,

I made a Windows application for people to select errors in plain text.
They highlight the error and place tags around it by pressing on a button.
For Hebrew, the text is not rendered correctly when the sentence also contains non-Hebrew characters: the order of the Hebrew and non-Hebrew runs gets mixed up.

How can I force the text to be read RTL (because this is where I think the problem comes from)?

Thanks

Gert_Van_Assche · October 4, 2018, 12:32pm

All,

The same dataset does not seem to cause a problem in a tool written in C#. I hope I won’t have to redevelop this tool in C#.
Surely I’m not the first one to run into this; I guess Arabic and Farsi would suffer from the same rendering issue.
Others have reported the same issue on the forum, but I haven’t found a solution yet. I hope someone has one…

thanks

Gert_Van_Assche · October 9, 2018, 8:34am

Hi all, I’m still struggling with this. Does anyone have a solution? – thanks

Louis_D · October 9, 2018, 6:29pm

Have you defined the text encoding? Non-Hebrew strings may well be plain ASCII (one byte per character), while Hebrew characters each take two bytes in UTF-8. The problem occurs when the system does not expect UTF-8 text. Define all strings as UTF-8 before assembling them and there is a good chance that all will be just fine. I am assuming that you have control over the source strings individually. If not, try to define the encoding as soon as you read the string from the source. If that does not work, you will have to resort to more elaborate techniques that I can’t think of right now.
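
To illustrate the encoding point, here is a minimal Python sketch (not Xojo) of how the same bytes read differently depending on the encoding declared for them. The sample strings are made up for illustration and are not from the original dataset.

```python
# Hypothetical sample strings, not taken from the original dataset.
latin = "abc"
hebrew = "\u05e9\u05dc\u05d5\u05dd"  # four Hebrew letters

# ASCII characters take one byte each in UTF-8; Hebrew letters take two.
print(len(latin.encode("utf-8")))   # 3
print(len(hebrew.encode("utf-8")))  # 8

# The raw bytes as they might arrive from a file or socket.
raw = (latin + " " + hebrew).encode("utf-8")

# Declaring the correct encoding recovers the original text.
text = raw.decode("utf-8")

# Declaring the wrong one (Latin-1 here) yields garbage for the Hebrew part.
mojibake = raw.decode("latin-1")
```

If the Hebrew part shows up as accented Latin characters, the encoding is the likely culprit. If the letters are right but their order is wrong, the problem is more likely bidirectional layout than encoding.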

I am not familiar with Hebrew, so perhaps yet another encoding spec is required. Xojo provides several that you could try if UTF-8 does not work.

Gert_Van_Assche · October 9, 2018, 6:38pm

Hi Louis, thanks for replying. Yes, the source is UTF-8. If I render the text as HTML (with the right attributes for language and writing direction), the rendering is correct, even for the BIDI texts. I did not test the other Xojo UI elements, but in the TextArea I don’t get it right. I don’t even see how to define the writing direction. I can set the alignment of course, but that’s not a solution.
I have the impression XOJO handles UTF-8 strings with BIDI writing correctly, but the TextArea does not show them correctly. At least not on Windows. I don’t know about Mac.
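
For reference, the HTML approach mentioned above could look roughly like this. It is a Python sketch that only builds the markup; the function name `wrap_rtl` and the choice of a `<p>` element are assumptions for illustration, while `lang` and `dir` are the standard HTML attributes for language and writing direction.

```python
from html import escape


def wrap_rtl(text: str, lang: str = "he") -> str:
    """Wrap text in a paragraph marked as right-to-left.

    wrap_rtl is a hypothetical helper; lang and dir are standard
    HTML global attributes.
    """
    return f'<p lang="{escape(lang)}" dir="rtl">{escape(text)}</p>'


# Hypothetical mixed-direction sample: Latin text followed by Hebrew letters.
print(wrap_rtl("abc \u05e9\u05dc\u05d5\u05dd"))
```

With `dir="rtl"` the browser treats the paragraph's base direction as right-to-left, which is the piece of information a plain text control has no obvious way to receive.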

Gert_Van_Assche · October 10, 2018, 11:27am

Maybe someone knows a plugin to handle RTL and BIDI texts correctly. Anyone? Please?

Beatrix_Willius · October 10, 2018, 11:34am

I had trouble with Hebrew text on the Mac a while ago. I don’t think the text was UTF-8 encoded, though. Have you tried the new framework?

Gert_Van_Assche · October 10, 2018, 11:39am

which framework?

Beatrix_Willius · October 10, 2018, 12:08pm

Sorry, my bad, didn’t read enough. You don’t have internal text. The new framework won’t help you there.

Damien_Callaghan · May 18, 2019, 3:15am

I wrote a XOJO app that gave me a method to type a line like this:

??? ??? ??? ?In the beginning God created

If I copied and pasted it into MS Word everything was ok, but if I then typed an English character in the middle of the Hebrew strange things would happen, for example,
??? G??? ??? ?In the beginning God created
You can see how the first Hebrew word is now the last word, and all I did was type a G before the Hebrew bara. It looks weird but it is correct. The G breaks the right-to-left sequence of Hebrew characters. After the G, the Hebrew continues right-to-left, so the last two Hebrew words now appear first (in the sequence of Hebrew words).
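
One way to see why the G splits the run is to look at the Unicode bidirectional class of each character. The Python sketch below only groups characters by class; it is not an implementation of the full Unicode Bidirectional Algorithm, and the three-letter Hebrew word is a stand-in sample, not the exact text from the line above.

```python
import unicodedata
from itertools import groupby


def strong_runs(text):
    """Group consecutive characters by Unicode bidi class ('L', 'R', 'WS', ...)."""
    return [
        (cls, "".join(chars))
        for cls, chars in groupby(text, key=unicodedata.bidirectional)
    ]


# Hypothetical sample: three Hebrew letters (bet, resh, alef).
hebrew_word = "\u05d1\u05e8\u05d0"

# Hebrew only: a single right-to-left run.
print(strong_runs(hebrew_word))

# With a Latin letter typed in front: a left-to-right run, then a right-to-left run.
print(strong_runs("G" + hebrew_word))
```

Each run is laid out in its own direction, and the order of the runs depends on the base direction of the paragraph, which is why one inserted Latin letter can move whole Hebrew words.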

It is some time since I played with this so I am going on old memories, but it has to do with the Unicode right-to-left (rtl) mark and rtl override characters (U+200F and U+202E). I played around and eventually worked out how to use the rtl codes to preserve things, but as I said it is quite a while since I did this and I never saved any hard code. I hope this helps you in some way.
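
A rough Python sketch of that idea, for anyone who wants to experiment: add directional control characters for display and strip them again before saving, so the stored text stays unchanged. U+200F and U+202E are the code points mentioned above; the embedding pair U+202B/U+202C is an addition of this sketch, and the helper names are hypothetical. Whether a given text control honours these characters is an assumption that needs testing.

```python
RLM = "\u200f"  # RIGHT-TO-LEFT MARK
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE (forces every character right-to-left)
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING (closes RLE or RLO)

BIDI_CONTROLS = {RLM, RLE, RLO, PDF}


def embed_rtl(text):
    """Mark text as a right-to-left embedding without reordering its characters."""
    return RLE + text + PDF


def strip_bidi_controls(text):
    """Remove the control characters again, e.g. before saving the text."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


# Hypothetical mixed-direction sample.
original = "abc \u05d1\u05e8\u05d0"
for_display = embed_rtl(original)
restored = strip_bidi_controls(for_display)
```

The embedding is used here rather than the override because the override would also force Latin letters to run right-to-left.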

Also, I found that text areas in Windows work better than text fields for languages such as Hebrew. Text fields are pretty well useless in my experience. Last time I tried they did not support Hebrew very well at all.

Gert_Van_Assche · May 18, 2019, 9:13am

Hi Damien, thanks for your remark. The text area is indeed better, but as I cannot alter the text (by adding RTL/LTR Unicode marks), I decided to no longer use XOJO for this project. I have now redone the whole development in NodeJS.

Geoff_Perlman · May 18, 2019, 11:21am

Hi Gert,

If you have a small sample project that demonstrates the problem, I’d really like to see it.

Geoff_Perlman · May 18, 2019, 12:17pm

[quote=437046:@Damien Callaghan]I wrote a XOJO app that gave me a method to type a line like this: [...][/quote]

I tried both TextField and TextArea on Windows 10 and on Mac using your original sample Hebrew phrase at the top. I copied and pasted a Hebrew character from the phrase into the phrase, and I also typed roman characters into the middle. On Windows, Xojo seemed to behave the same as Windows 10 Notepad.

However, the same was not true on the Mac. Apple’s Pages app did not behave the same way as Xojo. I’m not sure which is right or why there’s a difference. My findings can be found on the case: <https://xojo.com/issue/44742>.

Damien, if you can help us understand which behaviors are right and wrong, it would be greatly appreciated.