HTMLViewer encoding

Hi,

HTMLViewer is unable to render UTF-8 encoded text correctly. The characters appear as garbage.
Is this a known issue?

htmlText = ""
htmlText = htmlText + "<meta http-equiv=""Content-Type"" content=""text/html; charset=utf-8""> "
htmlText = htmlText + "here goes the utf-8 text: Ÿà®µà®£à¯ˆ" + ""
htmlViewer1.LoadPage(htmlText, f)

Nope, your code works just fine here. Are you using Windows or Mac? If Windows, which renderer? Which version of Xojo do you use? What does the result of your code look like?

Mac? Win? Linux?
On Mac it may default to UTF-8, but maybe not on Linux/Win.

You should read this: https://en.wikipedia.org/wiki/Character_encodings_in_HTML#Character_references

Use the EncodingToHTMLMBS function to encode to HTML properly.

Hi Christian,

Thank you for your answers. I am using a Windows 7, 64-bit machine.
From the above example, if I save the htmlText string to a file and view it using Firefox or IE, it renders correctly. (I am familiar with the UTF-8 vs. Unicode details.) If I load the HTML file directly from disk, the HTMLViewer renders the UTF-8 text correctly.

However, if I assemble the HTML page on the fly, the text does not render correctly.

I also tried the following, which does not work either:

[code] Dim f As FolderItem = GetTemporaryFolderItem
Dim t As String

t = TextArea1.Text.ConvertEncoding(Encodings.UTF8)
HTMLViewer1.LoadPage(t, f)[/code]

Could you explain a bit about EncodingToHTMLMBS? I could not find such a function in Xojo.

For that you need the MBS Plugins.

Hi Beatrix,

Here is how it looks:

What exactly is the text you want to display?

In your original post, what do you actually expect this code to show: " Ÿà®µà®£à¯ˆ"

Are you trying to have the browser render Unicode text? Would these characters be representing the bytes of accented characters or high codepoint symbols by any chance?

Something like this?

here goes the utf-8 text: ???

I had to create an HTML file with your example and feed it into Firefox, IE or Edge.

I did verify the HTMLViewer was not able to display these characters correctly, though. Seems to be a bug indeed.

OK. I tried various workarounds but did not succeed in displaying these characters.

Amazingly enough, dragging the file over the HTMLViewer does display them fine, but using LoadPage or LoadURL does not.

I even tried the Microsoft Web Browser ActiveX control, but Navigate leads to the same result.

I suspect the Xojo framework does not support higher ASCII, which is demonstrated by trying to display Roman accented characters such as éèç ù, which end up showing as garbage.
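For readers trying to reproduce this kind of garbage outside the HTMLViewer, here is a minimal sketch of one way it can arise. It assumes (this is an assumption, not something established at this point in the thread) that the UTF-8 bytes are being read as a single-byte encoding such as ISO-8859-1. The variable names are illustrative only.

[code]
// Sketch only: reproduces the garbling without any HTMLViewer involved.
// Assumption: the string literal is UTF-8, as Xojo source literals normally are.
Dim original As String = "éèçù"
Dim utf8Bytes As String = original.ConvertEncoding(Encodings.UTF8)

// Relabel the same bytes as ISO-8859-1 WITHOUT converting them.
Dim misread As String = DefineEncoding(utf8Bytes, Encodings.ISOLatin1)

// Convert to UTF-8 so the misread characters can be displayed as text.
Dim shown As String = misread.ConvertEncoding(Encodings.UTF8)

// Each accented letter now appears as two characters, e.g. "Ã©" for "é".
MsgBox(shown)
[/code]

If the output in the HTMLViewer matches this pattern, the data is probably intact and only the declared or assumed charset is wrong.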

This is indeed a bug, or at least a limitation that was not envisioned when implementing the code that supports loading HTML.

This is worth a bug report with a sample project IMHO.

Now, here is the workaround, directly inspired by the HTML entity section of the Wikipedia page I linked to earlier and advised reading.

Here is the code of the page I place into a constant named html (much easier than the cumbersome assembly described in the OP).

[code]

???
[/code]

Note that I simply entered the characters directly, without resorting to the unreadable two-byte Unicode text.

Now, in order to display the higher ASCII characters, they are transformed into HTML entities with the following method:

Private Function EncodeHTMLEntities(s as string) As string
  dim result as string
  for i as integer = 1 to len(s)
    dim ch as string = mid(s,i,1)
    if asc(ch) >127 then
      result = result+"&#"+str(asc(ch))+";"
    else
      result = result+ch
    end if
  next
  return result
End Function

The resulting code to display the html constant in the HTMLViewer Open event is:

Sub Open()
  dim f as FolderItem = GetTemporaryFolderItem
  me.LoadPage(EncodeHTMLEntities(html),f)
End Sub
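As a quick illustration of what the method produces, here is a hypothetical call. It assumes EncodeHTMLEntities is in scope and that the input string is UTF-8, so that Asc returns the Unicode code point of each character rather than a byte value.

[code]
// Hypothetical usage of EncodeHTMLEntities as posted in this thread.
// Assumption: the literal is a UTF-8 string.
Dim encoded As String = EncodeHTMLEntities("café")

// "é" is code point 233, so it should come out as the entity &#233;
// while the plain ASCII letters pass through unchanged.
MsgBox(encoded)
[/code]

Because the result contains only ASCII characters, it should display the same way whatever charset the page declares, which is why this workaround sidesteps the encoding question entirely.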

Thank you Michael. I was planning to write a function to encode. You have it working!

  • issue resolved -

I just filed a bug report:
<https://xojo.com/issue/41145>

I just changed the charset on the generated HTML to UTF-8, and things appear as expected for me on Windows 7 using the sample.
I made no other change.
Not sure why you’re setting it to ISO-8859-1 and giving it UTF-8 data.
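A minimal sketch of that approach follows: declare UTF-8 in the page and make sure the string handed to LoadPage really is UTF-8. This is not the sample project's actual code. The body text and variable names are placeholders, and whether this is sufficient on every platform and renderer is not established in this thread.

[code]
// Sketch only: keep the declared charset and the actual bytes in agreement.
Dim body As String = "éèçù" // placeholder text

Dim page As String
page = "<html><head>"
page = page + "<meta http-equiv=""Content-Type"" content=""text/html; charset=utf-8"">"
page = page + "</head><body>" + body + "</body></html>"

// Make sure the bytes are UTF-8, matching the charset declared above.
page = page.ConvertEncoding(Encodings.UTF8)

Dim f As FolderItem = GetTemporaryFolderItem
HTMLViewer1.LoadPage(page, f)
[/code]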

Thank you for that enlightening post. I should have thought about it before filing a bug report. The most amazing thing is that if a file containing that code with ISO-8859-1 encoding is dropped onto the HTMLViewer, it displays just fine. Likewise, the Mac version displays the characters fine using the OP’s code.