Odd character output from DateTime.ToString on macOS Sonoma

Mark_Franken · October 3, 2023, 3:04am

With the following code in the Window.Opening event of a new app, I output the current time to a TextArea using both the new DateTime class and the old Date class to test:

Var d1 As DateTime = DateTime.Now
Var DateStrNew As String = d1.ToString(Locale.Current, DateTime.FormatStyles.None, DateTime.FormatStyles.Short)

Var d2 as New Date
Var DateStrOld As String = d2.ShortTime

TextArea1.Text = DateStrNew + EndOfLine + DateStrOld

I build this app as a Universal (Intel/Arm) app.

When I run this app on Intel MBP macOS 10.15.7 I get the following output for both date objects:
1:37 pm
Loading this text into an app to show the bytes I see:
31 3A 33 37 20 70 6D

When I run this app on an M1 Mac mini with macOS Sonoma (natively and via Rosetta) I get the following output for both date objects:
1:37 pm
Loading this text into an app to show the bytes I see:
31 3A 33 37 E280AF 70 6D
The space character is represented by E280AF. I’m no expert on character codes, but this is breaking my app, which exports the date and time in the footer of my PDF using DynaPDF, because DynaPDF doesn’t understand what this character is.
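
For anyone wanting to check the bytes without a separate hex viewer, a minimal sketch along these lines should do it. It assumes the same TextArea1 as above and that the global EncodeHex function is available in your Xojo version; the variable names are only illustrative.

```xojo
// Sketch: show the formatted time together with its raw bytes in hex.
Var d As DateTime = DateTime.Now
Var timeText As String = d.ToString(Locale.Current, DateTime.FormatStyles.None, DateTime.FormatStyles.Short)

// Passing True as the second argument asks EncodeHex to put a space between bytes.
Var hexDump As String = EncodeHex(timeText, True)

TextArea1.Text = timeText + EndOfLine + hexDump
```

A plain space shows up as 20 in the dump; on Sonoma the sequence E2 80 AF appears in its place.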

Can anyone reproduce this? Is this intentional or shall I log this as a Xojo bug?

Many thanks

Xojo 2021r3.1
MBS 22.1

Beatrix_Willius · October 3, 2023, 3:08am

End of line is 0A. Not sure why you think that this is a problem.

Eric_Williams · October 3, 2023, 3:14am

The space between the time and “pm” is a Unicode narrow no-break space. This is a typesetting feature that tells the rendering system “don’t break up these two pieces of text even though they have a space between them”. So you’ll always see:

1:09 pm

and never

1:09
pm

You can safely replace it with a single space for use in your PDF:

pdfText = ReplaceAll(timeText, Chr(&hE280AF), " ")

Eric_Williams · October 3, 2023, 3:16am

As an aside to the Xojo engineers working on PDF support: non-breaking spaces and non-breaking hyphens are very useful in typesetting and should be a part of the PDF functionality.

Mark_Franken · October 3, 2023, 3:16am

Only that the 20 (space) character is represented by E280AF, not 0A.

Mark_Franken · October 3, 2023, 4:34am

I found I had to update this to the following to get it to work:
pdfText = ReplaceAll(timeText, Encodings.ASCII.Chr(&hE280AF), " ")

Thanks for the explanation.

Eric_Williams · October 3, 2023, 5:30am

I’m frankly shocked that works. If anything, it should be:

pdfText = ReplaceAll(timeText, Encodings.UTF8.Chr(&hE280AF), " ")

Does that work?

Mark_Franken · October 3, 2023, 5:42am

No. I tried Encodings.UTF8 first as well but that doesn’t work. I spent a good hour trying to get it to work, then by mistake tried Encodings.ASCII and that works.

Eric_Williams · October 3, 2023, 6:03am

What is the encoding of the source string?

Mark_Franken · October 3, 2023, 6:09am

UTF8. Created with the following code:

Var d1 As DateTime = DateTime.Now
Var DateStrNew As String = d1.ToString(Locale.Current, DateTime.FormatStyles.None, DateTime.FormatStyles.Short)

kevin_g · October 3, 2023, 6:50am

That wouldn’t work.

Encodings.UTF8.Chr expects the Unicode code point but you are passing the UTF8 byte sequence.

I think you would need to pass &h202F.
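
To see how the code point and the bytes relate, here is a rough sketch of the three-byte UTF-8 layout worked through for &h202F. It only covers code points from U+0800 to U+FFFF, and TextArea1 is assumed from the original post.

```xojo
// Sketch: derive the UTF-8 bytes for a code point in the range U+0800..U+FFFF.
// Three-byte layout: 1110xxxx 10xxxxxx 10xxxxxx
Var cp As Integer = &h202F

Var b1 As Integer = &hE0 Or (cp \ &h1000)           // top 4 bits of the code point
Var b2 As Integer = &h80 Or ((cp \ &h40) And &h3F)  // middle 6 bits
Var b3 As Integer = &h80 Or (cp And &h3F)           // low 6 bits

// The three values are &hE2, &h80 and &hAF, the bytes seen in the dump.
TextArea1.Text = Hex(b1) + " " + Hex(b2) + " " + Hex(b3)
```

So E280AF is the encoded form of the character, while Chr wants the number 202F that those bytes encode.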

Mark_Franken · October 3, 2023, 7:48am

Many thanks, the following works on macOS Sonoma:
pdfText = ReplaceAll(timeText, Encodings.UTF8.Chr(&h202F), " ")

http://www.unicode-symbol.com/u/202F.html
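
If the same cleanup is needed in several places, it could be wrapped in a small helper. This is only a sketch: PlainSpaces is a hypothetical name, not part of Xojo or DynaPDF, and handling the ordinary no-break space (U+00A0) as well is an assumption on my part, since other formatted strings might contain it.

```xojo
// Hypothetical helper (not part of Xojo or DynaPDF): replace no-break spaces
// with ordinary spaces before handing text to the PDF code.
Function PlainSpaces(s As String) As String
  Var narrowNoBreak As String = Encodings.UTF8.Chr(&h202F) // NARROW NO-BREAK SPACE
  Var noBreak As String = Encodings.UTF8.Chr(&h00A0)       // NO-BREAK SPACE (assumed extra case)

  Var result As String = ReplaceAll(s, narrowNoBreak, " ")
  result = ReplaceAll(result, noBreak, " ")
  Return result
End Function
```

Usage would then be pdfText = PlainSpaces(timeText). Note that this gives up the non-breaking behaviour, which is fine for a one-line footer but may matter in wrapped text.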

Eric_Williams · October 3, 2023, 1:25pm

Right! Thank you.

Christian_Schmitz · October 3, 2023, 3:31pm

We are looking into helping with this in DynaPDF by doing automatic replacements.

Like if the text contains a non-breaking space, but the current font doesn’t have such a character, we could just use a normal space instead.

Eric_Williams · October 3, 2023, 4:09pm

I hope the DynaPDF engine would respect the non-breaking aspect of the character…?

TimStreater · October 3, 2023, 5:37pm

Yes, that’s a Unicode character in UTF-8 encoding. I would have expected that any app producing a PDF would understand UTF-8. Otherwise how is it going to handle the many symbols not in ASCII?

Eric_Williams · October 3, 2023, 6:48pm

A quick search of the forum will reveal that Unicode support in PDFs is a work in progress at Xojo.

Christian_Schmitz · October 3, 2023, 8:22pm

Yes, it does the non-breaking part.

But when it comes to rendering the character, it looks in the current font to see how to draw it. And yes, some fonts could define a visible symbol to draw there!

If the font doesn’t have the character, it reports an error. And that part is something Jens may change.