PDF generation status using UTF8

I use reports heavily in my projects, and I need Polish special characters in them. That is not a problem when printing. When I want a PDF document, I start a normal print job, choose to save the printout as PDF from the user interface, and everything is fine.

Printing directly to PDF in Xojo works, but UTF8 characters, of course, do not. In the code below, ‘rpt’ is an existing report and ‘rs’ is a RecordSet.

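// render the report into a PDFDocument instead of sending it to a printer
var dpdf as new PDFDocument
var g as Graphics = dpdf.Graphics

if rpt.Run(rs,dpdf) then
  // draw the finished report pages into the PDF's graphics context
  rpt.Document.Print(g)
  var f as FolderItem = SpecialFolder.Documents.Child("PDFpreview.pdf")
  dpdf.Save(f)
end if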

Unfortunately, I have many such reports in the project. I know there are plugins that generate PDFs correctly with UTF8, but do any of them let me capture this kind of report printing directly to PDF? Otherwise I will have a lot of work recreating all these reports in a new way, which I definitely do not want to do, and would not have to do if Xojo’s PDFDocument had handled UTF8 correctly in the first place…

Our DynaPDFMBS class provides a PageGraphics object that lets you render reports to PDF.

See sample project here

if rpt.Run(rs,dpdf) then

Are the field values in the data source table in UTF-8?

Yes, they are. But even text labels on the report page that contain UTF8 characters come out wrong in the generated PDF.

Do you explicitly tell rs that the contents are UTF8? Otherwise, they might actually be something else.
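To illustrate why the declared encoding matters, here is a minimal sketch in Python (not Xojo, since the point is about bytes and encodings rather than any Xojo API). The sample string is an assumption chosen for illustration. It shows the same bytes read once as UTF-8 and once as a one-byte charset:

```python
# Sample Polish text (an assumption for illustration), stored as UTF-8 bytes,
# the way a database column declared as UTF-8 would hold it.
original = "zażółć gęślą jaźń"
raw = original.encode("utf-8")

# Correct: the bytes are interpreted as UTF-8.
as_utf8 = raw.decode("utf-8")

# Wrong: the same bytes are interpreted as Latin-1, where every byte
# becomes exactly one character, so each Polish letter turns into two.
as_latin1 = raw.decode("latin-1")

print(as_utf8 == original)    # compare the correctly decoded text
print(as_latin1 == original)  # compare the misread text
print(len(original), len(as_latin1))
```

If the strings coming out of the record set carry no encoding, or the wrong one, any later conversion starts from the misread form.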

UTF8 characters in Xojo PDF are a known issue for some accented characters and non-Roman languages:

Output should be

‣ Accents éêà
 • Cçćč
 • Eèéêëėę
 • Iìíîïı
 • Nñń
 • Oòóôõöøōœ
 • Sßśš
 • Uùúûü

And this is the Output for Ukrainian / Russian

That is exactly a good demonstration of the problem. Even text labels placed on the report page show up as ??? when they contain special characters and the PDF gets generated…

Here are some open issues about this problem:

https://tracker.xojo.com/xojoinc/xojo/-/issues/73611

https://tracker.xojo.com/xojoinc/xojo/-/issues/60248

Var d As New PDFDocument
d.Language = "pl-PL"
Var g As Graphics = d.Graphics
g.FontSize = 20
g.FontName = "Helvetica"

// utf-8
var t As String = "ą, ć, ę, ł, ń, ó, ś, ź, ż"

// convert to 1 byte only extended charset
// trying to circumvent the 1 byte charset limit of Xojo PDFDocument
t = t.ConvertEncoding(Encodings.ISOLatin2)

g.DrawText(t, 20, 60, 400)

d.Save(SpecialFolder.Desktop.Child("test.pdf"))

// Still wrong

quit

I believe this could be done using the current limits. @Javier_Menendez Am I wrong?

I know that DrawText works, but it cannot be used to render existing reports:

  • no use in labels’ text
  • no use in database fields…

The only way would be to rework all reports so that everything is based on DrawText, which is something I don’t want to do…

It’s not working. Check my sample.

Once Xojo makes it work with a Latin2 charset (which is Polish-compatible), changing the report to convert its string sources to Latin2 on the fly could solve your problem.

AFAIR, g.DrawText into an image works when the image is put onto the report. But that still exports an image to the PDF rather than nice, clean PDF text.

Yes, into a bitmap it works, because that is not writing text (inserting “letters” as in a document such as a PDF) but drawing graphics, setting bits on and off in a bitmap.

PDF can handle UTF-8 if fully implemented, but the Xojo version only handles maps of 255 chars per charset, so many symbols can’t be translated to whatever map Xojo uses (Latin1?). If we could “define” that map as Latin2 in your case, it could solve your presentation problem. “Drawing text” in a PDF is just inserting a text object at some place in the document, not drawing bits. That’s how your report does it behind the scenes.
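The one-byte map limitation can be checked outside Xojo. The following Python sketch is only an illustration of the charset argument, not of how Xojo works internally. The helper name `representable` is hypothetical. It tests which of the Polish letters from the sample above fit into Latin-1 and which fit into Latin-2:

```python
# The Polish letters used in the sample earlier in this thread.
polish = "ąćęłńóśźż"

def representable(text, codec):
    """Return the characters of `text` that `codec` can encode (hypothetical helper)."""
    kept = []
    for ch in text:
        try:
            ch.encode(codec)
            kept.append(ch)
        except UnicodeEncodeError:
            # This character has no slot in the one-byte map.
            pass
    return "".join(kept)

latin1_ok = representable(polish, "latin-1")    # ISO 8859-1
latin2_ok = representable(polish, "iso8859_2")  # ISO 8859-2

print("Latin-1:", latin1_ok)
print("Latin-2:", latin2_ok)
```

If the internal map really is Latin-1, this would be consistent with almost every Polish letter being replaced in the output, while a Latin-2 map would cover the whole set.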

I set PDF.Language to “pl-PL”.
I put a label on the report and, in the BeforePrint event, set its text property to text containing Polish characters after converting the encoding to ISOLatin2. No change: “???” instead of Polish characters.

I think you are not reading what I wrote. The summary is:

Xojo PDF currently does not support Polish.

My post:

asks Javier at Xojo whether he can implement a way of matching the internal Xojo map to the user’s charset, so that Latin2 would solve your case.

Again, Xojo PDF is currently not fully internationally compatible, which affects a lot of Asian, African and Eastern European languages, including Polish.

I didn’t read that part carefully enough; your conclusion is actually right.

The PDF output needs to be able to cope with anything UTF8 can throw at it in one single document at the same time. My application can do that, and Apple’s Save as PDF can do that. The PDF class needs to be able to cope with it.
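A rough Python sketch of why switching to a different one-byte charset cannot satisfy this requirement: the mixed sample string and the list of candidate charsets are assumptions picked for illustration, and `codecs_covering` is a hypothetical helper.

```python
# One string mixing Polish, Ukrainian and French characters (illustrative sample).
mixed = "Zażółć, Привіт, éêà"

# A few common one-byte charsets for Western, Central European and Cyrillic text.
candidates = ["latin-1", "iso8859_2", "iso8859_5", "cp1250", "cp1251", "koi8_u"]

def codecs_covering(text, codecs):
    """Return the codecs from `codecs` that can encode all of `text` (hypothetical helper)."""
    covering = []
    for name in codecs:
        try:
            text.encode(name)
            covering.append(name)
        except UnicodeEncodeError:
            # At least one character is missing from this charset.
            pass
    return covering

covering = codecs_covering(mixed, candidates)
print("one-byte charsets that fit:", covering)

# UTF-8 round-trips the whole string without loss.
print(mixed.encode("utf-8").decode("utf-8") == mixed)
```

A Latin2 map would help a Polish-only report, but a document that mixes scripts needs real Unicode text support in the PDF output.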
