Asian characters don't render in PDF reports

Bug #830831 reported by technick on 2011-08-22
This bug affects 5 people
Affects: SchoolTool. Importance: High. Assigned to: Douglas Cerna.
Affects: schooltool (Ubuntu). Importance: Undecided. Assigned to: Unassigned.

Bug Description

PDF export of Asian characters doesn't work correctly. This has been verified with Hangul (Korean), Japanese, and Chinese text.

The PDF seems to contain the correct text data, but all Asian characters render as square boxes.
Note: you can copy the text out of the PDF and paste it into a text editor, so this looks like a font issue.

See the attached screen-shot.

To reproduce:
1) Add three students with names 이름 (Korean), 名前 (Japanese), 名称 (Chinese).
2) Save a PDF report that includes these students (e.g. create some absences in a section for these students and view the absences for section report).

technick (nickfolse-gmail) wrote :
Douglas Cerna (replaceafill) wrote :

Correct, this is a font issue. We only embed Liberation* fonts in the PDF. To display other languages correctly we would need to embed the appropriate fonts based maybe on the request locale or the language setting. I couldn't find a dynamic way to do it while working with Khmer.
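
A minimal sketch of the locale-based selection described above. Every name here (DEFAULT_FONT_MAP, FONT_MAPS_BY_LANGUAGE, pick_font_map) is hypothetical and not taken from SchoolTool; the Liberation file names are an assumption about what the default map points at, and the Korean entries are borrowed from the workaround later in this thread.

```python
# Hypothetical sketch: none of these names exist in SchoolTool.
# Assumed default: the Liberation fonts that are already embedded.
DEFAULT_FONT_MAP = {
    'Arial_Normal': 'LiberationSans-Regular.ttf',
    'Arial_Bold': 'LiberationSans-Bold.ttf',
}

# One font map per language code; only Korean is filled in here.
FONT_MAPS_BY_LANGUAGE = {
    'ko': {
        'Arial_Normal': 'UnDotum.ttf',
        'Arial_Bold': 'UnDotumBold.ttf',
    },
}


def pick_font_map(locale_id):
    """Return the font map for a locale id such as 'ko_KR' or 'en-US'."""
    # Keep only the language part: 'ko_KR' -> 'ko', 'en-US' -> 'en'.
    language = (locale_id or '').replace('-', '_').split('_')[0].lower()
    # Fall back to the default map for languages without an entry.
    return FONT_MAPS_BY_LANGUAGE.get(language, DEFAULT_FONT_MAP)
```

Where the locale id would come from (request locale or a site-wide language setting) is left open, as it is in the comment above.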

Tom Hoffman (tom-hoffman) wrote :

We'll have to come up with some kind of strategy for this, even if it is just creating separate language packs for reports with different character sets.

technick (nickfolse-gmail) wrote :

As a temporary workaround I added the following mapping to my app/pdf.py file

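# pdf.py
# Temporary workaround: point the font names used by the reports at the
# Korean Un fonts. The italic entries reuse the upright files.
font_map_kr = {
    'Arial_Normal': 'UnDotum.ttf',
    'Arial_Bold': 'UnDotumBold.ttf',
    'Arial_Italic': 'UnDotum.ttf',
    'Arial_Bold_Italic': 'UnDotumBold.ttf',
    'Times_New_Roman': 'UnBatang.ttf',
    'Times_New_Roman_Bold': 'UnBatangBold.ttf',
    'Times_New_Roman_Italic': 'UnBatang.ttf',
    'Times_New_Roman_Bold_Italic': 'UnBatangBold.ttf',
}

# Use the Korean map in place of the default one.
font_map = font_map_kr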

Then I updated schooltool.conf to point to the unfonts directory:
reportlab_fontdir /usr/share/fonts/truetype/unfonts

I believe these are the Korean fonts installed in Ubuntu when you enable Korean language input.
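
Since the workaround only helps if the mapped files are actually present in the configured directory, a quick check along these lines may be useful. This is a sketch, not part of SchoolTool: missing_fonts is a hypothetical helper, and it only assumes that the font map values are file names relative to the reportlab_fontdir directory.

```python
import os


def missing_fonts(font_map, font_dir):
    """Return the sorted file names from font_map not found in font_dir."""
    return sorted(
        name for name in set(font_map.values())
        if not os.path.isfile(os.path.join(font_dir, name))
    )
```

For example, missing_fonts(font_map_kr, '/usr/share/fonts/truetype/unfonts') would return an empty list when all four Un font files are installed.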

Changed in schooltool:
status: New → Confirmed
importance: Undecided → High
Tom Hoffman (tom-hoffman) wrote :

We need a strategy for this. It is ok if it takes the user a few steps -- that's much better than just throwing up our hands.

Gediminas Paulauskas (menesis) wrote :

The font map (see Comment #4) could be moved to schooltool.conf

Also, make PDFs use only one font family, either serif or sans-serif, not both. That will make it simpler.
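
One way the font map could be read from the configuration file, as a rough sketch. The reportlab_font directive and the parse_font_map helper are invented for illustration; only reportlab_fontdir appears in this thread, and the real schooltool.conf schema may need a different mechanism entirely.

```python
def parse_font_map(conf_text, directive='reportlab_font'):
    """Collect '<directive> <font name> <file name>' lines into a dict.

    The directive name is hypothetical; it is not an existing
    schooltool.conf option.
    """
    font_map = {}
    for raw_line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split('#', 1)[0].strip()
        parts = line.split()
        # Ignore blank lines, other directives, and malformed entries.
        if len(parts) == 3 and parts[0] == directive:
            font_map[parts[1]] = parts[2]
    return font_map
```

With a single font family, as suggested above, the configuration would shrink to four such lines (normal, bold, italic, bold italic).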

Changed in schooltool:
assignee: nobody → Gediminas Paulauskas (menesis)
status: Confirmed → Triaged
milestone: none → 2.1
Tom Hoffman (tom-hoffman) wrote :

OK... is that the plan? Objections?

Justas Sadzevičius wrote :

There's still a minor problem with word-wrap. It needs to be set to CJK when using these fonts. Mixed case (CJK + non-CJK text) is not handled by default in reportlab, IIRC.
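
For reference, ReportLab's ParagraphStyle has a wordWrap attribute that accepts the value 'CJK'. A possible way to decide when to set it is sketched below; contains_cjk and word_wrap_for are hypothetical helpers, and the code point ranges cover only the most common Korean, Japanese, and Chinese blocks.

```python
# Common CJK Unicode blocks (not exhaustive).
CJK_RANGES = (
    (0x1100, 0x11FF),  # Hangul Jamo
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
)


def contains_cjk(text):
    """Return True if any character of text falls in CJK_RANGES."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in CJK_RANGES)


def word_wrap_for(text):
    """Return 'CJK' for text with CJK characters, otherwise None."""
    return 'CJK' if contains_cjk(text) else None
```

The result could be passed as ParagraphStyle(name, wordWrap=word_wrap_for(text)). This does not solve the mixed CJK + non-CJK case mentioned above: any CJK character switches the whole paragraph to CJK wrapping.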

Set... where?

2011/11/29 Justas Sadzevičius <email address hidden>:
> There's still a minor problem with word-wrap.  It needs to be set to CJK
> when using these fonts.  Mixed case (CJK + non-CJK text) is not handled
> by default in reportlab, IIRC.

Somewhere in our code. Because of some setting we're yet to implement. Or maybe there's a workaround.

Changed in schooltool:
milestone: 2.1.0 → next
Lyle Kozloff (lkozloff) wrote :

Uninformed opinion follows - I haven't looked at the code at all.

Rather than exporting to PDF where we need to worry a lot about fonts, what about styling the reports with print-specific CSS? It's reasonable to assume that if someone is using the site in an Asian language (or other complex script) that it's already rendering properly in their browser.

It seems like it would be easier for a layman to style the reports and take care of the font-embedding problem.

Tom Hoffman (tom-hoffman) wrote :

Yes, that is certainly possible. In general we haven't emphasized the printable web page route because for more formal reports (like ones that are sent home) you really need to be able to do proper page layout, but there is also a place for easier/less formal tabular reports which might be good enough just with print CSS.

Changed in schooltool:
assignee: Gediminas Paulauskas (menesis) → Douglas Cerna (replaceafill)
Tom Hoffman (tom-hoffman) wrote :

Thanks for the suggestion, we'll take a look at WeasyPrint, especially since it is written in Python. The good thing is we wouldn't need to make an either/or decision. Perhaps we'll start making some new things with it; we wouldn't literally have to ditch ReportLab.

In practice it might be less of a clear win than you'd think, because for formal reports we'd also have to learn a lot about, say, the CSS Paged Media Module, which may end up being as complicated as ReportLab.

But yes, we'll check it out.

Tom Hoffman (tom-hoffman) wrote :

Actually, the biggest problem might be the lack of an Ubuntu package, which we'd have to make ourselves.

nedosa (nedosa) wrote :

Could it not be installed as a Python dependency in schooltool's setup.py?

As for its usage, you're absolutely right, paging can be a pain, but I find working with CSS, a declarative language designed for layout, more amenable than the manual manipulation of page blocks in ReportLab.

Tom Hoffman (tom-hoffman) wrote :

*Python* packaging isn't a problem (eggs, setup.py), but we'd have to have Ubuntu packaging ourselves.

Tom Hoffman (tom-hoffman) wrote :

That is, we'd have to do the Ubuntu packaging ourselves.

Changed in schooltool:
status: Triaged → Opinion
description: updated
Changed in schooltool:
status: Opinion → Triaged
Changed in schooltool:
milestone: next → none
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in schooltool (Ubuntu):
status: New → Confirmed
Ross Gammon (rosco2) on 2015-03-01
affects: ubuntu → schooltool (Ubuntu)
Changed in schooltool (Ubuntu):
status: New → Confirmed
Daniel Owens (dh-owens) wrote :

This problem also affects Vietnamese. I solved it by following the above workaround and using DejaVu Sans. But upgrading SchoolTool erases such modifications. Is there a way that substitute fonts that render correctly in PDF export could be added through the web interface?
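
A guess at what the DejaVu Sans variant of the workaround might look like; the exact map used here was not posted. The keys are copied from the Korean map earlier in the thread, the file names are the ones the DejaVu project ships, and reportlab_fontdir would need to point at whichever directory holds them on the system in question.

```python
# pdf.py
# Assumption: same keys as the Korean workaround, all pointed at DejaVu Sans,
# so serif and sans-serif report text end up in a single family.
font_map_dejavu = {
    'Arial_Normal': 'DejaVuSans.ttf',
    'Arial_Bold': 'DejaVuSans-Bold.ttf',
    'Arial_Italic': 'DejaVuSans-Oblique.ttf',
    'Arial_Bold_Italic': 'DejaVuSans-BoldOblique.ttf',
    'Times_New_Roman': 'DejaVuSans.ttf',
    'Times_New_Roman_Bold': 'DejaVuSans-Bold.ttf',
    'Times_New_Roman_Italic': 'DejaVuSans-Oblique.ttf',
    'Times_New_Roman_Bold_Italic': 'DejaVuSans-BoldOblique.ttf',
}

# Use the DejaVu map in place of the default one.
font_map = font_map_dejavu
```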
