[Upstream] Writer mis-displays greek letters in symbol font importing docx

Bug #815983 reported by Cesar Avila
This bug affects 1 person
Affects               Status        Importance  Assigned to  Milestone
LibreOffice           Invalid       Medium
libreoffice (Ubuntu)  Fix Released  Medium      Unassigned

Bug Description

1) lsb_release -rd
Description: Ubuntu 11.04
Release: 11.04

2) apt-cache policy libreoffice-writer
libreoffice-writer:
  Installed: 1:3.3.3-1ubuntu2
  Candidate: 1:3.3.3-1ubuntu2
  Version table:
 *** 1:3.3.3-1ubuntu2 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty-proposed/main i386 Packages
        100 /var/lib/dpkg/status
     1:3.3.2-1ubuntu5 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty-updates/main i386 Packages
     1:3.3.2-1ubuntu4 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty/main i386 Packages

3) What is expected to happen when importing an MS Word 2008 for Mac .docx into LO Writer via the Terminal:

cd ~/Desktop && wget -c https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/815983/+attachment/2238809/+files/TestOffice2008.docx -O info.docx && lowriter -nologo info.docx

is that the alpha and beta symbols display as they do in MS Word 2003 (11.5604.6505) or MS Word 2008 for Mac.

4) What happens instead is that the symbols are not displayed correctly.

Original Reporter Comments: I opened the document on another machine using OpenOffice and saved it as ODT. When I open this version, the fonts are correctly assigned to Symbol, and the Greek letters do show up.

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: libreoffice (not installed)
ProcVersionSignature: Ubuntu 2.6.38-10.46-generic 2.6.38.7
Uname: Linux 2.6.38-10-generic x86_64
Architecture: amd64
Date: Mon Jul 25 12:46:57 2011
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release amd64 (20110427.1)
ProcEnviron:
 LANGUAGE=es_AR:en
 LANG=es_AR.UTF-8
 SHELL=/bin/bash
SourcePackage: libreoffice
UpgradeStatus: No upgrade log present (probably fresh install)

penalvch (penalvch) wrote :

Cesar Avila, thank you for reporting this bug and helping make Ubuntu better. Could you please attach the file that demonstrates this problem?

Changed in libreoffice (Ubuntu):
status: New → Incomplete
Cesar Avila (clavila) wrote :

Hi Christopher,
sorry for the delay, I was running some further tests. That way I could confirm that there is indeed no problem importing MS Word 97 documents that include symbols. The problem arises only with those written in MS Word 98 for Mac. Attached is a test file to check whether you can see the symbols (some Greek letters). I think that in the original file the font for these letters is set to Symbol, which the OpenOffice import filter preserves. LibreOffice, by contrast, resets the font to the default one.

penalvch (penalvch) wrote :

Cesar Avila, thanks for reporting this bug and any supporting documentation. Since this bug has enough information provided for a developer to begin work, I'm going to mark it as Triaged and let them handle it from here. Thanks for taking the time to make Ubuntu better!

description: updated
tags: added: lo33
Changed in libreoffice (Ubuntu):
importance: Undecided → Medium
status: Incomplete → Triaged
In , penalvch (penalvch) wrote :

Created attachment 49722
TestOffice2008.docx

Downstream bug may be found at:
https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/815983

1) lsb_release -rd
Description: Ubuntu 11.04
Release: 11.04

2) apt-cache policy libreoffice-writer
libreoffice-writer:
  Installed: 1:3.3.3-1ubuntu2
  Candidate: 1:3.3.3-1ubuntu2
  Version table:
 *** 1:3.3.3-1ubuntu2 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty-proposed/main i386 Packages
        100 /var/lib/dpkg/status
     1:3.3.2-1ubuntu5 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty-updates/main i386 Packages
     1:3.3.2-1ubuntu4 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty/main i386 Packages

3) What is expected to happen when importing an MS Word 2008 for Mac .docx into LO Writer via the Terminal:

cd ~/Desktop && wget -c https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/815983/+attachment/2238809/+files/TestOffice2008.docx -O info.docx && lowriter -nologo info.docx

is that the alpha and beta symbols display as they do in MS Word 2003 (11.5604.6505) or MS Word 2008 for Mac.

4) What happens instead is that the symbols are not displayed correctly.

summary: - Libreoffice fails importing docx files with symbol fonts
+ [Upstream] Writer mis-displays greek letters in symbol font importing
+ docx
In , Iamtester8 (iamtester8) wrote :

Not Reproduced with:

LO 3.4.2 OOO340m1 (Build:202)
Ubuntu 10.04.3 x86
Linux 2.6.32-33-generic Russian UI

Can you check it on LO3.4?

Changed in df-libreoffice:
importance: Unknown → Medium
status: Unknown → Confirmed
In , penalvch (penalvch) wrote :

tester8, confirmed fixed in LibreOffice 3.4.2 OOO340m1 (Build:203), Microsoft Windows Vista Business 6.0.6002 Service Pack 2 Build 6002. Marking RESOLVED WORKSFORME.

Changed in df-libreoffice:
status: Confirmed → Invalid
penalvch (penalvch) wrote :

Cesar Avila, I am closing this bug because it has been fixed in the latest development version of Ubuntu - Oneiric Ocelot.

This is a significant bug in Ubuntu. If you need a fix for the bug in previous versions of Ubuntu, please do steps 1 and 2 of the SRU Procedure [1] to bring the need to a developer's attention.

[1]: https://wiki.ubuntu.com/StableReleaseUpdates#Procedure

lsb_release -rd
Description: Ubuntu oneiric (development branch)
Release: 11.10

apt-cache policy libreoffice-writer
libreoffice-writer:
  Installed: 1:3.4.2-2ubuntu2
  Candidate: 1:3.4.2-2ubuntu2
  Version table:
 *** 1:3.4.2-2ubuntu2 0
        500 http://us.archive.ubuntu.com/ubuntu/ oneiric/main i386 Packages
        100 /var/lib/dpkg/status

Changed in libreoffice (Ubuntu):
status: Triaged → Fix Released
In , Wugs (wugs) wrote :

Sorry, but I have to open this bug again:
It is still or again reproducible on MacOS X (10.6.8). Testing with LibreOffice 3.5.3.2 (Build-ID: 235ab8a-3802056-4a8fed3-2d66ea8-e241b80), German langpack installed.

If I open the attached sample .docx file, the two mentioned Greek letters alpha and beta are not displayed, but some ornaments are visible instead. This happens even with Microsoft's Cambria + Cambria Math fonts installed.

I will attach a screenshot of how the document looks for me.

In , Wugs (wugs) wrote :

Created attachment 61208
Screenshot of the sample document in LO 3.5.3.2 on MacOS X 10.6.8

In , Wugs (wugs) wrote :

If I'm not missing anything, the problematic DOCX section (from /word/document.xml) is:

<w:r><w:sym w:font="Symbol" w:char="F061"/></w:r><w:r><w:t xml:space="preserve">-alpha, </w:t></w:r><w:r><w:sym w:font="Symbol" w:char="F062"/></w:r><w:r><w:t>-beta.
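
A fragment like this can be located without unpacking the file by hand. The following is a minimal sketch, assuming the standard WordprocessingML namespace URI; the function names find_symbols and find_symbols_in_docx are illustrative only and are not part of LibreOffice or of any import filter.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace bound to the "w:" prefix in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_symbols(document_xml):
    """Return (font, char) pairs for every w:sym element in the XML text."""
    root = ET.fromstring(document_xml)
    return [
        (sym.get(f"{{{W_NS}}}font"), sym.get(f"{{{W_NS}}}char"))
        for sym in root.iter(f"{{{W_NS}}}sym")
    ]


def find_symbols_in_docx(path):
    """Read word/document.xml from a .docx (a zip archive) and scan it."""
    with zipfile.ZipFile(path) as archive:
        return find_symbols(archive.read("word/document.xml"))
```

Calling find_symbols_in_docx("info.docx") on the downloaded sample should list the font and w:char value of each symbol run, which makes it easy to compare files written by different Word versions.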

I am no DOCX expert, but if I understand Microsoft's horrible file format right I see two interesting points:

* The w:font attribute is "Symbol" (not Cambria/Cambria Math as I would expect).
* The w:char has the value F061. If this is a Unicode code point it means that the two symbols alpha and beta are not Greek Unicode letters (would be U+03B1 and 03B2) nor from some math symbols range, but glyphs from the Private Use Area.
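
The second point can be checked against the Unicode character database. This is only a small sketch using Python's standard unicodedata module; the helper name describe is made up for illustration.

```python
import unicodedata


def describe(code_point_hex):
    """Return (general category, character name) for a hex code point."""
    ch = chr(int(code_point_hex, 16))
    # Private Use Area characters have no name, so supply a default.
    return unicodedata.category(ch), unicodedata.name(ch, "<unnamed>")


print(describe("F061"))  # category "Co" means private use
print(describe("03B1"))  # the real Greek small letter alpha
```

Category "Co" is the Unicode general category for private-use code points, so what U+F061 displays depends entirely on the font that ends up rendering it.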

This is indeed strange. If Microsoft wants an alpha glyph from the Symbol font, it should just use the Greek Unicode indices, which are correctly U+03B1 and U+03B2, at least in my copy of the Symbol font. My copy of Cambria Italic also contains alpha and beta at the correct Unicode indices (I can't test Cambria Regular because it's a .TTC file, which FontLab does not open). MS should not rely on PUA glyphs for important things like formula symbols. And there is just no U+F061 or U+F062 glyph in the Symbol font installed with MacOS X 10.6.8 ... (why should there be one?!).

Therefore, I'm not surprised about the two ornaments visible in the screenshot: they are just the glyphs associated with U+F061 and U+F062 in some font I have installed (Apple Chancery in my case). This is correct behaviour if the font used for the text does not contain any glyph associated with this Unicode code point.

But, what is really important: even if I blame MS for doing strange things, there is still a problem in LibreOffice, at least in the MacOS version. If the sample file looks right on Windows, there seems to be some mapping from the strange w:char="F061" to the right alpha Glyph. Therefore, we just need the same mapping to work on MacOS, too. (Or there are indeed U+F061 and F062 glyphs in the Symbol (or Cambria) font on Windows. Can someone tell us if this is true? But even if this is true, we need again some mapping of these glyphs on MacOS in order to display them correctly.)
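
To make the idea of such a mapping concrete, here is a minimal sketch. It assumes that w:char values in the range F000 to F0FF are the Symbol font's single-byte codes shifted into the Private Use Area, and that byte codes 0x61 and 0x62 stand for alpha and beta as in the classic Symbol encoding. The table is deliberately partial, and the names SYMBOL_TO_UNICODE and symbol_char_to_unicode are hypothetical; this is not LibreOffice's actual import code.

```python
# Partial, illustrative table: Symbol-font byte code -> Unicode code point.
SYMBOL_TO_UNICODE = {
    0x61: 0x03B1,  # 'a' position in Symbol -> GREEK SMALL LETTER ALPHA
    0x62: 0x03B2,  # 'b' position in Symbol -> GREEK SMALL LETTER BETA
}


def symbol_char_to_unicode(w_char):
    """Map a w:char hex string for the Symbol font to a Unicode character.

    Returns None when the code is not in the (partial) table.
    """
    code = int(w_char, 16)
    # Assumption: values in F000-F0FF carry the font's byte code plus 0xF000.
    if 0xF000 <= code <= 0xF0FF:
        code -= 0xF000
    mapped = SYMBOL_TO_UNICODE.get(code)
    return chr(mapped) if mapped is not None else None
```

With this table, symbol_char_to_unicode("F061") yields the Greek alpha and symbol_char_to_unicode("F062") the Greek beta; a real import filter would need the complete encoding table and would have to decide what to do with unmapped codes.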

In , penalvch (penalvch) wrote :

Roman Eisele, please do not reopen this report. This report is about how a bug in 3.3 was fixed in the 3.4 branch. However, a regression has occurred between the 3.4 and 3.5 branch, which is a different bug. If you are having a problem in LibreOffice, please file a new report. Thank you for your understanding.

In , Wugs (wugs) wrote :

(In reply to comment #6)
> Roman Eisele, please do not reopen this report. This report is about how a bug
> in 3.3 was fixed in the 3.4 branch. However, a regression has occurred between
> the 3.4 and 3.5 branch, which is a different bug. If you are having a problem
> in LibreOffice, please file a new report. Thank you for your understanding.

Thanks for your friendly advice. You forgot to reset the Platform picker.

In , Wugs (wugs) wrote :

Hint: For the issue(s) I reported in comment #3 to comment #5, there is now the new bug 49645.

If you encounter similar issues with LibreOffice 3.5 (or 3.6), please don't reopen this (present) bug report, but refer to bug 49645 instead.
