sxw2rml does not support OpenOffice 1.0 documents created with the Simplified Chinese version

Bug #787908 reported by mrshelly
This bug affects 1 person
Affects: Odoo Server (MOVED TO GITHUB)
Status: Confirmed
Importance: Wishlist
Assigned to: OpenERP's Framework R&D
Milestone: (none listed)

Bug Description

I used the following Python script to convert an sxw file (OpenOffice 1.0 document) to an RML document.

<pre>

# Python 2 script
import zipfile

from pyopenoffice import PyOpenOffice
import libxslt
import libxml2

fname = r'c:\test.sxw'

# Default to the OpenOffice 1.0 (.sxw) stylesheet; switch to the ODT one
# when the archive's mimetype entry identifies an OpenDocument text file.
xsl_file = './normalized_oo2rml.xsl'
z = zipfile.ZipFile(fname, 'r')
mimetype = z.read('mimetype')
if mimetype.split('/')[-1] == 'vnd.oasis.opendocument.text':
    xsl_file = './normalized_odt2rml.xsl'

xsl = open(xsl_file).read()

# Unpack the document and normalize its XML.
tool = PyOpenOffice('.', save_pict=False)
res = tool.unpackNormalize(fname)

# Apply the XSL stylesheet to the normalized XML to produce the RML.
styledoc = libxml2.parseDoc(xsl)
style = libxslt.parseStylesheetDoc(styledoc)
doc = libxml2.parseMemory(res, len(res))
result = style.applyStylesheet(doc, None)
print result

</pre>

There were some bugs in the minidom-based Python code, and I fixed them.

In tiny_sxw2rml.py (5.x) or openerp_sxw2rml.py, I found this code:

<pre>
        styles_styles = self.styles_dom.getElementsByTagName("style:style")

</pre>

I changed it to:

<pre>
        ....
        styles_styles = []
        styles_styles = styles_styles + self.styles_dom.getElementsByTagName("style:style")
        styles_styles = styles_styles + self.styles_dom.getElementsByTagName("style:font-decl")
        ....
</pre>
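The effect of the change can be shown with a small standalone sketch (Python 3). The sample XML and the function name `collect_style_nodes` are made up for illustration and are not taken from the OpenERP source. It assumes, as in OpenOffice 1.0 files, that font declarations are stored as "style:font-decl" elements.

```python
from xml.dom import minidom

# Hypothetical, heavily reduced styles.xml of an OpenOffice 1.0 document.
SAMPLE_STYLES = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-styles
    xmlns:office="http://openoffice.org/2000/office"
    xmlns:style="http://openoffice.org/2000/style"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <office:font-decls>
    <style:font-decl style:name="宋体" fo:font-family="宋体"/>
  </office:font-decls>
  <office:styles>
    <style:style style:name="Standard" style:family="paragraph"/>
  </office:styles>
</office:document-styles>"""


def collect_style_nodes(dom):
    """Collect style:style and style:font-decl elements into one list."""
    nodes = []
    for tag in ("style:style", "style:font-decl"):
        nodes.extend(dom.getElementsByTagName(tag))
    return nodes


dom = minidom.parseString(SAMPLE_STYLES.encode("utf-8"))
nodes = collect_style_nodes(dom)
names = [node.getAttribute("style:name") for node in nodes]
```

With only "style:style" in the tuple, the "宋体" declaration would not be collected at all.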

There is also some trouble with the "content_styles" variable.

In normalized_oo2rml.xsl, I found this code:

<pre>
<xsl:when test="not($fontName='') and boolean($fontName)">

....
    <xsl:when test="contains($fontName,'Courier')">

...
    <xsl:when test="contains($fontName,'Helvetica') or contains($fontName,'Arial') or contains($fontName,'Sans')">

...
    <xsl:otherwise> <-------------------- Otherwise 1

...
<xsl:otherwise> <-------------------- Otherwise 2
...
</pre>

In an OpenOffice 1.0 document created with the Simplified Chinese version, the "fontName" values are "宋体" (SimSun) and "黑体" (SimHei).
With my "test.sxw" file, normalized_oo2rml.xsl takes the "Otherwise 2" branch, so the sxw file's "宋体" fontName is replaced with "Times-Roman".
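To make the branch structure easier to follow, here is a rough Python 3 model of the xsl:choose cascade. It is only a sketch: the function `map_font` and the `extra` lookup table are hypothetical, and the values returned for the Courier, Helvetica and "Otherwise 1" branches are assumptions because those parts of the stylesheet are elided above. Only the "Times-Roman" result of "Otherwise 2" comes from what I observed.

```python
def map_font(font_name, extra=None):
    """Model of the font cascade; `extra` is a proposed lookup, not existing code."""
    extra = extra or {}
    if font_name:  # outer xsl:when: fontName is set and not empty
        if font_name in extra:
            # Proposed addition: explicit mapping for names such as "宋体".
            return extra[font_name]
        if "Courier" in font_name:
            return "Courier"  # assumed result of this branch
        if any(key in font_name for key in ("Helvetica", "Arial", "Sans")):
            return "Helvetica"  # assumed result of this branch
        return "Times-Roman"  # "Otherwise 1": assumed result
    return "Times-Roman"  # "Otherwise 2": the branch my test.sxw reaches


default_result = map_font("宋体")
mapped_result = map_font("宋体", {"宋体": "SimSun"})
empty_result = map_font("")
```

In this model "Otherwise 2" is reached only when the font name is empty or missing. If the stylesheet behaves the same way, the "宋体" name may already be lost before the cascade runs, so an extra mapping alone might not be enough.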

So, how can this be fixed, and how can a docinit/registerFont node be added to the generated RML file, so that OpenERP can convert OpenOffice 1.0 documents created with the Simplified Chinese version to RML?
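As an illustration of what I am asking for, the font registration node could be added to the generated RML in a post-processing step. This is only a sketch in Python 3: the function name `add_font_registration`, the attribute names "fontName" and "fontFile", the placement of docinit as the first child, and the font path are all assumptions that would need to be checked against what the OpenERP RML renderer actually reads.

```python
import xml.etree.ElementTree as ET


def add_font_registration(rml_root, font_name, font_file):
    """Add a registerFont node under docinit, creating docinit if needed."""
    docinit = rml_root.find("docinit")
    if docinit is None:
        docinit = ET.Element("docinit")
        # Assumption: docinit is expected as the first child of the document.
        rml_root.insert(0, docinit)
    ET.SubElement(
        docinit,
        "registerFont",
        {"fontName": font_name, "fontFile": font_file},  # assumed attribute names
    )
    return rml_root


# Hypothetical minimal RML skeleton and a placeholder font path.
root = ET.fromstring("<document><stylesheet/><story/></document>")
add_font_registration(root, "SimSun", "/path/to/simsun.ttf")
rml_text = ET.tostring(root, encoding="unicode")
```

The stylesheet would still have to emit "SimSun" (or whatever name is registered) instead of "Times-Roman" for the affected text.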

Thanks...

mrshelly
2011/05/25

Changed in openobject-server:
assignee: nobody → OpenERP's Framework R&D (openerp-dev-framework)
importance: Undecided → Wishlist
status: New → Confirmed