Extensions fail with the big files (XML parser)

Bug #1285592 reported by Nizamov Shawkat
This bug affects 3 people
Affects: Inkscape
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: 0.48.5

Bug Description

Hi, when working with a big (complex) file, "Extensions > Modify Path > Color Markers to Match Stroke" fails with the error message below. I am trying to draw a colored arrow, but the arrowhead keeps its original color (remains black).

Traceback (most recent call last):
  File "markers_strokepaint.py", line 78, in <module>
    e.affect()
  File "/usr/share/inkscape/extensions/inkex.py", line 211, in affect
    self.parse()
  File "/usr/share/inkscape/extensions/inkex.py", line 139, in parse
    self.document = etree.parse(stream)
  File "lxml.etree.pyx", line 3197, in lxml.etree.parse (src/lxml/lxml.etree.c:64726)
  File "parser.pxi", line 1593, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:92577)
  File "parser.pxi", line 1624, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:92916)
  File "parser.pxi", line 1506, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:91779)
  File "parser.pxi", line 1069, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:88819)
  File "parser.pxi", line 577, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:84019)
  File "parser.pxi", line 676, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:85122)
  File "parser.pxi", line 616, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:84445)
lxml.etree.XMLSyntaxError: internal error: Huge input lookup, line 10964, column 122564

Revision history for this message
su_v (suv-lp) wrote :

Please provide information about OS/platform and Inkscape version (see Inkscape menu 'Help > About Inkscape').

To ease further investigation of the reported issue, please attach a sample SVG file which consistently crashes on your system when trying to color markers to match stroke colors, and provide instructions for what to select (and how).

Changed in inkscape:
status: New → Incomplete
tags: added: extensions-plugins markers
Revision history for this message
Filip Krška (fill-io) wrote :

Hi, I got a similar traceback when trying to launch the Sozi plugin (and I'm not alone, see https://groups.google.com/forum/#!topic/sozi-users/cgzxreBAvZU) on a large SVG file (28 MB; shall I attach it?).

I worked around my problem with this patch, inspired by http://making-security-measurable.1364806.n2.nabble.com/libxml2-lxml-Parsing-large-9-5mb-XML-Documents-td7580595.html :

--- /tmp/index.py.bak 2014-07-05 18:48:54.045404293 +0200
+++ /usr/share/inkscape/extensions/inkex.py 2014-07-05 18:47:30.112018760 +0200
@@ -136,7 +136,7 @@
                 stream = open(self.svg_file,'r')
         except:
             stream = sys.stdin
-        self.document = etree.parse(stream)
+        self.document = etree.parse(stream, parser=etree.XMLParser(huge_tree=True))
         stream.close()

     def getposinlayer(self):

However, I'm aware that huge_tree=False is a reasonable default to prevent XML DoS, so I suggest implementing it as an option that users have to enable deliberately. Catching the exception and then informing the user about the option would also be good.
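
The suggestion above (keep the safe default, make huge_tree an opt-in, and explain the option when parsing fails) could look roughly like the sketch below. It is only an illustration: parse_svg and its allow_huge argument are hypothetical names, not part of inkex.py, and the error text is invented for the example. Only etree.XMLParser(huge_tree=...), etree.parse and etree.XMLSyntaxError come from lxml itself.

```python
import io

from lxml import etree


def parse_svg(stream, allow_huge=False):
    """Parse an SVG stream; allow_huge is a hypothetical opt-in switch."""
    # huge_tree=True lifts libxml2's built-in size and depth limits,
    # so it is only enabled when the caller explicitly asks for it.
    parser = etree.XMLParser(huge_tree=allow_huge)
    try:
        return etree.parse(stream, parser=parser)
    except etree.XMLSyntaxError as err:
        if allow_huge:
            # The limits were already lifted, so this is a genuine error.
            raise
        # Point the user at the option instead of showing a bare traceback.
        raise RuntimeError(
            "XML parsing failed (%s). If the document is large but trusted, "
            "retrying with allow_huge=True may help." % err
        ) from err


if __name__ == "__main__":
    # Small in-memory document, just to show the call.
    doc = parse_svg(io.BytesIO(b"<svg xmlns='http://www.w3.org/2000/svg'/>"))
    print(doc.getroot().tag)
```

Note that this sketch reports every syntax error the same way; a real implementation would probably inspect the message for "Huge input lookup" or "Excessive depth" before suggesting the option.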

Revision history for this message
Filip Krška (fill-io) wrote :

One more remark: huge_tree=True was introduced in libxml2-2.9.0, so that version should be required as a dependency.

jazzynico (jazzynico)
tags: removed: markers
Changed in inkscape:
status: Incomplete → Confirmed
summary: - Matching color markers fails in the big file
+ Extensions fail with the big files (XML parser)
Revision history for this message
su_v (suv-lp) wrote :

Related change (huge_tree=True) already exists in current stable (0.48.5) and trunk (0.91pre2) for bug #1217602:
<http://bazaar.launchpad.net/~inkscape.dev/inkscape/RELEASE_0_48_BRANCH/revision/9958#share/extensions/inkex.py>
<http://bazaar.launchpad.net/~inkscape.dev/inkscape/trunk/revision/12526#share/extensions/inkex.py>

Neither the reporter nor later commenters have been able to provide the requested information about the inkscape version used, nor was a sample file attached - is this report still an issue with current (latest) stable 0.48.5?

Changed in inkscape:
status: Confirmed → Incomplete
Revision history for this message
Nizamov Shawkat (nizamov-shawkat) wrote :

Hi,

At that time I was using Inkscape 0.48.4, as supplied with Ubuntu 13.10. Now I am using Inkscape 0.48.4 r9939, supplied with Ubuntu 14.04, and I cannot reproduce the bug. I tried the same file (or one of the same files) that I was working on when I reported this bug. Unfortunately, I cannot post it because of its content. I think the crash had more to do with libxml than with Inkscape, and that it is now fixed. At least, it works for me now.

Best regards,

Revision history for this message
jazzynico (jazzynico) wrote :

Ubuntu 13.10 and Ubuntu 14.04 provide almost the same Inkscape package (0.48.4 + Ubuntu specific changes, but nothing related to the huge_tree parameter), and thus the issue was probably not related to bug #1217602.
Maybe something in libxml indeed (2.9.1+dfsg1-3ubuntu2.3 in Ubuntu 13.10 and 2.9.1+dfsg1-3ubuntu4.3 in Ubuntu 14.04).

Proposing to close "invalid".
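
Since the discussion hinges on which lxml and libxml2 versions are installed, a short script like the following can print them. This is a minimal sketch; version_string is a helper made up for the example, while LXML_VERSION, LIBXML_VERSION and LIBXML_COMPILED_VERSION are version tuples exposed by lxml.etree.

```python
from lxml import etree


def version_string(version_tuple):
    """Join a version tuple such as (2, 9, 1) into '2.9.1'."""
    return ".".join(str(part) for part in version_tuple)


# Version of the lxml bindings themselves.
print("lxml:", version_string(etree.LXML_VERSION))
# libxml2 version actually loaded at runtime.
print("libxml2 (runtime):", version_string(etree.LIBXML_VERSION))
# libxml2 version lxml was compiled against (may differ from the runtime one).
print("libxml2 (compiled):", version_string(etree.LIBXML_COMPILED_VERSION))
```

It should be run with the same Python interpreter that Inkscape uses for its extensions, otherwise it may report a different lxml installation.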

Revision history for this message
su_v (suv-lp) wrote :

New report (presumably Inkscape 0.48.4 (not specified), on Ubuntu 14.04) with what looks like the same error (lxml):
- Bug #1375981 “new version of inkscape fails on svg files with embedded png (lxml error)”
  <https://bugs.launchpad.net/inkscape/+bug/1375981>

Revision history for this message
Nizamov Shawkat (nizamov-shawkat) wrote :

Hi,

I am using Inkscape 0.48.4 r9939, supplied with Ubuntu 14.04. I was not able to reproduce this bug with my old files, but with the file posted in https://bugs.launchpad.net/bugs/1375981 the bug is reproducible. Steps to reproduce: draw a coloured line, add an arrow marker, then try to match the arrow color using "Extensions > Modify Path > Color Markers to Match Stroke".

The error message also seems to be the same as in https://bugs.launchpad.net/bugs/1375981, so these should be duplicates.

Used file: https://cloud.laas.fr/public.php?service=files&t=c3a383ee4f25c76d023c1433464b6602

Error message:

Traceback (most recent call last):
  File "markers_strokepaint.py", line 78, in <module>
    e.affect()
  File "/usr/share/inkscape/extensions/inkex.py", line 211, in affect
    self.parse()
  File "/usr/share/inkscape/extensions/inkex.py", line 139, in parse
    self.document = etree.parse(stream)
  File "lxml.etree.pyx", line 3239, in lxml.etree.parse (src/lxml/lxml.etree.c:69955)
  File "parser.pxi", line 1769, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:102257)
  File "parser.pxi", line 1789, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:102516)
  File "parser.pxi", line 1684, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:101442)
  File "parser.pxi", line 1134, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:97069)
  File "parser.pxi", line 582, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:91275)
  File "parser.pxi", line 683, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:92461)
  File "parser.pxi", line 622, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:91757)
lxml.etree.XMLSyntaxError: internal error: Huge input lookup, line 13499, column 6031160

Revision history for this message
BoZ (boz-ubuntuoneleavemealone) wrote :

Hi,

Here is some additional information about the bug reported in https://bugs.launchpad.net/bugs/1375981, which clearly looks like a duplicate of this one:
The bug occurs with 0.48.4 r9939 (Jan 22 2014) under Ubuntu 14.04 and does not occur with 0.48.3.1 r9886 (Jan 29 2013) under Ubuntu 12.04.
If you need more information or testing, just let me know...

BoZ

Revision history for this message
su_v (suv-lp) wrote :

Reproduced by two users with 0.48.4 on Ubuntu 14.04, setting status to 'Confirmed'.

Not reproduced based on the latest 'Steps to reproduce' with current stable Inkscape 0.48.5 on OS X 10.7.5 (lxml 3.3.5, libxml2 2.9.1). Will test with 0.48.4 and 0.48.5 on Ubuntu 14.04 (VM) later and report back.

Changed in inkscape:
importance: Undecided → Medium
status: Incomplete → Confirmed
Revision history for this message
su_v (suv-lp) wrote :

On Ubuntu 14.04 (VM, 64bit), using the 'Steps to reproduce' from comment #8:
- reproduced with Inkscape 0.48.4 (official package)
- not reproduced with Inkscape 0.48.5 (Inkscape stable PPA)

In my understanding, this is related to (or about) the same problem with changes in newer libxml2 versions as tracked in bug #1217602 (and addressed for Inkscape in the current bug-fix release 0.48.5) - see comments #2 and #4.

Revision history for this message
jonh (jonh-launchpad) wrote :

Here's a report of a repro of both bug and workaround.

I'm using Inkscape 0.48.4 r9939, stock on Ubuntu 14.04LTS. I observed this problem running the Visualize/Measure Path extension. The culprit file has a layer (hidden, but apparently Inkscape ships the *entire* SVG out to extensions, not just what's selected, or even what's visible?) with some raster images on it. The failure dialog pointed directly at a line in the extension intermediate file in /tmp that begins xlink:href="data:image/png;base64,... and continues with about 1.9MB of base64 characters.

I transplanted measure.py and inkex into my ~/.config/inkscape/extensions directory, renamed them, and modified line 139 to say:
        self.document = etree.parse(stream, parser=etree.XMLParser(huge_tree=True))
as suggested above, and the problem is solved. (Although I bet the extension would run faster if we weren't sending the whole document out there, but only the selection! Perhaps a flag in the .inx file could tell inkscape that only the selected bits are needed?)
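
To check whether a document carries large embedded raster images like the one described above, the base64 payloads can be measured without going through the XML parser at all. The following is a rough sketch using only the standard library; embedded_image_sizes and the regular expression are assumptions made for illustration, and they only look for double-quoted href="data:...;base64,..." attributes.

```python
import re

# Matches href="data:<mime>;base64,<payload>" (also xlink:href), double-quoted only.
DATA_URI = re.compile(r'href="data:([^;,"]*);base64,([^"]*)"')


def embedded_image_sizes(svg_text):
    """Return (mime type, payload length in characters) per embedded data URI."""
    return [(m.group(1), len(m.group(2))) for m in DATA_URI.finditer(svg_text)]


if __name__ == "__main__":
    sample = '<image xlink:href="data:image/png;base64,iVBORw0KGgo="/>'
    for mime, size in embedded_image_sizes(sample):
        print(mime, size, "base64 characters")
```

For a real file, read it as text first and pass the contents to embedded_image_sizes; payloads of several megabytes are the likely candidates for the parser error.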

Revision history for this message
Claudio Pacchierotti (cpacchierotti) wrote :

This bug affects me as well.
The workaround provided by jonh worked.

Revision history for this message
su_v (suv-lp) wrote :

The XML parser argument added in the fix for bug #1217602 ("lxml.etree.XMLSyntaxError: Excessive depth in document") also prevents the parser error "lxml.etree.XMLSyntaxError: internal error: Huge input lookup" reported here.

Changed in inkscape:
milestone: none → 0.48.5
status: Confirmed → Fix Released