Inkscape >= 0.48.5 loads file despite parser error "XML declaration allowed only at the start of the document" (rev >= 12510)

Bug #1508758 reported by su_v
This bug affects 1 person
Affects: Inkscape
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: none

Bug Description

Symptoms:
1) Inkscape 0.48.5, 0.91 and 0.91+devel apparently changed how parser errors are treated: an invalid XML structure no longer causes Inkscape to abort loading the file and notify the user accordingly.
2) If such a broken XML document was saved with CRLF line endings, a further parser error may apparently be triggered later in the file, which can cause parts of the content to be omitted.
3) <specific type="erratic;inconsistent">With recent builds (on OS X), the CRLF-related parser error might not trigger omitted parts of the content if the file is opened from within Inkscape (via 'File > Open Recent')</specific>

Steps to reproduce (variations):
a) open the attached test case in Inkscape <= 0.48.4
--> ok: console messages about parser error, fails to load file
b) open the attached test case in Inkscape 0.48.5
--> console messages about parser error, some text is missing
c) open the attached test case via command line or file manager in Inkscape >= 0.91
--> console messages about parser error, some text is missing
d) open the attached test case from within Inkscape >= 0.91 (via 'File > Open Recent')
--> console messages about parser error, all text is loaded

Console messages:
test.svg:2: parser error : XML declaration allowed only at the start of the document
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
     ^
test.svg:567: parser error : Opening and ending tag mismatch: svg line 5 and text
       id="tspan4995">[one]</tspan></text>
                                          ^
test.svg:568: parser error : Extra content at the end of the document
  <text
  ^
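
For comparison, the first error is not specific to libxml2. The following is a minimal sketch using Python's standard-library expat parser on a hypothetical two-line document (not the attached test case). It shows a strict parser refusing a document whose XML declaration is preceded by an empty line. The names BROKEN and strict_parse_error are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal document, NOT the attached test case:
# an empty first line pushes the XML declaration to line 2,
# and the file uses CRLF line endings.
BROKEN = (
    b"\r\n"
    b'<?xml version="1.0" encoding="UTF-8" standalone="no"?>\r\n'
    b'<svg xmlns="http://www.w3.org/2000/svg"/>\r\n'
)


def strict_parse_error(data):
    """Return the (line, column) of the first parser error, or None if well-formed."""
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        return exc.position
    return None


if __name__ == "__main__":
    print("as saved:            ", strict_parse_error(BROKEN))
    print("leading line removed:", strict_parse_error(BROKEN.lstrip()))
```

A strict loader would treat a non-None result as a reason to stop and tell the user, which matches the behavior described for Inkscape <= 0.48.4.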

Inkscape should notify the user about the invalid structure of the XML document (empty first line, XML declaration on line 2) and about potentially missing content. Ideally, it should also not produce different results depending on how the file was loaded. The second parser error of the test case is apparently triggered by the CRLF line endings: after resaving the file with LF line endings it no longer occurs.
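
The two conditions named here (declaration not on the first line, CRLF line endings) can be checked on the raw bytes before any parser is involved. The sketch below is only an illustration of such a pre-flight check, not Inkscape code. The function names inspect_svg_bytes and to_lf and the warning texts are assumptions.

```python
def inspect_svg_bytes(data):
    """Return a list of human-readable warnings for the raw file content."""
    warnings = []

    # Whitespace (including an empty first line) before "<?xml" is not allowed.
    # A byte order mark is not whitespace, so a BOM directly followed by the
    # declaration is not flagged here.
    stripped = data.lstrip(b" \t\r\n")
    if stripped.startswith(b"<?xml") and len(stripped) != len(data):
        warnings.append("XML declaration is not at the start of the document")

    if b"\r\n" in data:
        warnings.append("file uses CRLF line endings")

    return warnings


def to_lf(data):
    """Convert CRLF line endings to LF, as when resaving the file with LF."""
    return data.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    sample = b'\r\n<?xml version="1.0"?>\r\n<svg/>\r\n'  # hypothetical input
    for message in inspect_svg_bytes(sample):
        print("warning:", message)
```

Such a check would only report the conditions; whether the file is then loaded, repaired or rejected is a separate decision.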

Symptoms 1+2 reproduced with Inkscape 0.48.5, 0.91 r13725 and 0.91+devel r14427 on OS X 10.7.5.
Symptom 3 is erratic and does not seem to reproduce consistently (on OS X 10.7.5).

Based on tests with archived builds:
- not reproduced with rev <= 12506,
- reproduced with rev >= 12510;
the observed change in behavior (symptoms 1+2) seems to have been introduced in rev 12510 for bug #166371:
https://bazaar.launchpad.net/~inkscape.dev/inkscape/trunk/changes/12511
https://bazaar.launchpad.net/~inkscape.dev/inkscape/trunk/revision/12510

This bug report and the test case are based on a user's question about the missing text in Inkscape 0.91:
https://answers.launchpad.net/inkscape/+question/272642

Tags: svg
su_v (suv-lp) wrote :
Changed in inkscape:
milestone: 0.92 → none
jazzynico (jazzynico) wrote :

Reproduced on Windows XP, Inkscape 0.48.5, 0.91 and trunk rev. 14495 (except that no text is missing, but it's a minor difference).

Changed in inkscape:
status: New → Confirmed
Patrick Storz (ede123) wrote :

I just pushed a related change in
http://bazaar.launchpad.net/~inkscape.dev/inkscape/trunk/revision/15654
Documents parsed from memory are now also opened with libxml2's XML_PARSE_RECOVER option (r12510 added this for documents parsed from files).

I think this is generally the behavior we'd want, but I agree with su_v that it would be good if we could warn the user in such a case. (The warnings are printed to the console, but I doubt many users would ever see them)
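
A rough sketch of the "load anyway, but warn" flow discussed above, in Python for brevity. This is not libxml2's XML_PARSE_RECOVER (Python's standard-library parser has no recover mode) and not Inkscape code. It only emulates the idea by retrying once after trimming leading whitespace and collecting messages that a UI could show in a dialog instead of the console. The name load_with_warnings and the message texts are assumptions.

```python
import xml.etree.ElementTree as ET


def load_with_warnings(data):
    """Parse strictly; on failure, retry once after trimming leading whitespace.

    Returns (root, warnings). Raises ET.ParseError if the retry fails as well,
    in which case the caller should abort loading.
    """
    warnings = []
    try:
        return ET.fromstring(data), warnings
    except ET.ParseError as exc:
        warnings.append("parser error: %s" % exc)

    # Emulated recovery step (an assumption, much simpler than libxml2's).
    repaired = data.lstrip(b" \t\r\n")
    root = ET.fromstring(repaired)
    warnings.append("document was repaired while loading; content may be missing")
    return root, warnings


if __name__ == "__main__":
    # Hypothetical input with an empty first line before the declaration.
    doc = b'\n<?xml version="1.0"?>\n<svg xmlns="http://www.w3.org/2000/svg"><text>one</text></svg>'
    root, messages = load_with_warnings(doc)
    for message in messages:
        print("warning:", message)  # a GUI would show these to the user
```

The point of returning the warnings alongside the document is that the caller, not the parser, decides how to present them.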
