.svg doesn't open when chars in wrong encoding are present

Bug #482451 reported by Patrick Storz
This bug affects 1 person
Affects: Inkscape
Status: Won't Fix
Importance: Wishlist
Assigned to: Jon A. Cruz

Bug Description

When the wrong encoding is declared in the header, a .svg file isn't opened at all, even if just one character is invalid in the declared charset.

Why not display a box or some other placeholder, as e.g. Firefox does when opening the "example.svg" in the attachment?
Simply changing the encoding from "utf-8" to "iso-8859-15" in the .svg header solves the problem.
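
For illustration, here is a minimal Python sketch of the behaviour described above. It is not Inkscape code and does not use the attached file: the SVG body, the byte 0xA4 and the helper `try_parse` are made up for the example. A strict XML parser rejects the document while the header says "utf-8" and accepts the same bytes once the header says "iso-8859-15".

```python
import xml.etree.ElementTree as ET

# Hypothetical SVG body: 0xA4 is the euro sign in ISO-8859-15,
# but a stray continuation byte (invalid on its own) in UTF-8.
BODY = b'<svg xmlns="http://www.w3.org/2000/svg"><text>10 \xa4</text></svg>'


def try_parse(declared_encoding):
    """Return the text of the first child element, or None if parsing fails."""
    header = f'<?xml version="1.0" encoding="{declared_encoding}"?>'.encode("ascii")
    try:
        root = ET.fromstring(header + BODY)
    except ET.ParseError:
        # The bytes do not match the declared encoding: not well-formed.
        return None
    return root[0].text


print(try_parse("utf-8"))        # parse fails, helper returns None
print(try_parse("iso-8859-15"))  # same bytes, different declaration
```

Whether "iso-8859-15" is the encoding the file was really written in is a separate question; the sketch only shows that the parser stops complaining.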

Tags: encoding text
Patrick Storz (ede123) wrote :
su_v (suv-lp) wrote :

- Batik 1.7 (OS X 10.5.8) doesn't open it either:
  SVG Error: Invalid byte 1 of 1-byte UTF-8 sequence.
- Inkscape 0.46+devel r22575 (OS X 10.5.8): console message attached
  complains about other errors besides the encoding.

tags: added: text
removed: charset
jazzynico (jazzynico) wrote :

And it doesn't validate (http://validator.w3.org/check) unless you force the validator to use an appropriate encoding.

Is Inkscape expected to find out the correct encoding in this case?

Wishlist or invalid?

Changed in inkscape:
importance: Undecided → Wishlist
Jon A. Cruz (jon-joncruz) wrote :

As to why Inkscape should not display a placeholder box, etc.:
Firefox is a web *browsing* rendering agent, whose job it is to be forgiving and try to show the closest approximation of whatever it can guess at. Inkscape is an SVG *creation* tool, whose job it is to create compliant SVG files.

Inkscape can't know what the proper encoding is. So if Inkscape read in such a file, it would be corrupting data and then saving out that corrupted data. In this case you, the human, were required to investigate the file and make an intelligent decision on how to compensate for an error. We don't really have a general AI to solve that for all cases.
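
The corruption concern can be sketched in a few lines of Python. This is an illustration only, not how Inkscape reads or writes files; the sample string and variable names are invented for the example. A lenient reader substitutes a placeholder for each undecodable byte, and once the result is saved the original bytes are gone.

```python
# Hypothetical text as some upstream tool might have written it:
# ISO-8859-15 bytes inside a file whose header claims UTF-8.
original = "Größe".encode("iso-8859-15")  # b'Gr\xf6\xdfe'

# Lenient read: every undecodable byte becomes U+FFFD.
lenient = original.decode("utf-8", errors="replace")

# Saving the lenient result writes the placeholders, not the original bytes.
saved = lenient.encode("utf-8")

print(repr(lenient))
print(saved == original)  # the round trip does not preserve the data
```

Both "ö" and "ß" collapse into the same replacement character, so nothing in the saved file tells you which characters were there originally.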

Of course, there is a question of whether or not setting things to "iso-8859-15" actually "solved" the problem or if it just happened to create a work-around. Should it have been "iso-8859-1", "iso-8859-7", "MacRoman", "Cp1252" or something else?
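
To make the ambiguity concrete, here is a small Python sketch (the byte 0xE4 is picked arbitrarily for the example, and the codec names are Python's spellings, with `mac_roman` assumed to correspond to "MacRoman"). Every candidate decodes the byte without error, but they do not agree on the character.

```python
# One arbitrary non-ASCII byte; all the single-byte encodings below accept it.
RAW = b"\xe4"

CANDIDATES = ["iso-8859-1", "iso-8859-15", "iso-8859-7", "mac_roman", "cp1252"]

# Decode the same byte under each candidate encoding.
decoded = {name: RAW.decode(name) for name in CANDIDATES}

for name, char in decoded.items():
    print(f"{name:12} -> U+{ord(char):04X} {char}")
```

None of these decodes raises an error, so "the file opens" says nothing about whether the right encoding was chosen.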

A better solution would be for the end user to determine where exactly things were corrupted earlier in the workflow. Something mangled the file, and stopping that would probably be much better in the long run.

Changed in inkscape:
assignee: nobody → Jon A. Cruz (jon-joncruz)
status: New → Won't Fix
Jon A. Cruz (jon-joncruz) wrote :

There might be other things that Inkscape should do in this situation, but randomly corrupting arbitrary characters is not one of them.
