Pasting text containing control character creates unreadable SVG

Bug #184904 reported by tbnorth
This bug affects 1 person
Affects: Inkscape
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: 0.48.5

Bug Description

Maybe this is similar to #168152, but that's supposedly fixed, and this is still happening.

Text objects somehow got what appeared to be ctrl-K in them, possibly as a result of cut and paste from Publisher in Windows. Files saved ok, but would not load.

PCDATA invalid Char value

was the message from command line use.

I think it's important that Inkscape not create unloadable files, particularly in a case like this where the flaw can be introduced after a lot of work had been done, and there's no warning until the next attempt to load the file, when it's too late.

A popup saying bad chars were stripped from text objects would be much nicer.
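A minimal sketch of the kind of filter suggested above, written in Python purely for illustration (Inkscape itself is not written in Python, and `strip_invalid_xml_chars` is a hypothetical name). It removes the characters that XML 1.0 does not allow in a document and reports how many were dropped, so a caller could decide whether to show a warning.

```python
import re

# Characters XML 1.0 forbids in document content: C0 controls other than
# TAB (U+0009), LF (U+000A) and CR (U+000D), lone surrogates, U+FFFE and U+FFFF.
_INVALID_XML_CHARS = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)


def strip_invalid_xml_chars(text):
    """Return (cleaned_text, number_of_characters_removed)."""
    cleaned, removed = _INVALID_XML_CHARS.subn("", text)
    return cleaned, removed


if __name__ == "__main__":
    # A pasted string containing ctrl-K (U+000B), as in this report.
    cleaned, removed = strip_invalid_xml_chars("foo\x0bbar")
    if removed:
        print(f"{removed} bad character(s) were stripped from the pasted text")
```

Returning the count alongside the cleaned text keeps the decision about how to notify the user (popup, status bar message) out of the filter itself.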

Cheers -Terry

Ryan Lerch (ryanlerch) wrote :

Can you tell us which version of Inkscape you are using and your operating system, and maybe attach a sample file that exhibits this behaviour? It makes it a lot easier to test, classify and troubleshoot.

Changed in inkscape:
importance: Undecided → Medium
status: New → Incomplete
tbnorth (terry-n-brown) wrote :

The bad text was entered on "Mirrored Win32 Build Inkscape0801181659.7z"

The files were unloadable on that version, and Win32 version 0.45.1, and
ubuntu linux version 0.45+devel of Sep. 6 2007

Tom Davidson (tjd-mit) wrote :

Confirming that the attached file doesn't load in today's development build under Linux.

As a workaround, you can edit the file by hand (the nice thing about SVG is that it's just XML, which means you can edit it with a text editor)
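When editing by hand, the offending characters can be hard to spot because most editors render them invisibly. The following sketch (Python, with the hypothetical helper name `find_invalid_xml_chars`) lists the line, column and code point of each forbidden control character in the text of an SVG file. It only checks the C0 control range, which is what this report is about.

```python
def find_invalid_xml_chars(text):
    """Return a list of (line, column, code_point) for control characters
    that XML 1.0 does not allow. Lines and columns are 1-based."""
    hits = []
    # Split on "\n" only: str.splitlines() would also break on U+000B and
    # U+000C, which are exactly the characters we want to find.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) < 0x20 and ch not in "\t\r":
                hits.append((line_no, col, ord(ch)))
    return hits


if __name__ == "__main__":
    sample = "<svg>\n<text>a\x0bb</text>\n</svg>\n"
    for line_no, col, cp in find_invalid_xml_chars(sample):
        print(f"line {line_no}, column {col}: U+{cp:04X}")
```

To check a real file, read it with `open(path, encoding="utf-8", errors="replace").read()` and pass the result in; the path is whatever file fails to load.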

Changed in inkscape:
status: Incomplete → Confirmed
tbnorth (terry-n-brown) wrote : Re: Don't accept text input that will break SVG

It may be helpful to mention the workaround in case anyone encounters this issue and needs to, well, work around it, but I just want to clarify that the problem is not that Inkscape can't read the bad XML file (it shouldn't) but that Inkscape created a bad XML file. Probably because of funky clipboard content being pasted into text elements in Windows, but that's no excuse ;-)

Tom Davidson (tjd-mit) wrote : Re: [Bug 184904] Re: Don't accept text input that will break SVG

I agree. That's also why I changed the bug summary (title).

If possible, it would help with debugging if you could give
step-by-step instructions for generating a string with bad
control-characters in it (using notepad, emacs, or some other commonly
available software...). I was only able to confirm that your file
didn't load, not that Inkscape actually allows it to happen in a
recent SVN...


tbnorth (terry-n-brown) wrote :

On Tue, 05 Feb 2008 19:05:48 -0000
Tom Davidson <email address hidden> wrote:

> I agree. That's also why I changed the bug summary (title).

Yes... didn't notice that until after I posted that last comment, sorry
for the noise :-)

> If possible, it would help with debugging if you could give
> step-by-step instructions for generating a string with bad
> control-characters in it (using notepad, emacs, or some other commonly
> available software...). I was only able to confirm that your file
> didn't load, not that Inkscape actually allows it to happen in a
> recent SVN...

Sorry, I gave it a shot just now, grabbed the current windows dev.
build, made an excessively formatted text box in Publisher,
copied it all to a text box in Inkscape, but it didn't produce a
corrupt file.

But that doesn't mean it wouldn't under the circumstances which occurred
before, a different Publisher file with different content.

So I don't know if you can do more than check that Inkscape makes a
reasonable attempt to filter bad content from paste operations... if
that's possible.

Cheers -Terry


Tom Davidson (tjd-mit) wrote : Re: Don't accept text input that will break SVG

OK, got it, using the gnome character map application 'gucharmap'. Probably any character map utility would work.

1. Open gucharmap, then create and copy a string containing a ctrl-K character (U+000B; LINE TABULATION; vertical tabulation; VT)
2. Create a new document in Inkscape
3. Paste the string into a new text area
4. Save the resulting SVG
5. Inkscape saves without error, but will not reload this document, since the ctrl-K seems to be identified as a line break

Interestingly, if you try to type a ctrl-k using unicode input in inkscape (ctrl-u, 0, 0, 0, B, enter), then you get a 'Non-printable character' message in the status bar, and the ctrl-K is not included. So the logic is there, but is not being applied to pasted text...
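The "saves without error, but will not reload" behaviour is not specific to Inkscape. As a rough illustration only, the sketch below uses Python's standard `xml.etree.ElementTree` (not Inkscape's own XML code) to show the same asymmetry: the serializer writes U+000B without complaint, while the parser rejects the result. The function name `round_trips` is made up for this example.

```python
import xml.etree.ElementTree as ET


def round_trips(text):
    """Serialize `text` inside a <text> element, then try to parse it back.

    Returns True if the serialized document can be re-parsed.
    """
    root = ET.Element("svg")
    ET.SubElement(root, "text").text = text
    # Serialization does not validate characters, so this succeeds even
    # when the text contains a control character such as U+000B.
    serialized = ET.tostring(root, encoding="unicode")
    try:
        ET.fromstring(serialized)
    except ET.ParseError:
        # The parser enforces the XML 1.0 character rules and refuses
        # the document that was just written.
        return False
    return True


if __name__ == "__main__":
    print(round_trips("hello"))      # ordinary text
    print(round_trips("hel\x0blo"))  # text containing ctrl-K
```

This is why filtering has to happen on input (paste, import) or on save: by the time the file is read back, the parser has no choice but to reject it.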

Tom Davidson (tjd-mit) wrote :

Bulia, are you the right person to assign this to?

Changed in inkscape:
assignee: nobody → buliabyak
andreotti (claudiolinux) wrote :

claudio andreotti / graphic design / Página/12 newspaper / Argentina / Buenos Aires / 2008 /

<email address hidden>

Inkscape is an excellent working tool that needs some improvements to qualify as professional, but Inkscape can go beyond being a vector tool: it can fill the vacant space among desktop publishing tools that Scribus does "not" cover.

Scribus is very slow and heavy, with an interface from the age of the caves.

Inkscape should develop a plugin called "publusher" that handles typography within the text field that already exists, with hyphenation in several languages and links between text fields.

I tried Xara and it is as heavy as Scribus; it is "no good".

Well done Inkscape, I'm waiting for the plugin!!!

Mumia W. (mumia-w-18) wrote :

Hello. I've discovered that inkscape writes invalid SVG files when converting PDFs to SVG.

I'm attempting to convert this PDF file to (inkscape) SVG:
http://perl.apache.org/docs/2.0/user/config/config.pdf

The resulting SVG file looks like this:
http://home.earthlink.net/~mumia.w.18.spam/docs/config.svg . Notice the spurious "^P" (Control-P) characters in it.

Attempting to load config.svg into inkscape causes inkscape to respond this way: "Failed to load the requested file /tmp/config.svg." And inkscape outputs this to the console:

config.svg:64: parser error : PCDATA invalid Char value 16

However, editing the file in VIM and changing the "^P" characters to spaces solves the problem. I am using inkscape 0.46 on Debian 5.0 i386.
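The same replacement can be scripted instead of done in an editor. This is a sketch under the assumption that the file is UTF-8 or another ASCII-compatible encoding (it would corrupt UTF-16); `blank_control_bytes` and the file names in the usage comment are hypothetical.

```python
# Bytes 0x00-0x1F are forbidden in XML 1.0 except TAB (0x09), LF (0x0A)
# and CR (0x0D). In UTF-8 these byte values never occur inside a
# multi-byte sequence, so replacing them byte-wise is safe.
_BAD = bytes(b for b in range(0x20) if b not in (0x09, 0x0A, 0x0D))
_TABLE = bytes.maketrans(_BAD, b" " * len(_BAD))


def blank_control_bytes(data):
    """Replace every forbidden control byte with a space."""
    return data.translate(_TABLE)


# Usage (file names are placeholders):
#   with open("config.svg", "rb") as src:
#       fixed = blank_control_bytes(src.read())
#   with open("config-fixed.svg", "wb") as dst:
#       dst.write(fixed)
```

Writing to a new file rather than overwriting keeps the original available in case the replacement changes something it should not have.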

Mumia W. (mumia-w-18) wrote :

Bugs 291416 and 297701 seem to be duplicates of this bug (184904).

su_v (suv-lp) wrote :

adding above mentioned reports: (in expanded notation ;-)
Bug #291416 “SVG from imported PDF cannot be reopened, characters error”
<https://bugs.launchpad.net/inkscape/+bug/291416>
Bug #297701 “Cannot open SVG file that was saved from a PDF opened with "Embed Images" option set” (linked as duplicate to bug #291416)
<https://bugs.launchpad.net/inkscape/+bug/297701>

tags: added: importing svg text
Jeff Thornhill (jeffthornhill99) wrote :

I am using v. 0.48 on my Mac with Snow Leopard. I am getting an error saying that the .svg file I just saved will not reopen. I would love to save the work if possible. I first had a crash when I changed the horizontal font distance to varying widths between characters within the same object line of text. I then tried creating my project differently, keeping all the font spacing the same within each object of text. I was able to finish and save my project, but once completed and saved it would not reopen. I like the features of the software and its ease of use, but finding that hours of your work day were for nothing is an experience I don't want to repeat. The project that crashed also had embedded PNG objects that were created in another program, and it would not reopen the PNG.

Alvin Penner (apenner) wrote :

@jeff, could you attach the svg file that is causing the crash?

Beluga (buovjaga) wrote :

The provided example file ctrlk.svg now loads OK. Plz re-test.

Win 7 64-bit
Inkscape 0.92pre1_64bit r15016

jazzynico (jazzynico) wrote :

Bug reproduced on Windows XP (32-bit) with Inkscape 0.48.4.
Not reproduced with 0.48.5, 0.91 and 0.92.x rev. 15044.

The "PCDATA invalid Char value" message still shows on console, but the file loads correctly.

Changed in inkscape:
assignee: bbyak (buliabyak) → nobody
milestone: none → 0.48.5
status: Confirmed → Fix Released