EPS export of non-ASCII text

Bug #166626 reported by Bug Importer
Affects: Inkscape
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

When I export a document with text including German
umlauts to EPS format, the resulting file is not
displayed correctly. All umlauts are replaced by a
number of other characters. Greek and other characters
outside 7-bit ASCII do not work either (see attached example).

The EPS-output in Inkscape is, e.g.:
(Bösewicht) show

In Adobe Illustrator 10 it is:
(B) sh
159.563 32.6885 mov
(\366) sh
164.902 32.6885 mov
(sewicht) sh

I am working with Inkscape build "Inkscape0506170200"
for Windows under Windows XP.

Buliabyak-users (buliabyak-users) wrote :

The only real problem here is how we can guess that ö is
\366. Is this the Unicode code? Or Latin-1? or some sort of
font-specific encoding? (I remember there was a mailing list
discussion on this.)
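
For what it's worth, a quick check suggests the answer is "both": ö is U+00F6, and Latin-1 assigns every character the same number as its Unicode code point, so octal 366 is the Unicode value and the Latin-1 byte at once. The snippet below is only an illustrative Python sketch, not Inkscape code, and it does not settle whether Illustrator relies on Latin-1 or on a font-specific encoding vector.

```python
ch = "\u00f6"  # o with diaeresis, written as an escape to keep the source ASCII

code_point = ord(ch)                     # Unicode code point
latin1_byte = ch.encode("latin-1")[0]    # the single byte used by ISO 8859-1
octal_escape = "\\%03o" % latin1_byte    # PostScript-style octal escape

print(code_point, latin1_byte, octal_escape)
```

The same coincidence holds for any character below U+0100; beyond that range Latin-1 has no byte to offer.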

Richard Hughes (cyreve) wrote :

I couldn't get Acrobat (v6) to render this after the fix; all I could
get out of it was the famous empty rectangle glyph.

Anyway, I believe my fix works, based on a couple of other
programmes, but I would appreciate a check with Illustrator.

Bug Importer (bug-importer) wrote :

It worked with some version yesterday - at least for
umlauts. Unicode characters were still not displayed correctly.
Currently, the umlauts are not displayed correctly anymore -
 just white spaces instead. I don't know why. Perhaps
because of the new text tool? The escaped sequences in the
EPS file are still there anyway.

-- Jan, the above bug reporter

Richard Hughes (cyreve) wrote :

This has gone beyond my knowledge. Who's our PS guru,
bulia?

Bug Importer (bug-importer) wrote :

I am no PS expert, so the Adobe Illustrator output for an EPS
is too complicated for me. It uses some "Adobe_CoolType" stuff
that seems to handle non-ASCII characters by replacing the
encoding map.

In the PS Spec there is the following PS code fragment to
use ISO-Latin-1 encoding, which at least enables proper handling
of German umlauts and other 8-bit characters.

/Helvetica findfont
dup length dict begin
{ 1 index /FID ne
{def}
{pop pop}
ifelse
} forall
/Encoding ISOLatin1Encoding def
currentdict
end
/Helvetica-ISOLatin1 exch definefont pop

The following PS code would create proper output:

0 842 translate
0.8 -0.8 scale
/Helvetica-ISOLatin1 findfont
12 scalefont
setfont
0 0 0 setrgbcolor
[1 0 0 1 0 0] concat
[1 0 0 -1 34.17 41] concat
0 0 moveto
(B\344ume) show

-- Jan
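
On the exporter side, producing a string such as (B\344ume) comes down to encoding the text as Latin-1 and writing every non-printable or non-ASCII byte as a three-digit octal escape. The following is a minimal Python sketch of that step. The function name `ps_string_latin1` is made up for illustration, and the code is not taken from Inkscape.

```python
def ps_string_latin1(text):
    """Return text as a PostScript literal string with octal escapes.

    Hypothetical helper: raises UnicodeEncodeError if the text contains
    characters that have no Latin-1 byte.
    """
    out = []
    for byte in text.encode("latin-1"):
        ch = chr(byte)
        if ch in "()\\":
            # Parentheses and backslash are special inside ( ... ) strings.
            out.append("\\" + ch)
        elif 32 <= byte < 127:
            # Printable ASCII can be written as-is.
            out.append(ch)
        else:
            # Everything else becomes a three-digit octal escape.
            out.append("\\%03o" % byte)
    return "(" + "".join(out) + ")"


print(ps_string_latin1("B\u00e4ume") + " show")
```

The escapes only display correctly if the current font has been re-encoded as in the fragment above; with a font's default encoding, code 344 may map to a different glyph or to none.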

Bug Importer (bug-importer) wrote :

Please reopen, it is not fixed!

Richard Hughes (cyreve) wrote :

OK, I've had a brief look at the nightmare that is unicode in
postscript, and I've decided it's too difficult for me. If people
want to output non-latin1 characters they should use text as
curves.

Bug Importer (bug-importer) wrote :

It is a nightmare indeed. But ISO-Latin-1 isn't supported
either.
My proposal is to try to fix the ISO-Latin-1 problem by
changing the encoding map, and to automatically convert
characters to curves at EPS export time if their encoding
requires something different from Latin-1.

-- Jan
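
The decision proposed here (keep text as text when Latin-1 suffices, fall back to curves otherwise) could be made with a check along the following lines. This is a rough Python sketch with a hypothetical function name, `needs_curves`. It treats PostScript's ISOLatin1Encoding as equivalent to ISO 8859-1, an assumption that has not been verified for every code.

```python
def needs_curves(text):
    """Return True if text cannot be represented in Latin-1.

    Hypothetical check: a True result would mean the exporter should
    convert this text to outlines instead of emitting a `show` string.
    """
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        return True
    return False


for sample in ("B\u00f6sewicht", "\u03b1\u03b2\u03b3", "\u0442\u0435\u043a\u0441\u0442"):
    print(repr(sample), needs_curves(sample))
```

A finer-grained variant could test each character and outline only the offending glyphs, at the cost of splitting one text run into several pieces.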

Richard Hughes (cyreve) wrote :

OK, it seems that the test programmes I used initially ignored
the encoding and just used latin-1 anyway. I've now written a
bit of code based on your excerpt from the ps spec to switch
all fonts to latin-1 as they are used, and it does work in
acrobat.

I don't think I'll do full unicode support just yet.
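
The approach described here, re-encoding each font to Latin-1 the first time it is used, might look roughly like the following on the generating side. This is a Python sketch under assumptions, not the actual Inkscape change. The class name `Latin1FontTable` and its `select` method are invented for illustration, and the emitted PostScript is the fragment from the PS spec quoted earlier in this thread.

```python
# PostScript that copies a font, swaps in ISOLatin1Encoding, and
# registers the copy as <name>-ISOLatin1 (from the spec excerpt above).
REENCODE_TEMPLATE = """/{name} findfont
dup length dict begin
{{ 1 index /FID ne {{def}} {{pop pop}} ifelse }} forall
/Encoding ISOLatin1Encoding def
currentdict
end
/{name}-ISOLatin1 exch definefont pop
"""


class Latin1FontTable:
    """Hypothetical helper that emits the re-encoding once per font."""

    def __init__(self):
        self._emitted = set()

    def select(self, name, size):
        """Return the PostScript needed to select `name` at `size`."""
        parts = []
        if name not in self._emitted:
            # First use of this font: define the Latin-1 copy.
            parts.append(REENCODE_TEMPLATE.format(name=name))
            self._emitted.add(name)
        parts.append("/%s-ISOLatin1 findfont %g scalefont setfont\n" % (name, size))
        return "".join(parts)


fonts = Latin1FontTable()
print(fonts.select("Helvetica", 12))
print(fonts.select("Helvetica", 10))
```

Remembering which fonts have already been re-encoded keeps the output small when the same font is used for many text runs.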

Bug Importer (bug-importer) wrote :

Thanks so far

Bug Importer (bug-importer) wrote :

If automated conversion to shapes for unsupported characters
were added as proposed earlier, this bug could be closed.
Unicode support would be an RFE.

Lucychili-users (lucychili-users) wrote :

Text in EPS: these two are not duplicates, but both concern text in EPS.

1222689 EPS export of non-ASCII text
1333035 EPS import: text items are corrupted

Prokoudine (prokoudine) wrote :

The version from current SVN actually crashes on exporting an EPS file with
Cyrillic characters inside. Too bad.

Changed in inkscape:
importance: Low → Medium
status: New → Confirmed

Buliabyak-users (buliabyak-users) wrote :

No SVG file to test; assuming fixed, as current SVN exports EPS and PDF via cairo with all characters in place.

Changed in inkscape:
status: Confirmed → Fix Released