Comment 25 for bug 808894

In , Adrian Johnson (ajohnson-redneon) wrote :

Created attachment 58178
convert utf-16 to ucs-4 when reading ToUnicode

The next two patches fix the problem that the "fix selection of glyphs in actualtext" patch does not handle surrogates. The "Unicode" type is meant to be UCS-4, so the solution is to convert UTF-16 to UCS-4 when the ToUnicode CMap is parsed.

This patch does the UTF-16 conversion in CharCodeToUnicode.cc. As a result, the special surrogate handling in TextOutputDev and HtmlOutputDev can be removed.
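
For readers unfamiliar with the conversion, here is a minimal sketch of combining UTF-16 surrogate pairs into UCS-4 code points. It is not the code from the attachment: the function name utf16ToUcs4 and the choice to map unpaired surrogates to U+FFFD are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the function in the attached patch):
// converts a sequence of UTF-16 code units to UCS-4 code points.
std::vector<uint32_t> utf16ToUcs4(const uint16_t *in, size_t len) {
    std::vector<uint32_t> out;
    for (size_t i = 0; i < len; ++i) {
        uint32_t u = in[i];
        const bool isHigh = (u >= 0xD800 && u <= 0xDBFF);
        const bool nextIsLow = (i + 1 < len &&
                                in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF);
        if (isHigh && nextIsLow) {
            // Combine the high/low surrogate pair into one code point.
            u = 0x10000 + ((u - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i; // the low surrogate has been consumed
        } else if (u >= 0xD800 && u <= 0xDFFF) {
            // Unpaired surrogate: substitute U+FFFD (an assumption of this sketch).
            u = 0xFFFD;
        }
        out.push_back(u);
    }
    return out;
}
```

With the units 0x0041, 0xD835, 0xDC00 as input, this yields two code points, U+0041 and U+1D400. Once each entry is stored this way, consumers of the "Unicode" type never see a surrogate, which is why the output devices no longer need their own handling.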