Comment 16 for bug 116453

In , Jason Crain (jcrain) wrote :

Created attachment 112107
Remove combining characters from normalized text

This patch changes normalization so that combining characters are removed from the normalized text. This makes searching through TextPage::findText insensitive to these characters.

Also, renames unicodeNormalizeNFKC to unicodeNormalizeSearch to make it clear it's no longer doing a regular NFKC normalization.

Renames decomp_compat to decomp_compat_base because, in addition to compatibility decomposition, it now strips combining characters, leaving only base characters.
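
To illustrate the idea only (this is not the patch's code, which is C++ and uses poppler's own decomposition tables), here is a minimal Python sketch using the standard unicodedata module. The names normalize_for_search and contains_text are hypothetical, and the sketch assumes that "combining character" means any code point with a non-zero canonical combining class, which may differ from the exact test the patch uses.

```python
import unicodedata


def normalize_for_search(text: str) -> str:
    """Compatibility-decompose text, then keep only non-combining characters.

    Hypothetical helper: approximates the behaviour described for
    decomp_compat_base, with no recomposition step afterwards.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    # combining() returns the canonical combining class; 0 means the
    # character is not a combining mark under this definition.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


def contains_text(haystack: str, needle: str) -> bool:
    """Substring search that ignores combining characters on both sides."""
    return normalize_for_search(needle) in normalize_for_search(haystack)
```

With this sketch, contains_text("r\u00e9sum\u00e9", "resume") is true, and a ligature such as U+FB01 is still expanded to "fi" by the compatibility decomposition. Case folding is deliberately left out.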

Removes UnicodeCompTables.h and some compose functions. They're no longer needed since we're not recomposing the characters.

I'm not sure if UnicodeTypeTable.h and UnicodeCompTables.h are considered part of the public interface. They're included in the xpdf headers. Albert, is it OK to change these files in this way?