Comment 8 for bug 745243

Mikkel Kamstrup Erlandsen (kamstrup) wrote : Re: [dash] wrong search result in Chinese

I've been looking into CJK indexing in Xapian and the prospects are slightly dire... See http://trac.xapian.org/ticket/180

The upshot is that this will require some work. There are libs we can pull in for this (like http://code.google.com/p/cjk-tokenizer/), but we'll then have to write some glue code by hand to wire them into the indexing and query-parsing subsystems for S-C and u-p-a.
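
To give a rough idea of what such a tokenizer does, here is a minimal Python sketch of overlapping-bigram splitting, a common way to index CJK text that has no spaces between words. It is an illustration only: the function name bigram_tokens is made up, the character range covers just the basic CJK Unified Ideographs block, and it is not claimed to match what cjk-tokenizer or Xapian actually do. The glue code would need to run the same splitting on both the indexed text and the user's query so the terms line up.

```python
import re

# Assumption: only the basic CJK Unified Ideographs block (U+4E00..U+9FFF)
# is treated as CJK. A real tokenizer would cover more blocks (kana, hangul,
# extension ranges, and so on).
_RUNS = re.compile(r"[\u4e00-\u9fff]+|[^\u4e00-\u9fff\s]+")


def _is_cjk(ch):
    return "\u4e00" <= ch <= "\u9fff"


def bigram_tokens(text):
    """Split text into terms: overlapping bigrams for CJK runs,
    lowercased whole words for everything else (hypothetical helper)."""
    tokens = []
    for match in _RUNS.finditer(text):
        run = match.group(0)
        if _is_cjk(run[0]):
            if len(run) == 1:
                # A lone CJK character becomes its own term.
                tokens.append(run)
            else:
                # Overlapping pairs: ABCD -> AB, BC, CD.
                tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        else:
            tokens.append(run.lower())
    return tokens


if __name__ == "__main__":
    # The same function would be applied at index time and at query time.
    print(bigram_tokens("Firefox 浏览器"))
```

With this scheme a query for a two-character word matches any document containing that pair of adjacent characters, at the cost of a larger index and some false positives across word boundaries.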

Anyone with a simpler solution is more than welcome to chime in :-)