Comment 4 for bug 383102

Robert Collins (lifeless) wrote : Re: [Bug 383102] [NEW] bzr search can't find non-ascii text

On Thu, 2009-06-04 at 09:01 +0000, Alexander Belchenko wrote:
> I think bzr-search should use file content "as is", without decoding it
> to unicode, because there is currently no way to guess the encoding
> with complete certainty, and bzr has no file properties to attach this
> sort of info to the committed content.

This works too; OTOH it would be good to handle things like PNG files
with metadata headers more sensibly.

> In qbzr we're using a special command-line option, --encoding, to specify
> the file content encoding for diff/annotate. This approach works well.
> The default encoding there is utf-8.

> I suggest providing a similar option for the search command, e.g.
>
> bzr search тест --encoding cp1251
>
> so that the encoding argument is used to encode the command-line
> (unicode) argument тест into that specific encoding, and the resulting
> bytes are then used verbatim to search.
>
> Does that make sense to you?

It certainly works better with bzr's lack of knowledge of file
encodings. But how will bzr-search know how to output the file's
contents sensibly? (For the hit preview).
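
To make the question concrete, here is a minimal sketch of the proposal in
plain Python 3. It is not bzr-search code: the names encode_term and
find_hits are hypothetical, and reusing the --encoding value to decode the
preview is only one assumed answer. The term is encoded once, matched
against the raw bytes, and only the preview window is decoded, with
undecodable bytes replaced rather than raising.

```python
def encode_term(term, encoding="utf-8"):
    """Encode a unicode search term into the bytes expected in file content."""
    return term.encode(encoding)


def find_hits(content, term, encoding="utf-8", context=20):
    """Yield (byte_offset, preview) for each raw-byte match of term.

    content is the file content as bytes, searched "as is".
    The preview is decoded with the same (assumed) encoding as the term.
    """
    needle = encode_term(term, encoding)
    if not needle:
        return
    start = content.find(needle)
    while start != -1:
        lo = max(0, start - context)
        hi = min(len(content), start + len(needle) + context)
        # The window may cut a multi-byte character or contain bytes that
        # are invalid in this encoding; errors="replace" turns those into
        # U+FFFD instead of raising UnicodeDecodeError.
        preview = content[lo:hi].decode(encoding, errors="replace")
        yield start, preview
        start = content.find(needle, start + 1)


if __name__ == "__main__":
    data = "это тест файла".encode("cp1251")
    for offset, preview in find_hits(data, "тест", encoding="cp1251"):
        print(offset, preview)
```

With the wrong encoding the search simply finds nothing, since the encoded
bytes differ; it does not fail, which is the main attraction of matching on
raw bytes.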

-Rob