Incorrectly detects text files with accented characters as binary
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
flip (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
If a text file includes accented characters, flip incorrectly deems it to be binary and skips converting it. Without -b, it only seems to convert files that contain nothing but ASCII characters.
This is with flip 1.20-1.
For example, using the attached ingredients.txt:
$ flip -uv ingredients.txt
ingredients.txt: binary file, not converted
I expected that it would convert ingredients.txt, since it is a text file.
But file detects it as being text:
$ file ingredients.txt
ingredients.txt: UTF-8 Unicode text
As does Perl:
$ perl -wE 'say "$_ is ", -T $_ ? q[text] : q[binary] foreach @ARGV' ingredients.txt risotto.jpg
ingredients.txt is text
risotto.jpg is binary
And dos2unix spots that it is text, while also managing to avoid processing an actual binary file:
$ dos2unix ingredients.txt risotto.jpg
dos2unix: converting file ingredients.txt to Unix format ...
dos2unix: Skipping binary file risotto.jpg
So ideally flip's heuristic for determining what's a text file would be tweaked to agree with those other utilities.
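For illustration, here is a rough sketch of the kind of heuristic I have in mind. It is not flip's actual code, and I am not claiming it matches what file, Perl's -T, or dos2unix do internally. The names looks_like_text and is_text_file, the 8192-byte sample size, and the set of allowed control bytes are all my own assumptions. The idea is simply that bytes above 0x7F should not count against a file as long as they form valid UTF-8.

```python
import codecs

# Assumption: the only control bytes a text file normally contains
# are tab, line feed, form feed and carriage return.
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0C, 0x0D}


def looks_like_text(data: bytes) -> bool:
    """Hypothetical heuristic: no NULs, no stray control bytes, valid UTF-8."""
    if b"\x00" in data:
        return False
    if any(b < 0x20 and b not in ALLOWED_CONTROLS for b in data):
        return False
    # Use an incremental decoder so that a sample cut off in the middle
    # of a multi-byte character is not mistaken for invalid UTF-8.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(data, final=False)
    except UnicodeDecodeError:
        return False
    return True


def is_text_file(path: str, sample_size: int = 8192) -> bool:
    """Apply the heuristic to the first sample_size bytes of a file."""
    with open(path, "rb") as handle:
        return looks_like_text(handle.read(sample_size))
```

With a check along these lines, a UTF-8 file with accented characters would pass, while a JPEG would be rejected both for its NUL bytes and for its invalid byte sequences. Text in legacy 8-bit encodings such as Latin-1 would usually still be rejected by this sketch, so a real fix might want to be more lenient there.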
Alternatively, flip's manpage should document that it's limited to ASCII files. (I know I can use -b to override the heuristic, but that's problematic when mass-converting a bunch of files, some of which really are binary.)
Thanks.
Status changed to 'Confirmed' because the bug affects multiple users.