Incorrectly detects text files with accented characters as binary

Bug #1199404 reported by Smylers on 2013-07-09
This bug affects 3 people
Affects: flip (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

If a text file includes some accented characters then flip incorrectly deems it to be binary, so skips converting it. Without -b, it seems to convert only files that consist entirely of ASCII characters.

This is with flip 1.20-1.

For example, using the attached ingredients.txt:

  $ flip -uv ingredients.txt
  ingredients.txt: binary file, not converted

I expected that it would convert ingredients.txt, since it is a text file.

But file detects it as being text:

  $ file ingredients.txt
  ingredients.txt: UTF-8 Unicode text

As does Perl:

  $ perl -wE 'say "$_ is ", -T $_ ? q[text] : q[binary] foreach @ARGV' ingredients.txt risotto.jpg
  ingredients.txt is text
  risotto.jpg is binary

And dos2unix spots that it is text, while also managing to avoid processing an actual binary file:

  $ dos2unix ingredients.txt risotto.jpg
  dos2unix: converting file ingredients.txt to Unix format ...
  dos2unix: Skipping binary file risotto.jpg

So ideally flip's heuristic for determining what's a text file would be tweaked to agree with those other utilities.

Alternatively, flip's manpage should document that it's limited to ASCII files. (I know I can use -b to override the heuristic, but that's problematic when mass-converting a bunch of files where some of them are actually binary files.)
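
Until the heuristic changes, the mass-conversion case might be handled by letting file(1) decide what counts as text and passing -b to flip only for those files. The following is a rough sketch of that idea, not a tested fix: is_text is a made-up helper name, the list of accepted encodings is an assumption (it deliberately leaves out multi-byte encodings such as UTF-16, where bytewise line-ending conversion would not be safe), and it relies on file's --mime-encoding option reporting "binary" for non-text data.

```shell
#!/bin/sh
# Workaround sketch: force flip (-b) only on files that file(1) reports
# as having an ASCII-compatible text encoding.

# is_text is a hypothetical helper, not part of flip or file.
is_text() {
    enc=$(file -b --mime-encoding -- "$1") || return 1
    case "$enc" in
        us-ascii|utf-8|iso-8859-1) return 0 ;;  # assumed safe to convert
        *) return 1 ;;                           # "binary", utf-16le, etc.
    esac
}

for f in "$@"; do
    if is_text "$f"; then
        # -u: convert to Unix line endings; -b: override flip's binary check
        flip -ub "$f"
    else
        echo "$f: skipped, not ASCII-compatible text" >&2
    fi
done
```

Saved under some name such as flip-text.sh (hypothetical), it would be run as "sh flip-text.sh ingredients.txt risotto.jpg", and should convert the first file while leaving the second untouched.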

Thanks.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in flip (Ubuntu):
status: New → Confirmed