xz/bzip/gzip fails with hardlink errors on ntfs partition (files with > 1 link)

Bug #722910 reported by Jon Hood
This bug affects 6 people

Affects            Status     Importance  Assigned to  Milestone
gzip               New        Undecided   Unassigned
bzip2 (Ubuntu)     Confirmed  Undecided   Unassigned
xz-utils (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Binary package hint: xz-utils

After mounting an NTFS partition using Dolphin in Kubuntu, xz fails to compress any file on the NTFS partition.
$ [sudo] xz -9 COMP6366_V01_02_15_2011.wmv
xz: COMP6366_V01_02_15_2011.wmv: Input file has more than one hard link, skipping

More information about the file:
$ ls -la COMP6366_V01_02_15_2011.wmv
-rwxrwxrwx 2 root root 332122239 2011-02-15 16:36 COMP6366_V01_02_15_2011.wmv

Workaround: Renaming the file will allow xz to perform the compression.
$ mv COMP6366_V01_02_15_2011.wmv COMP6366_V01_02_15_2011.wmv2 && mv COMP6366_V01_02_15_2011.wmv2 COMP6366_V01_02_15_2011.wmv
$ xz -9 COMP6366_V01_02_15_2011.wmv
$ ls -la COMP6366_V01_02_15_2011.wmv.xz
-rwxrwxrwx 1 root root 325158580 2011-02-15 16:36 COMP6366_V01_02_15_2011.wmv.xz

Any file that is modified or placed onto the NTFS partition from within Ubuntu is fine. This bug only exists for files that already existed on my Windows XP system.

$ lsb_release -rd
Description: Ubuntu 10.10
Release: 10.10
$ xz -V
xz (XZ Utils) 4.999.9beta
liblzma 4.999.9beta
$ apt-cache policy xz-utils
xz-utils:
  Installed: 4.999.9beta+20100527-1
  Candidate: 4.999.9beta+20100527-1
  Version table:
 *** 4.999.9beta+20100527-1 0
        500 http://us.archive.ubuntu.com/ubuntu/ maverick/main amd64 Packages
        100 /var/lib/dpkg/status

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: xz-utils 4.999.9beta+20100527-1
ProcVersionSignature: Ubuntu 2.6.35-23.40-generic 2.6.35.7
Uname: Linux 2.6.35-23-generic x86_64
NonfreeKernelModules: nvidia
Architecture: amd64
Date: Mon Feb 21 18:39:05 2011
InstallationMedia: Kubuntu 9.10 "Karmic Koala" - Release amd64 (20091027)
ProcEnviron:
 LANGUAGE=
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: xz-utils

K1773R (k1773r) wrote :

got the same problem in lucid

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xz-utils (Ubuntu):
status: New → Confirmed

Joshua Rocky Tuahta Purba (jrocky) wrote :

Same problem in Precise (12.04.2).

Mechanical snail (replicator-snail) wrote :

Also happens for files created on a Windows 7 computer. In my case the drive was converted from FAT using `convert` on Windows.

Mechanical snail (replicator-snail) wrote :

Same thing happens with bzip2:

user@host:/media/DISK$ bzip2 --keep FILE
bzip2: Input file FILE has 1 other link.

I haven't tried gzip since it doesn't have a --keep option, and I don't want my file deleted.

lzop works as expected.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in bzip2 (Ubuntu):
status: New → Confirmed

xhienne (xhienne) wrote :

Refusing to compress a file with a link count > 1 unless you give the --force option is the normal (yet unexplained, AFAIK) behavior of many compression tools like gzip, bzip2 and xz. The actual questions here are "why do my NTFS files have a link count of 2?" (see COMP6366_V01_02_15_2011.wmv in the OP) and "why does this link count drop to 1 when I rename the file?"

This is due to the DOS filename mangling system (see https://en.wikipedia.org/wiki/Filename_mangling and https://en.wikipedia.org/wiki/8.3_filename). When a file (say "TextFile.Mine.txt") was copied onto the NTFS partition together with an 8.3 mangled filename (say "TEXTFI~1.TXT"), the ntfs-3g driver presents it only once, in its long form. But the file remains accessible by both names: "TEXTFI~1.TXT" (hidden) and "TextFile.Mine.txt" (displayed). Hence the link count of 2, one for each name. This is also why the link count drops to 1 when you rename the file: the mangled name disappears.
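
The link count discussed here can be inspected directly. The following is only a sketch, assuming GNU coreutils `stat` and an ordinary filesystem in a temporary directory; the explicit hard link stands in for the hidden 8.3 name, and the file names are just the illustrative ones used above.

```shell
# Sketch: watch the link count change when a second name appears/disappears.
# Assumes GNU coreutils `stat`, where -c '%h' prints the number of hard links.
tmpdir=$(mktemp -d)
touch "$tmpdir/TextFile.Mine.txt"

# A second name for the same file stands in for the hidden 8.3 alias.
ln "$tmpdir/TextFile.Mine.txt" "$tmpdir/TEXTFI~1.TXT"
links_before=$(stat -c '%h' "$tmpdir/TextFile.Mine.txt")

# Dropping the second name brings the count back down.
rm "$tmpdir/TEXTFI~1.TXT"
links_after=$(stat -c '%h' "$tmpdir/TextFile.Mine.txt")

echo "before: $links_before, after: $links_after"
rm -rf "$tmpdir"
```

On an affected NTFS file, running `stat -c '%h'` before and after the rename workaround should show the same 2 → 1 transition as the `ls -la` output in the original report.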

You may want to reassign this bug to the ntfs-3g package, arguing that the link count should be 1. Personally I don't think this is an obviously bad choice, but others may disagree.

So, what else? Should we close this bug? Actually, I don't think so. I think it should be renamed to "Gzip / bzip2 / xz refuse to compress files with a link count > 1". As said above, I have yet to read a good reason for gzip/bzip2/xz to _refuse_ to compress a file with more than one hard link. Even with the --keep option! Why not just a mere warning? Imposing a --force flag on e.g. script writers for the sole reason that files may have many hard links, does not seem quite safe to me.
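
For script writers who would rather not pass --force, one possible approach is to compress to standard output, which leaves the input file untouched. This is a sketch under the assumption that the installed gzip applies the link-count check only when replacing the input file in place; the file names and the temporary directory are hypothetical.

```shell
# Sketch: compress a multiply-linked file without --force by writing to stdout.
# Assumption: gzip only refuses multiply-linked inputs for in-place compression.
tmpdir=$(mktemp -d)
printf 'hello\n' > "$tmpdir/foo"
ln "$tmpdir/foo" "$tmpdir/bar"      # bar now has a link count of 2

# In-place compression is refused and the input is left unchanged.
gzip "$tmpdir/bar" || echo "gzip refused the multiply-linked file"

# Writing to stdout does not replace the input file.
gzip -c "$tmpdir/bar" > "$tmpdir/bar.gz"

# Round-trip check: decompress and compare with the original content.
roundtrip=$(gzip -dc "$tmpdir/bar.gz")
echo "$roundtrip"
rm -rf "$tmpdir"
```

Unlike plain `gzip FILE`, this keeps the original file and does not copy its timestamps or permissions to the output file, so it is not a drop-in replacement.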

gzip is a very old program (25 years old?). My guess is that this is an unfortunate legacy of a formerly legitimate behavior (for security reasons for example?) which has absolutely no raison d'être nowadays. bzip2 and xz have just blindly copied gzip's interface.

Jon Hood (squinky86) wrote :

I updated the title - thanks for the info! Great job tracking down the root cause of the issue.

summary: - xz fails with hardlink errors on ntfs partition
+ xz/bzip/gzip fails with hardlink errors on ntfs partition (files with >
+ 1 link)

xhienne (xhienne) wrote :

Thanks. IMHO, the title should no longer mention NTFS. The behavior can be observed on any filesystem.

Try this, for example, on an ext* FS:
# touch foo
# ln foo bar
# gzip bar
gzip: bar has 1 other link -- unchanged
# bzip2 bar
bzip2: Input file bar has 1 other link.
# xz bar
xz: bar: Input file has more than one hard link, skipping
# lzma bar
lzma: bar: Input file has more than one hard link, skipping
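
To see in advance which files the tools would skip, the files with more than one link can be listed. This is a minimal sketch assuming GNU findutils (`-printf` is a GNU extension); the directory and file names are hypothetical.

```shell
# Sketch: list regular files whose link count is greater than 1.
# Assumes GNU find (for -printf); -links +1 means "more than one link".
tmpdir=$(mktemp -d)
touch "$tmpdir/foo" "$tmpdir/solo"
ln "$tmpdir/foo" "$tmpdir/bar"      # foo and bar share one inode

multi=$(find "$tmpdir" -type f -links +1 -printf '%f\n' | sort | paste -sd ' ' -)
echo "multiply linked: $multi"
rm -rf "$tmpdir"
```

Here `solo` is not reported because it has a single link, while both `foo` and `bar` are.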

Note: I have also marked this bug as affecting the gzip project.
