Activity log for bug #538669

Date Who What changed Old value New value Message
2010-03-14 10:55:52 Mattias Sch. bug added bug
2010-03-14 10:55:52 Mattias Sch. attachment added Proof of concept patch for ntfsclone.c, removing unnecessary lseek()s in the output file; breaks other things. http://launchpadlibrarian.net/40924445/ntfsclone-proof-of-concept.patch
2010-03-14 11:30:18 Brian Murray tags patch
2010-03-14 11:39:18 Mattias Sch. description I noticed that the performance of ntfsclone is extremely bad when backing up to an NTFS volume. Instead of copying at a speed of several tens of MB/s (like when backing up to ext4) it slows down to about 0.5 MB/s. 1) The release of Ubuntu you are using Description: Ubuntu 9.10 (karmic) Release: 9.10 2) The version of the package you are using ntfsprogs (2.0.0-1ubuntu3) The same behaviour was observed with ntfsprogs (2.0.0-1ubuntu2) in 8.10 (intrepid). 3) What you expected to happen Fast backup at a speed of several MB/s. 4) What happened instead The backup was extremely slow (0.5 MB/s). While running the backup I noticed that the ntfs-3g process for the destination volume was running at 100% CPU usage. I suspected a bug in NTFS-3G, but that isn't the case. strace shows that NTFS-3G spends most of its time doing index and other MFT lookups, which led me to suspect expensive or poorly implemented sparse file operations on NTFS. ntfsclone calls lseek_to_cluster() for every cluster it copies, even if the current cluster follows directly after the previous cluster. Instead of streaming the data, NTFS-3G is forced to seek from the beginning of the destination file, which results in poor performance. In ntfsclone.c I found a note that ntfsclone also has poor performance when writing to ReiserFS. That might actually be related. I hacked together a small proof of concept patch for ntfsclone that removes the unnecessary output file seeks. Now the backup of a 75 GB partition with 17 GB used takes only 1h 15m instead of the ~9h it took before. The backups are identical, so the speedup does result from the reduced seeks and not from a bug introduced by me. The patch is attached, but be aware that this is a very rough proof of concept patch that does break other features of ntfsclone(!), such as backing up to stdout.
I suspect that really fixing this problem might be non-trivial, but I unfortunately don't really know about the inner workings of ntfsclone.
I noticed that the performance of ntfsclone is extremely bad when backing up to an NTFS volume. Instead of copying at a speed of several tens of MB/s (like when backing up to ext4) it slows down to about 0.5 MB/s. 1) The release of Ubuntu you are using Description: Ubuntu 9.10 (karmic) Release: 9.10 2) The version of the package you are using ntfsprogs (2.0.0-1ubuntu3) The same behaviour was observed with ntfsprogs (2.0.0-1ubuntu2) in 8.10 (intrepid). 3) What you expected to happen Fast backup at a speed of several MB/s. 4) What happened instead The backup was extremely slow (0.5 MB/s). While running the backup I noticed that the ntfs-3g process for the destination volume was running at 100% CPU usage. I suspected a bug in NTFS-3G, but that isn't the case. strace shows that NTFS-3G spends most of its time doing index and other MFT lookups, which led me to suspect expensive or poorly implemented sparse file operations on NTFS. ntfsclone calls lseek_to_cluster() for every cluster it copies, even if the current cluster follows directly after the previous cluster. Instead of streaming the data, NTFS-3G is forced to seek from the beginning of the sparse destination file, which results in poor performance. In ntfsclone.c I found a note that ntfsclone also has poor performance when writing to ReiserFS. That might actually be related. I hacked together a small proof of concept patch for ntfsclone that removes the unnecessary output file seeks. Now the backup of a 75 GB partition with 17 GB used takes only 1h 15m instead of the ~9h it took before. The backups are identical, so the speedup does result from the reduced seeks and not from a bug introduced by me. The patch is attached, but be aware that this is a very rough proof of concept patch that does break other features of ntfsclone(!), such as backing up to stdout.
I suspect that really fixing this problem might be non-trivial, but I unfortunately don't really know about the inner workings of ntfsclone.
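The idea behind the description above (seek only when the next cluster does not directly follow the previous one) can be sketched roughly as below. This is not the attached patch and not code from ntfsclone.c: the struct out_stream and the function write_cluster() are hypothetical names made up for illustration, and the sketch assumes the output is a seekable regular file, so it does not address the stdout case mentioned in the report.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for the output file (not from ntfsclone.c). */
struct out_stream {
    int fd;              /* seekable output file descriptor */
    off_t pos;           /* offset where the next write() will land */
    unsigned long seeks; /* number of lseek() calls actually issued */
};

/*
 * Write one cluster at logical cluster number lcn.
 * lseek() is issued only if the target offset differs from the
 * offset we are already at, so runs of consecutive clusters are
 * written as a plain sequential stream.
 * Returns 0 on success, -1 on error (errno is set by lseek/write).
 */
int write_cluster(struct out_stream *out, int64_t lcn,
                  const void *buf, size_t cluster_size)
{
    off_t target = (off_t)lcn * (off_t)cluster_size;

    if (out->pos != target) {
        if (lseek(out->fd, target, SEEK_SET) == (off_t)-1)
            return -1;
        out->pos = target;
        out->seeks++;
    }

    const char *p = buf;
    size_t left = cluster_size;
    while (left > 0) {
        ssize_t n = write(out->fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before writing anything: retry */
            return -1;
        }
        p += n;
        left -= (size_t)n;
        out->pos += n; /* keep the tracked offset in sync with the fd */
    }
    return 0;
}
```

With a freshly opened output file (pos set to 0), writing clusters 0, 1, 2 and then 5 issues a single lseek(), for the jump from cluster 3 to cluster 5; the skipped range is left as a hole if the filesystem supports sparse files.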
2010-03-23 16:01:04 etali tags patch patch patch-needswork