sharing amule temp files with Vista in ntfs partition gives BSOD

Bug #279017 reported by Nicolò Chieffo
Affects: ntfs-3g (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: ntfs-3g

I keep the amule "Temp" directory on an NTFS filesystem so I can resume the downloads on Windows too.
Sometimes, when I start downloading with emule in Vista, I get a blue screen of death reporting a "page fault".
This does not happen with every file, but once it happens with a file I must remove it (or better, finish the download in Linux), because that file triggers the BSOD every time. The completed file does not trigger the BSOD.

I'm using ntfs-3g 1:1.2506-1ubuntu2

Also filed a bug against amule (upstream) but they told me that the error cannot be in amule

Szabolcs Szakacsits (szaka) wrote :

Thank you for the bug report. The NTFS-3G project is very interested in helping you solve your problem. Please follow the steps below so that we can help you:

1. Reproduce the problem.

2. Create the NTFS debug information according to http://www.linux-ntfs.org/doku.php?id=ntfsclone#store_only_ntfs_metadata

3. Run 'chkdsk DRIVE: /F' and reboot into Windows for it to take effect.

4. Repeat step 1.

5. Try to reproduce the problem on Windows.

6. Send the files you got in step 1. and 3. to szaka AT ntfs-3g.org or make them available somewhere for download and let us know if step 3 helped.

Thank you, Szaka

==
NTFS-3G Lead Developer: http://ntfs-3g.org

Szabolcs Szakacsits (szaka) wrote :

Sorry, step 4 had a typo, it should be:

4. Repeat step 2.

It's important to create the NTFS debug info right after running CHKDSK and before trying to reproduce the problem.

Szabolcs Szakacsits (szaka) wrote :

And of course step 6 was supposed to say:

6. Send the files you got in step 2. and 4. to szaka AT ntfs-3g.org or make them available somewhere for download and let us know if step 3 helped.

Nicolò Chieffo (yelo3) wrote : Re: [Bug 279017] Re: sharing amule temp files with Vista in ntfs partition gives BSOD

I have done this sequence:

1- start downloading something in emule (a ~700 MB file)
2- go to windows and start emule; after the hashing, when the file
started to be uploaded, I got the BSOD
3- go to linux; the partition cannot be mounted because it was not
cleanly unmounted by windows
4- create NTFS debug information

ERROR:
LANG=C sudo ntfsclone --metadata --output ntfsmeta.img /dev/sda5
ntfsclone v2.0.0 (libntfs 10:0:0)
NTFS volume version: 3.1
Cluster size : 4096 bytes
Current volume size: 155753869312 bytes (155754 MB)
Current device size: 155753869824 bytes (155754 MB)
Scanning volume ...
100.00 percent completed
Accounting clusters ...
Space in use : 33880 MB (21.8%)
ERROR(95): Opening 'ntfsmeta.img' as NTFS failed: Operation not supported

exactly the same error using the command
ntfsclone --ignore-fs-check -mo inconsistent-ntfs.img /dev/sda5

Both files were created and are 154 GB each. Should I upload them to you
even though the error occurred?
Please answer as soon as you can, since I have to mount the NTFS
partition to continue studying.

Szabolcs Szakacsits (szaka) wrote :

Use the --force option.

Szabolcs Szakacsits (szaka) wrote :

Please also send the output of

egrep -i 'ntfs|ata|sd|I/O' /var/log/daemon.log

Instead of /var/log/daemon.log, the file could also be /var/log/messages, /var/log/messages.log, /var/log/syslog or something else distribution-specific.
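
Because the log location differs between distributions, the search can be wrapped in a small helper that tries several candidates. This is only a minimal sketch, assuming a POSIX shell and `grep -E` (the current spelling of `egrep`); the function name `grep_storage_logs` is made up for illustration and is not part of any NTFS-3G tooling.

```shell
# Search the first readable log among the given candidates for
# NTFS, ATA, SCSI-disk and I/O related lines (case-insensitive).
# grep_storage_logs is a hypothetical helper name, not an existing tool.
grep_storage_logs() {
    for log in "$@"; do
        if [ -r "$log" ]; then
            echo "== $log =="
            grep -E -i 'ntfs|ata|sd|I/O' "$log" || echo "(no matching lines)"
            return 0
        fi
    done
    echo "no readable log file among: $*" >&2
    return 1
}
```

It could be called as `grep_storage_logs /var/log/daemon.log /var/log/messages /var/log/messages.log /var/log/syslog`, with the output redirected to a file for attaching to the report.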

Nicolò Chieffo (yelo3) wrote :

Same error even with --force :( Do you still want the result of the
"error" file?

Maybe I should first cleanly unmount the device using Linux... But
this may hide the cause of the Windows BSOD.

Szabolcs Szakacsits (szaka) wrote :

What are the outputs of

ntfs-3g.probe --readwrite /dev/sda5
ntfs-3g.probe --readonly /dev/sda5

Yes, the log file is important even if we can't yet create the NTFS debug file. We will work out a solution for you depending on the above results.

Szabolcs Szakacsits (szaka) wrote :

Cancel the ntfs-3g.probe requests. I just realized you use ntfsprogs-2.0.0 and have a problem with the image file, not the NTFS device.

Please use ntfsclone from ntfsprogs 1.13.1. ntfsprogs-2.0.0 has problems.

Szabolcs Szakacsits (szaka) wrote :

I'll make you an ntfsclone binary which you can use, please wait a bit.

Szabolcs Szakacsits (szaka) wrote :

Please download http://ntfs-3g.org/download/ntfsclone-1.13.2-wip.tgz

Unpack it and use ntfsclone-1.13.2-wip instead of ntfsclone.

Thanks.

Nicolò Chieffo (yelo3) wrote :

sudo ./ntfsclone-1.13.2-wip --ignore-fs-check -f -m -o ntfsmeta.img /dev/sda5

gives the same output (except for the version, of course!), so
proceeding with probe.

sudo ntfs-3g.probe --readwrite /dev/sda5
$LogFile indicates unclean shutdown (0, 0)

sudo ntfs-3g.probe --readonly /dev/sda5
no output

I think I have to clean the log file before using ntfsclone. How do I
clean the log file? can I mount the filesystem with --force and then
umount it?

Szabolcs Szakacsits (szaka) wrote :

Well, if the journal file is unclean then ntfsclone-1.13.2-wip will fail too.

There are two choices:

1. Run ntfsfix on /dev/sda5 before running ntfsclone (any version).

2. Run ntfsfix on ntfsmeta.img then ntfsclone ntfsmeta.img.
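
The first choice can be sketched as a short script. This is a hedged illustration only: the `run` and `fix_then_clone` helper names and the `DRY_RUN` variable are assumptions introduced here, while the `ntfsclone --metadata --output` invocation is the one already used earlier in this report. Since ntfsfix writes to the volume, the sketch only prints the commands unless `DRY_RUN=0` is set.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (needs root and modifies the volume).
DRY_RUN=${DRY_RUN:-1}

# run is a hypothetical wrapper: echo the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Choice 1: fix the device first, then clone only its metadata.
# $1 = NTFS device, $2 = output image file.
fix_then_clone() {
    dev=$1
    img=$2
    run ntfsfix "$dev" &&
    run ntfsclone --metadata --output "$img" "$dev"
}
```

For example, `fix_then_clone /dev/sda5 ntfsmeta.img` prints the two commands in order; the clone step is skipped if ntfsfix fails.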

Szabolcs Szakacsits (szaka) wrote :

Yes, --force ntfs-3g mount and unmount should also fix it.

Nicolò Chieffo (yelo3) wrote :

mounted and umounted it. now the image creates successfully.
but bzip2 is sloooooow :D
I hope that the file produced will be of a reasonable size...

after that I will
- boot into windows
- chkdsk d: /f
- reboot into linux
- collect ntfs debug data
- reboot into windows
- test if BSOD again
- tell you if BSOD again
- send you the 2 ntfs debugs

is this correct?

Szabolcs Szakacsits (szaka) wrote :

Good job! :-)

Yes, bzip2 is unfortunately very slow. We will solve this issue in the future by using a different NTFS debug image format.

The actions are correct.

I would also need the output of

egrep -3i 'ntfs|ata|sd|i/o' /var/log/daemon.log

and the name of the file which crashes Vista.

Nicolò Chieffo (yelo3) wrote :
Attachment: log (26.2 KiB, application/octet-stream)

This produces lots of output, including lines not related to NTFS!
Attaching it.
The file which crashes Vista is one of the files in this dir:
d:\Nicolò\Downloads\mule\temp\
Maybe 001.part or 002.part, which are the files created by amule and
into which amule writes the downloaded parts.
The file was created on Monday, 6 October 2008.

Szabolcs Szakacsits (szaka) wrote :

NTFS interoperates with several other subsystems, and if one of them fails then the problem appears as an NTFS bug when in fact it isn't. That's why more, non-NTFS info is needed, to narrow down where the problem is.

The good news is that there is no sign of any hardware problem.

However, it seems the NTFS partitions are not unmounted properly, or at all, by Ubuntu. This can indeed cause NTFS corruption. This seems to happen all the time, at least with /dev/sda1.

The situation is also strange with /dev/sda5 because it was only unmounted once, yesterday night (besides today's mount/unmount, which shows up fine in the log), but I can't see when and how /dev/sda5 was mounted. Earlier than the 3rd of October? Then how could you report the problem yesterday? So something seems to be very strange about how Ubuntu handles mounts and unmounts.

The NTFS debug info files should reveal more.

Nicolò Chieffo (yelo3) wrote :

I hardly ever mount /dev/sda1, so maybe that's why you don't see any
umount. Also I had 3 hard lockups so I had to switch off my pc.

Well, I cannot understand why the mount was not reported by the log files...
I mount it through fstab
UUID=4CF8C602F8C5E9F2 /media/Dati ntfs defaults,umask=007,gid=46 0 1

Anyway I reported the problem yesterday but I actually did the tests today

Nicolò Chieffo (yelo3) wrote :

4 hours to do the first compression!
See you tomorrow for the second :D

Szabolcs Szakacsits (szaka) wrote :

The file metadata is indeed inconsistent.

Did you start to download the file in Linux or Vista?

What is your Vista version? Do you have Vista Patches, Service Packs?

Nicolò Chieffo (yelo3) wrote :

I started the download in linux
I have vista upgraded to sp1+patches.
Now I will reboot into windows and run chkdsk and create the new debug

Nicolò Chieffo (yelo3) wrote :

Hello, I've run chkdsk and saved the ntfs debug.
Then I rebooted into windows again but I still got the BSOD!
so chkdsk does not resolve the problem :(
do you think that you still need the ntfs debug?

Szabolcs Szakacsits (szaka) wrote :

Yes, chkdsk has many problems unfortunately. If it still crashes then no need for the other debug file.

What are the Amule versions for Windows and Linux?

You don't have Windows Home Server, do you?

Microsoft is having major problems with NTFS file corruptions and they even had to hire back one of their main NTFS architects recently.
http://support.microsoft.com/kb/946676
http://blogs.zdnet.com/Bott/?p=473

Some of their corruptions are related to compressed files, and it seems, this is the case here too. But I don't have any clue why they made the file quasi-compressed (corrupted metadata) instead of continuing the file updates as non-compressed, as Linux started.

Nicolò Chieffo (yelo3) wrote :

amule 2.2.2
emule 0.49b
Do I need windows home server to do some tests?
I can have windows 2003 server, if needed.

Anyway I think that the BSOD comes even when copying the file, not
only when using emule.
Wait, I didn't enable compression... Why is the file marked
compressed? (if I understood)

Szabolcs Szakacsits (szaka) wrote :

That would be a very good observation! If the BSOD occurs when you copy the file, then please run the command below in a Linux shell. It is one line:

for i in $(seq 1 200); do dd if=/dev/zero of=sparse bs=4096 seek=$((2*$i)) count=1 conv=notrunc; done

Then try to copy the 'sparse' file on Windows. If that also triggers a BSOD then we are much closer. Thanks!

No, the file is not compressed, it's sparse only.
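
Whether a file such as 'sparse' or 001.part really contains holes can be checked on Linux by comparing its length with the space actually allocated for it. A minimal sketch, assuming GNU coreutils `stat`; the `is_sparse` function name is made up for illustration, and the check is only a heuristic, since a file stored compressed by the filesystem could also occupy less space than its length.

```shell
# Succeeds (exit 0) when FILE occupies fewer bytes on disk than its length,
# which suggests it contains at least one unallocated hole.
# is_sparse is a hypothetical helper name; requires GNU stat.
is_sparse() {
    size=$(stat -c %s "$1") || return 2      # apparent length in bytes
    blocks=$(stat -c %b "$1") || return 2    # number of allocated blocks
    bsize=$(stat -c %B "$1") || return 2     # bytes per block reported by %b
    [ $((blocks * bsize)) -lt "$size" ]
}
```

For example, `is_sparse sparse && echo "has holes" || echo "fully allocated"` reports the state of the test file created by the dd loop.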

Nicolò Chieffo (yelo3) wrote :

Do you think that sparse files are wrongly managed by ntfs-3g?
This might be true, since when the file is completed (and thus
re-created) it does not cause any BSOD.

Do you know if amule can be set up to not create sparse files? if so
we could also test if started files can be resumed by emule without
any problems.

I will also test the following: start the file in emule, resume in
amule and test back what emule does

Nicolò Chieffo (yelo3) wrote :

Wait: can't it be that emule does not use sparse file, and amule does?
so emule accesses the file "normally" which causes the page fault!

Szabolcs Szakacsits (szaka) wrote :

On Wed, 8 Oct 2008, [utf-8] Nicolò Chieffo wrote:

> Do you think that sparse files are wrongly managed by ntfs-3g?

They are fine. But your case is more complex than the usual one.
It also could be a unique, subtle problem. Nobody else reported
this problem and I also can't reproduce it.

> This might be true, since when the file is completed (and thus
> re-created) it does not cause any BSOD.

If the file caused a BSOD but causes no problem once it is complete, then please run

ntfsinfo -fv -F file_name device

on Linux and send the full output.

> Do you know if amule can be set up to not create sparse files? if so we
> could also test if started files can be resumed by emule without any
> problems.

I don't know if amule supports this. But I think the issue is sparse file
related.

> I will also test the following: start the file in emule, resume in
> amule and test back what emule does

Ok.

> Wait: can't it be that emule does not use sparse file, and amule does?
> so emule accesses the file "normally" which causes the page fault!

Sparse file handling is transparent and a user space application definitely
shouldn't crash the OS. This is a serious Microsoft bug but we need to
figure out where the real problem is. A file system driver should never
crash, not even with corrupted files and data.

Nicolò Chieffo (yelo3) wrote :

> They are fine. But your case is more complex than the usual one.
> It also could be a unique, subtle problem. Nobody else reported
> this problem and I also can't reproduce it.

Really? To tell the truth, I've always had this error, since I bought
my laptop in January, but I didn't test whether I had the problem on other
PCs too.
Anyway, I have reformatted and reinstalled both Linux and Windows
Vista many times, so the problem is not installation-specific.

I'll have to wait for the file to be downloaded before I can attach the
output of ntfsinfo -fv -F file_name device

Szabolcs Szakacsits (szaka) wrote :

Can you write down the Vista BSOD messages, or take a photo of the screen?

Nicolò Chieffo (yelo3) wrote :

Copying the sparse file works without problems :(
Copying 001.part does not give the BSOD either.
So maybe the problem was in writing that file!

Unfortunately I can't take a photo right now, and I cannot write down
the information because it is shown for only a few seconds.
But I found this when coming back to Windows:

Problem signature
Event problem name: BlueScreen
OS Version: 6.0.6001.2.1.0.768.3
Local settings ID: 1040

Files that help describe the problem (some files may no longer be
available)
Mini100908-01.dmp
sysdata.xml
Version.txt
(attached in a zip)

Other problem information
BCCode: 50
BCP1: A8AB8001
BCP2: 00000000
BCP3: 81CEF7B2
BCP4: 00000000
OS Version: 6_0_6001
Service Pack: 1_0
Product: 768_1

Szabolcs Szakacsits (szaka) wrote :

It's possible that this is a pure Windows bug, related either to Microsoft's NTFS driver or to a device driver. Google found many similar crashes, but none of them involved NTFS-3G, only Windows recovery.

Here is an idea how we could prove and document this is a pure Windows bug.

1. Install Cygwin (a Linux-like environment for Windows): http://www.cygwin.com/

2. Create a file which makes Vista BSOD.

3. Copy the file on Windows using the Cygwin 'cp' command as follows:

    cp --sparse=always 001.part 001.part.win

4. Ntfsclone the NTFS volume for potential later investigation.

5. Replace 001.part (this shouldn't change the problematic NTFS metadata):

    rename 001.part 001.part.linux
    rename 001.part.win 001.part

6. Continue the download on Windows.

If Vista still BSODs then it's proven that this is a pure Vista bug and unrelated to NTFS-3G.

If it works then please send the NTFS debug image file you got in step 4.

Thanks!

Szabolcs Szakacsits (szaka) wrote :

The ZIP package at http://www.flexhex.com/docs/articles/download/sparse.zip contains a CS.EXE file which is supposed to work similarly to the Cygwin cp --sparse=always command. With it, the test could be simpler.

Szabolcs Szakacsits (szaka) wrote :

I've found the below on http://channel9.msdn.com/forums/Coffeehouse/256701-QuickTime-BSODs-Vista-solution

"That eMule had to remove all the sparse file support on Vista because of Vista bugs"

So it's quite possible that this is a pure Vista bug that any software can trigger, not only NTFS-3G.

Szabolcs Szakacsits (szaka) wrote :

Cygwin cp --sparse=always doesn't work. The resulting files are not sparse. CS.EXE also doesn't work on my x64 system because it's a 32-bit EXE file. Maybe you'll have more luck.

Of course it's also possible that Microsoft silently disabled sparse file support because they had too many reliability problems with them. File system development is not too easy.

Nicolò Chieffo (yelo3) wrote :

Oh, I understand now. There should be a way in amule to
disable sparse file support. Or at least, is there a mount option for
ntfs-3g to disable sparse files?

I don't want to do all those tests; it's now fairly well established that the
problem is in Vista...

Nicolò Chieffo (yelo3) wrote :

In fact I didn't have this problem on XP. Have you got XP?

Szabolcs Szakacsits (szaka) wrote :

No such problem was reported with XP. Sparse support can't be disabled for NTFS-3G because it's part of the POSIX specification.

Nicolò Chieffo (yelo3) wrote :

Ok, so let's mark it as invalid for now!
Thanks and sorry

Changed in ntfs-3g:
status: New → Invalid

slow_joe (rustavenue) wrote :

Just wanted to add I had the same problem: eMule worked fine in Vista, aMule in Ubuntu then worked fine using same temp files (on an NTFS volume), but then running back in Vista again caused BSoD within a few seconds of a download commencing. BSoD ceased occurring after I cancelled all the downloads in Vista and requeued. Vista can not handle files after they have been modified using aMule.

slow_joe (rustavenue) wrote :

BTW, the error message on the blue screen was:

PAGE_FAULT_IN_NON_PAGED_AREA

Don't know if that helps anyone.

Szabolcs Szakacsits (szaka) wrote :

This is a known Vista problem handling sparse files. Please contact Microsoft. Good luck!
