Crash when hashing files

Bug #473173 reported by davidgf
16
This bug affects 1 person
Affects Status Importance Assigned to Milestone
LinuxDC++
Fix Released
High
Razzloss

Bug Description

My linuxdcpp 1.0.3 crashes when hashing files from an external usb drive in a reiserfs partition.

I attach the bugreport. The error is this:

linuxdcpp: xcb_io.c:176: process_responses: La declaración `!(req && current_request && !(((long) (req->sequence) - (long) (current_request)) <= 0))' no se cumple.
Multiple segmentation faults occurred; can't display error dialog

I don't know why but sometimes restarting the program solves the error, but with some concrete files it always crashes!
So I think it depens on the file.

Thanks

Revision history for this message
davidgf (david-davidgf) wrote :
Revision history for this message
Steven Sheehy (steven-sheehy) wrote :

Thanks for the report. In order to solve this issue we need a backtrace of the crash from an executable compiled in debug mode. Also, there have been a few crashing fixes in the latest bzr trunk so please grab the latest copy to see if those fixes help.

Changed in linuxdcpp:
status: New → Incomplete
Revision history for this message
davidgf (david-davidgf) wrote :

I've found the error. The problem was due to an inconsistency in the FS. I checked fixed it and now everything works well.
But I don't know why it justs crashes, Gnome gives an Input/Output error while copying the files but it doesn't crash.

David

Revision history for this message
Razzloss (razzloss) wrote :

OK, marking this confirmed. The read errors aren't handled correctly (at all) when fasthash is in use.

--RZ

Changed in linuxdcpp:
importance: Undecided → Medium
status: Incomplete → Confirmed
tags: added: core crash
Revision history for this message
Razzloss (razzloss) wrote :

Maybe something like that. It isn't pretty, but don't complain unless you have a better idea.

Now I'd be happy if someone with a broken enough system could test this. It prevented the crash when the file was truncated during mmaped reads, but don't know if it will fix the other read errors (and are those errors reported during fallback slowhash or is that slowhash even used?).

--RZ

Revision history for this message
Steven Sheehy (steven-sheehy) wrote :

This may be a duplicate of bug #311478.

I didn't realize that mmap could throw a signal instead of just returning an error like read() does. I would suggest you move the sig handler & env variable down near the fastHash function instead of in the includes section. Also, in the case of sigsetjmp the sig handler would not be unregistered. The latest HashManager.cpp handles this by breaking out of the loop instead of closing the fd and returning when there's an error.

I hope the fastHash is significantly faster than the regular hash for all the headaches that it has been giving us....

Revision history for this message
Razzloss (razzloss) wrote :

It isn't mmap that throws the signal. It's the memory access to mmapped region. There's really no other way to "return" error from something like char a = mmappedFile[342];

As for sigsetjmp case, SA_RESETHAND flag to sigaction resets the handler to SIG_DFL in case the signal occurs. Granted this is not necessarily the same as restoring the oldact, but as long as we won't be setting any other handler for SIGBUS the end result is the same. But yeah, final patch should restore oldact just in case. Also SIGBUS probably should be masked in other threads to ensure it is handled by the hashing thread. (What I've understood SIGBUS should be delivered to the thread that caused it, but multithreaded programs+signals are bit of a mess...)

--RZ

Revision history for this message
Steven Sheehy (steven-sheehy) wrote :

Razzloss, if you can clean this up with the things we discussed, I can get it committed to dcpp.

Revision history for this message
Razzloss (razzloss) wrote :

Yeah, it's been stuck on my todo pile. At least the current patch has been succesfully tested by few people with problems while hashing.

--RZ

Revision history for this message
Steven Sheehy (steven-sheehy) wrote :

Ok, if you get a chance to look at anything on your pile, please make this one a priority. We get quite a bit of complaints about this one.

Changed in linuxdcpp:
assignee: nobody → Razzloss (razzloss)
importance: Medium → High
milestone: none → 1.1.0
tags: added: hashing
Revision history for this message
Razzloss (razzloss) wrote :

Pushed the slightly modified patch to the trunk. The old handler is now reset in case of a jmp.

Tested the case of truncating file, which seems to work ok == not crashing anymore. Other types of read errors hasn't been tested with this version, but the attached patch seemed to work fine (few people in IRC tested it). So I'd assume the modifications didn't break anything.

--RZ

Changed in linuxdcpp:
status: Confirmed → Fix Committed
Revision history for this message
Steven Sheehy (steven-sheehy) wrote :

I'm working on porting it to latest dcpp trunk and the latest trunk handles exiting from fastHash differently than our version. Can you take a look at my attached patch and see if it looks ok? It also includes an extra call to munmap since memory may potentially be leaking in certain scenarios (like the final read when size_left <= 0). Also, shouldn't sigsetjmp be called before mmap since mmap could potentially throw a SIGBUS before it gets to sigsetjmp?

Revision history for this message
Razzloss (razzloss) wrote :

The patch looks ok.

Plain mmap call shouldn't cause SIGBUS, it should return an error if the file can't be mapped.

I was going to say that we could move the sigsetjmp to before the loop even begins (just to be safe and save some cpu cycles), but it probably should stay where it is. Sigsetjmp man page states that the stack context is saved at the time of the call. That includes the buf-variable, which we cannot unmap if context is set before mmap.

--RZ

Changed in linuxdcpp:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.