Occasional Assertion Failure causing crash in dispose_chunk in chunk.c

Bug #1168593 reported by Sean Young
This bug affects 1 person
Affects          Status        Importance  Assigned to  Milestone
polipo (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

Ubuntu version:
Description: Ubuntu 12.04.2 LTS
Release: 12.04

Polipo version 1.0.4.1-1.1
With index fix patch in bug #988780
Compiled with -g and -rdynamic

Well, like many others, I got the (or rather, a) random crash bug.
So, I went ahead and compiled it with extra debug info.

I think I need to compile it with -g3 as there are no line numbers in the stack trace, but hopefully this is enough to track down a little further.
I know that dispose_chunk is in chunk.c from previous experience.

The only assertion in dispose_chunk (both of them) seems to be a null pointer check. Somehow, the caller of this function is sometimes erroneously giving it a null pointer.

A potential, short-term workaround would be to have it simply ignore the free chunk request (by returning early), and maybe log a warning, if a null is passed in, instead of crashing.
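The workaround described above could look roughly like the following. This is only a sketch: dispose_chunk_lenient and ignored_null_disposals are hypothetical names, and free() stands in for whatever the real dispose_chunk in chunk.c does to release a chunk.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter so the ignored requests can be inspected later. */
int ignored_null_disposals = 0;

/* Hypothetical stand-in for dispose_chunk(): instead of asserting on a
   NULL chunk, log a warning and return early.
   Returns 1 if the chunk was released, 0 if a NULL request was ignored. */
int dispose_chunk_lenient(void *chunk)
{
    if (chunk == NULL) {
        ignored_null_disposals++;
        fprintf(stderr, "warning: dispose_chunk called with NULL, ignoring\n");
        return 0;
    }
    free(chunk);  /* assumption: the real code releases the chunk differently */
    return 1;
}
```

This only hides the symptom. The caller that passes the null pointer would still need to be found.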

For some reason, apport is not letting me upload the report, so I am just going to attach the output from the crash report as given by apport-cli.

There seems to be another form of assertion failure as well, so I think this is not the only thing causing the random crashes.


Sean Young (techsy730) wrote :

I will also try to see if I can upload a patch implementing the suggested workaround for chunk.c, until the true reason for the bug is found.

Sean Young (techsy730) wrote :

Also, when you are not compiling with debug flags, are the asserts disabled? If so, the dereference of that null pointer could explain why some people see segfaults instead of assertion failures.
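For reference, standard C disables assert only when the NDEBUG macro is defined at the point where <assert.h> is included. It is independent of -g. Whether a given polipo build defines NDEBUG is not stated in this report. A small sketch of the mechanism (the function names are made up for illustration):

```c
#include <stddef.h>

#define NDEBUG            /* same effect as passing -DNDEBUG to the compiler */
#include <assert.h>

/* With NDEBUG defined, assert(...) expands to ((void)0), so the
   expression inside it is never evaluated. Returns 0. */
int assert_runs_with_ndebug(void)
{
    int evaluated = 0;
    assert((evaluated = 1));
    return evaluated;
}

#undef NDEBUG
#include <assert.h>       /* re-including redefines assert for the new state */

/* Without NDEBUG, the assert expression is evaluated. Returns 1. */
int assert_runs_without_ndebug(void)
{
    int evaluated = 0;
    assert((evaluated = 1));
    return evaluated;
}
```

So in a build with NDEBUG defined, a null check done through assert disappears, and a later dereference of the null pointer would show up as a segfault rather than an assertion failure.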

Sean Young (techsy730) wrote :

Alright, I recompiled with libSegFault as one of the included libraries (forcing better line numbers in dumps), and got the line number of the assertion failure: chunk.c:398:dispose_chunk.

Line numbers for the whole stack still aren't showing up (ugh), but I can see that the failure happens when dispose_chunk is called from the function httpServerHandler(), which in turn is called by do_scheduled_stream().

Rolf Leggewie (r0lf) wrote :

For those of you who would like to test my WIP packages before I upload them to Debian, you can get them for precise (and later) from my stable PPA: https://launchpad.net/~r0lf/+archive/stable

Please kindly let me know if you can reproduce the issue with the latest package.

Changed in polipo (Ubuntu):
status: New → Incomplete

Sean Young (techsy730) wrote :

Well, the crashes are now much, MUCH rarer, but I just got one (the first since I got Rolf's branch of it).

Sadly, as it was compiled without symbol tables, the stack trace was pretty much useless. Should I rebuild with symbols so I can get a useful stack trace?

Rolf Leggewie (r0lf) wrote :

Sean, thank you for reporting back.

If you can help me compile the package with symbol tables for debugging, I'll gladly upload that (and the latest fixes).

Rolf Leggewie (r0lf) wrote :

ping

Changed in polipo (Ubuntu):
status: Incomplete → Confirmed

Rolf Leggewie (r0lf) wrote :

I'm currently packaging the latest upstream release and I would love to incorporate your work on packaging the debugging information. I'd appreciate your input.

Sean Young (techsy730) wrote :

Sorry for the delayed response.

It's been a while since I did this, but the way I debugged it was by adding -g -rdynamic to the build options to ensure that GCC would keep the symbols needed for a backtrace, as well as putting nostrip in DEB_BUILD_OPTIONS so that Debian's build scripts wouldn't add any symbol-stripping options.

Also, I think I played around with libSegFault (like libSegFault.so in the LD_PRELOAD), but I can't remember if that helped any.
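To illustrate why -rdynamic matters here: glibc's backtrace facility resolves function names from the dynamic symbol table, and -rdynamic asks the linker to export the program's own symbols there. Below is a minimal sketch, assuming a glibc system. dump_stack_to_stderr is a hypothetical helper and not something taken from polipo.

```c
#include <assert.h>
#include <execinfo.h>   /* glibc-specific: backtrace(), backtrace_symbols_fd() */
#include <unistd.h>     /* STDERR_FILENO */

#define MAX_FRAMES 32

/* Hypothetical helper: write the current call stack to stderr and
   return the number of frames captured. */
int dump_stack_to_stderr(void)
{
    void *frames[MAX_FRAMES];
    int depth = backtrace(frames, MAX_FRAMES);

    /* Writes one line per frame. Function names appear only for symbols
       present in the dynamic symbol table, hence -rdynamic at link time. */
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);
    return depth;
}
```

Without -rdynamic the frames for the program's own functions are typically printed as bare addresses, and static functions stay unnamed either way.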

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package polipo - 1.1.1-1

---------------
polipo (1.1.1-1) unstable; urgency=medium

  * release newest upstream version 1.1.1. Closes: #746750
    - use regerror to improve error reporting. Closes: #585766
    - correct return value from processObject in error case. LP: #988780
    - resume interrupted file download properly. Closes: #517479
    - vastly improve stability. LP: #1055418, LP: #988780, LP: #1168593
      Closes: #684742, Closes: #307801, Closes: #432561, Closes: #646709
    - handle non-existent host properly. Closes: #655851
    - Don't revalidate objects >= CHUNK_SIZE that aren't in the disk cache.
      Closes: #610199
  * drop patches included upstream and rebase the rest
  * update watch file

 -- Rolf Leggewie <email address hidden> Fri, 06 Jun 2014 15:29:19 +0200

Changed in polipo (Ubuntu):
status: Confirmed → Fix Released