Comment 1 for bug 928327

Deryck Hodge (deryck) wrote:

Here's lifeless summarizing in IRC:

<lifeless> one core has damaged (I suspect killed but not joined()) threads including a missing mainloop. The missing mainloop would on its own make it appear dead to haproxy.
<lifeless> It is in gc in another thread; one possible theory is it got too big memory wise and what we are looking at is damaged fallout from some attempt to recover it
<lifeless> the other core appears entirely healthy except for the oddness that stuff is stuck in send(); but that is normal if the OS buffer is full, which will happen if the internets are not brilliantly happy (because buffering affects the entire chain)
<lifeless> so we need to know for the first one, as much as we can about how it got to that state - were any sysadmin interventions applied first? (if so, the core doesn't represent the failure, it represents the failure + mangling)
<lifeless> for the second, we need to know the symptoms that were being reported

I'll attach a complete transcript from IRC for those curious and/or working on this.
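
For anyone unfamiliar with the send() behaviour lifeless mentions for the second core, here is a minimal sketch of it in Python. It is not taken from the appserver code: the function name `fill_send_buffer` and its parameters are made up for illustration, and it uses a local `socket.socketpair()` rather than a real client connection. The peer never reads, so the kernel buffer fills up. A non-blocking socket then raises `BlockingIOError`, while a blocking socket would simply sit in send() at that point, which is what a stack trace or core would show.

```python
import socket


def fill_send_buffer(chunk_size=65536, max_bytes=1 << 30):
    """Write to a socket whose peer never reads until the OS buffer is full.

    Hypothetical helper for illustration only.
    Returns (bytes_accepted, would_block).
    """
    sender, receiver = socket.socketpair()
    try:
        # Non-blocking, so a full buffer raises instead of hanging this demo.
        sender.setblocking(False)
        chunk = b"x" * chunk_size
        accepted = 0
        would_block = False
        while accepted < max_bytes:
            try:
                # send() may accept fewer bytes than offered; count what it took.
                accepted += sender.send(chunk)
            except BlockingIOError:
                # Kernel buffer is full: a blocking socket would be stuck here.
                would_block = True
                break
        return accepted, would_block
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    accepted, would_block = fill_send_buffer()
    print("bytes accepted before the buffer filled:", accepted)
    print("send() would have blocked:", would_block)
```

The number of bytes accepted depends on the OS and its socket buffer settings, so it will vary between machines.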