Comment 17 for bug 1999816

Andreas Hasenack (ahasenack) wrote :

> So, testing for an :unknown value has no effect. I could not find the exact code that explains this difference.

Do you mean that, by keeping the upstream 3.8.x patch, which uses "unknown", rabbitmqctl status would still crash in the end?

Here is the current chain of calls as I see it, please correct me if I'm wrong:

Without the fix:
- app APP1 calls "rabbitmqctl status", to get status
- rabbitmqctl status triggers some events that eventually call the `df` tool, which hangs or fails in some way; rabbitmqctl gets something that is not a number, and crashes
- APP1 notices that rabbitmqctl failed, reports that in some way, or even crashes itself

With the fix ("unknown"):
- app APP1 calls "rabbitmqctl status", to get status
- rabbitmqctl goes all the way down to getting the `df` output, which fails in the same way and is reported as "unknown"; now, instead of crashing, rabbitmqctl just propagates that "unknown" value as the disk space
- APP1 gets status output, tries to check disk space, and now:
  - maybe it knows how to handle the fact that "unknown" is not a number, and behaves well
  - maybe it tries to parse "unknown" or "undefined" as a number, and crashes
  - maybe it knows how to handle "unknown", but gets "undefined" instead, and crashes
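
To make the three outcomes above concrete, here is a minimal sketch of how a caller like APP1 could handle the disk space value defensively. It is not taken from any real application: the function name `parse_disk_free` and the idea that the value arrives as an integer or a string are assumptions for illustration only, and the actual shape of the status output may differ.

```python
from typing import Optional


def parse_disk_free(raw: object) -> Optional[int]:
    """Hypothetical helper: return free disk space in bytes, or None.

    None means "no usable number", which covers placeholder values
    such as "unknown" or "undefined" as well as a missing value.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool):
        return None
    if isinstance(raw, int):
        return raw if raw >= 0 else None
    if isinstance(raw, str):
        try:
            value = int(raw.strip())
        except ValueError:
            # Not a number, e.g. "unknown" or "undefined".
            return None
        return value if value >= 0 else None
    return None


# Illustrative inputs only, not real rabbitmqctl output.
for sample in (50000000, "50000000", "unknown", "undefined", None):
    print(repr(sample), "->", parse_disk_free(sample))
```

A caller written this way falls into the first case (it behaves well); a caller that passes the value straight to a number parser without the guard falls into one of the other two.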

I understand that the fix stops rabbitmqctl status from crashing, but that just makes it propagate the value that originally made it crash to its caller (and there isn't really much else it can do). Do we know of any application like APP1 in the above example? Is that something we could test? Or should we wait and see whether something else (APP1) now starts crashing, fix that, and so on?