Activity log for bug #1833400

Date Who What changed Old value New value Message
2019-06-19 09:47:29 Heikki Hannikainen bug added bug
2019-06-20 12:46:02 Paride Legovini bug added subscriber Paride Legovini
2019-06-20 12:55:56 Paride Legovini bind9 (Ubuntu): status New Incomplete
2019-06-20 13:34:00 Paride Legovini bug added subscriber Ubuntu Server
2019-08-02 08:03:49 Christian Ehrhardt  bug watch added https://bugs.isc.org/Public/Ticket/Display.html?id=43822
2019-08-02 08:03:54 Christian Ehrhardt  bug added subscriber Christian Ehrhardt 
2019-08-02 08:04:02 Christian Ehrhardt  nominated for series Ubuntu Xenial
2019-08-02 08:04:02 Christian Ehrhardt  bug task added bind9 (Ubuntu Xenial)
2019-08-02 08:04:07 Christian Ehrhardt  bind9 (Ubuntu): status Incomplete Fix Released
2019-08-02 08:04:11 Christian Ehrhardt  bind9 (Ubuntu Xenial): status New Incomplete
2019-08-05 05:33:42 Launchpad Janitor merge proposal linked https://code.launchpad.net/~paelzer/ubuntu/+source/bind9/+git/bind9/+merge/370942
2019-08-05 05:35:50 Christian Ehrhardt  bind9 (Ubuntu Xenial): importance Undecided Low
2019-08-07 05:31:02 Christian Ehrhardt  bind9 (Ubuntu Xenial): importance Low High
2019-08-07 05:37:59 Christian Ehrhardt  description
Old value:
Ubuntu xenial 16.04, bind9 1:9.10.3.dfsg.P4-8ubuntu1.14

Yesterday the named process started crashing frequently, 49 crashes so far on 49 different servers around the world (one crash each!). We did run OS upgrades yesterday, but bind9 packages were not updated at this time. This particular bind9 package version was mostly deployed out last month.

Due to the sudden surge of crashes and the distribution I'm suspecting this might be triggered remotely by an incoming packet.

Backtrace from the assert:

2019-06-18T21:42:16.801421+00:00 hostname named[888]: general: critical: ../../../lib/dns/dispatch.c:3691: REQUIRE((disp->attributes & 0x00000020U) != 0) failed, back trace
2019-06-18T21:42:16.801890+00:00 hostname named[888]: general: critical: #0 0x555c41aeeaf0 in ??
2019-06-18T21:42:16.802118+00:00 hostname named[888]: general: critical: #1 0x7f475bd66eaa in ??
2019-06-18T21:42:16.802315+00:00 hostname named[888]: general: critical: #2 0x7f475ca9f7da in ??
2019-06-18T21:42:16.802496+00:00 hostname named[888]: general: critical: #3 0x555c41ae3195 in ??
2019-06-18T21:42:16.802684+00:00 hostname named[888]: general: critical: #4 0x7f475bd8b420 in ??
2019-06-18T21:42:16.802875+00:00 hostname named[888]: general: critical: #5 0x7f475b7346ba in ??
2019-06-18T21:42:16.803056+00:00 hostname named[888]: general: critical: #6 0x7f475ae7e41d in ??
2019-06-18T21:42:16.803245+00:00 hostname named[888]: general: critical: exiting (due to assertion failure)

New value:
[Impact]

 * A race in the handling of the dispatcher can trigger a crash. The reason is an assertion of a case that can actually happen (rarely but it can)
 * The fix is very small and essentially converts the assert into an early return here a quote of the added comment:
     If the attribute DNS_DISPATCHATTR_NOLISTEN is not set, then the dispatch is already handling a recv; return immediately.

[Test Case]

 * That is the hardest part on this SRU, this is a race and neither in the upstream bug [1] nor here someone was able to come up with clear repro steps. I'm afraid we might just review code and probably keep it in proposed some extra time?

[Regression Potential]

 * The change is minimal and upstream (as well as in Ubuntu releases) for quite some time now. So I'm confident it isn't entirely broken. The old code was preventing an odd condition to happen, the new code still does only instead of an aborting assert it now is an early return. The regressions I could think of are only theoretical - like someone having a test for this and now wondering it works - not really an issue. No really the only issue I can think of is if that early return on the return path would trigger a bug as it e.g. can't handle the returned null properly. But TBH that would replace one crash (the current one) with another one, so it isn't that bad.

[Other Info]

 * n/a

---

Ubuntu xenial 16.04, bind9 1:9.10.3.dfsg.P4-8ubuntu1.14

Yesterday the named process started crashing frequently, 49 crashes so far on 49 different servers around the world (one crash each!). We did run OS upgrades yesterday, but bind9 packages were not updated at this time. This particular bind9 package version was mostly deployed out last month.

Due to the sudden surge of crashes and the distribution I'm suspecting this might be triggered remotely by an incoming packet.

Backtrace from the assert:

2019-06-18T21:42:16.801421+00:00 hostname named[888]: general: critical: ../../../lib/dns/dispatch.c:3691: REQUIRE((disp->attributes & 0x00000020U) != 0) failed, back trace
2019-06-18T21:42:16.801890+00:00 hostname named[888]: general: critical: #0 0x555c41aeeaf0 in ??
2019-06-18T21:42:16.802118+00:00 hostname named[888]: general: critical: #1 0x7f475bd66eaa in ??
2019-06-18T21:42:16.802315+00:00 hostname named[888]: general: critical: #2 0x7f475ca9f7da in ??
2019-06-18T21:42:16.802496+00:00 hostname named[888]: general: critical: #3 0x555c41ae3195 in ??
2019-06-18T21:42:16.802684+00:00 hostname named[888]: general: critical: #4 0x7f475bd8b420 in ??
2019-06-18T21:42:16.802875+00:00 hostname named[888]: general: critical: #5 0x7f475b7346ba in ??
2019-06-18T21:42:16.803056+00:00 hostname named[888]: general: critical: #6 0x7f475ae7e41d in ??
2019-06-18T21:42:16.803245+00:00 hostname named[888]: general: critical: exiting (due to assertion failure)
2019-08-07 05:42:43 Christian Ehrhardt  description
Old value:
[Impact]

 * A race in the handling of the dispatcher can trigger a crash. The reason is an assertion of a case that can actually happen (rarely but it can)
 * The fix is very small and essentially converts the assert into an early return here a quote of the added comment:
     If the attribute DNS_DISPATCHATTR_NOLISTEN is not set, then the dispatch is already handling a recv; return immediately.

[Test Case]

 * That is the hardest part on this SRU, this is a race and neither in the upstream bug [1] nor here someone was able to come up with clear repro steps. I'm afraid we might just review code and probably keep it in proposed some extra time?

[Regression Potential]

 * The change is minimal and upstream (as well as in Ubuntu releases) for quite some time now. So I'm confident it isn't entirely broken. The old code was preventing an odd condition to happen, the new code still does only instead of an aborting assert it now is an early return. The regressions I could think of are only theoretical - like someone having a test for this and now wondering it works - not really an issue. No really the only issue I can think of is if that early return on the return path would trigger a bug as it e.g. can't handle the returned null properly. But TBH that would replace one crash (the current one) with another one, so it isn't that bad.

[Other Info]

 * n/a

---

Ubuntu xenial 16.04, bind9 1:9.10.3.dfsg.P4-8ubuntu1.14

Yesterday the named process started crashing frequently, 49 crashes so far on 49 different servers around the world (one crash each!). We did run OS upgrades yesterday, but bind9 packages were not updated at this time. This particular bind9 package version was mostly deployed out last month.

Due to the sudden surge of crashes and the distribution I'm suspecting this might be triggered remotely by an incoming packet.

Backtrace from the assert:

2019-06-18T21:42:16.801421+00:00 hostname named[888]: general: critical: ../../../lib/dns/dispatch.c:3691: REQUIRE((disp->attributes & 0x00000020U) != 0) failed, back trace
2019-06-18T21:42:16.801890+00:00 hostname named[888]: general: critical: #0 0x555c41aeeaf0 in ??
2019-06-18T21:42:16.802118+00:00 hostname named[888]: general: critical: #1 0x7f475bd66eaa in ??
2019-06-18T21:42:16.802315+00:00 hostname named[888]: general: critical: #2 0x7f475ca9f7da in ??
2019-06-18T21:42:16.802496+00:00 hostname named[888]: general: critical: #3 0x555c41ae3195 in ??
2019-06-18T21:42:16.802684+00:00 hostname named[888]: general: critical: #4 0x7f475bd8b420 in ??
2019-06-18T21:42:16.802875+00:00 hostname named[888]: general: critical: #5 0x7f475b7346ba in ??
2019-06-18T21:42:16.803056+00:00 hostname named[888]: general: critical: #6 0x7f475ae7e41d in ??
2019-06-18T21:42:16.803245+00:00 hostname named[888]: general: critical: exiting (due to assertion failure)

New value:
[Impact]

 * A race in the handling of the dispatcher can trigger a crash.
   The reason is an assertion of a case that can actually happen (rarely
   but it can)
 * The fix is very small and essentially converts the assert into an early
   return here a quote of the added comment:
     If the attribute DNS_DISPATCHATTR_NOLISTEN is not set, then
     the dispatch is already handling a recv; return immediately.

[Test Case]

 * That is the hardest part on this SRU, this is a race and neither in the
   upstream bug [1] nor here someone was able to come up with clear repro
   steps. I'm afraid we might just review code and probably keep it in
   proposed some extra time?

[Regression Potential]

 * The change is minimal and upstream (as well as in Ubuntu releases) for
   quite some time now. So I'm confident it isn't entirely broken.
   The old code was preventing an odd condition to happen, the new code
   still does only instead of an aborting assert it now is an early
   return.
   The regressions I could think of are only theoretical - like someone
   having a test for this and now wondering it works - not really an
   issue. No really the only issue I can think of is if that early return
   on the return path would trigger a bug as it e.g. can't handle the
   returned null properly. But TBH that would replace one crash (the
   current one) with another one, so it isn't that bad.

[Other Info]

 * This isn't very frequent at least to the crash DB [2] (others are :-/) but at least this one has a clearly outlined solution.

[1]: https://bugs.isc.org/Public/Bug/Display.html?id=43822
[2]: https://errors.ubuntu.com/?release=Ubuntu%2016.04&package=bind9&from=2016-01-01&to=2019-07-31

---

Ubuntu xenial 16.04, bind9 1:9.10.3.dfsg.P4-8ubuntu1.14

Yesterday the named process started crashing frequently, 49 crashes so far on 49 different servers around the world (one crash each!). We did run OS upgrades yesterday, but bind9 packages were not updated at this time. This particular bind9 package version was mostly deployed out last month.

Due to the sudden surge of crashes and the distribution I'm suspecting this might be triggered remotely by an incoming packet.

Backtrace from the assert:

2019-06-18T21:42:16.801421+00:00 hostname named[888]: general: critical: ../../../lib/dns/dispatch.c:3691: REQUIRE((disp->attributes & 0x00000020U) != 0) failed, back trace
2019-06-18T21:42:16.801890+00:00 hostname named[888]: general: critical: #0 0x555c41aeeaf0 in ??
2019-06-18T21:42:16.802118+00:00 hostname named[888]: general: critical: #1 0x7f475bd66eaa in ??
2019-06-18T21:42:16.802315+00:00 hostname named[888]: general: critical: #2 0x7f475ca9f7da in ??
2019-06-18T21:42:16.802496+00:00 hostname named[888]: general: critical: #3 0x555c41ae3195 in ??
2019-06-18T21:42:16.802684+00:00 hostname named[888]: general: critical: #4 0x7f475bd8b420 in ??
2019-06-18T21:42:16.802875+00:00 hostname named[888]: general: critical: #5 0x7f475b7346ba in ??
2019-06-18T21:42:16.803056+00:00 hostname named[888]: general: critical: #6 0x7f475ae7e41d in ??
2019-06-18T21:42:16.803245+00:00 hostname named[888]: general: critical: exiting (due to assertion failure)
2019-08-07 05:59:24 Christian Ehrhardt  bind9 (Ubuntu Xenial): status Incomplete Triaged
2019-08-07 14:45:50 Launchpad Janitor merge proposal linked https://code.launchpad.net/~paelzer/ubuntu/+source/bind9/+git/bind9/+merge/371043
2019-08-08 05:55:01 Christian Ehrhardt  merge proposal unlinked https://code.launchpad.net/~paelzer/ubuntu/+source/bind9/+git/bind9/+merge/371043
2019-08-13 18:09:40 Brian Murray bind9 (Ubuntu Xenial): status Triaged Fix Committed
2019-08-13 18:09:42 Brian Murray bug added subscriber Ubuntu Stable Release Updates Team
2019-08-13 18:09:46 Brian Murray bug added subscriber SRU Verification
2019-08-13 18:09:50 Brian Murray tags verification-needed verification-needed-xenial
2019-08-15 06:10:20 Christian Ehrhardt  tags verification-needed verification-needed-xenial verification-done verification-done-xenial
2019-09-02 09:46:26 Łukasz Zemczak removed subscriber Ubuntu Stable Release Updates Team
2019-09-02 09:46:24 Launchpad Janitor bind9 (Ubuntu Xenial): status Fix Committed Fix Released
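
Illustration of the fix described in the [Impact] text of the description entries above. This is a minimal sketch of the "assert converted into an early return" pattern, not the actual BIND patch. The struct, both function names, the recv_pending counter and the flag-clearing step are hypothetical. Only the 0x00000020U bit test comes from the logged REQUIRE, and the flag name is assumed from the quoted comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit tested by the failed REQUIRE in the log; the name is assumed from
 * the quoted comment (DNS_DISPATCHATTR_NOLISTEN). */
#define DISPATCHATTR_NOLISTEN 0x00000020U

/* Hypothetical stand-in for a dispatch object; not the BIND structure. */
struct dispatch {
    unsigned int attributes;
    int recv_pending; /* illustrative counter of receives started */
};

/* Old-style behaviour: treat "flag not set" as impossible and abort. */
void start_recv_strict(struct dispatch *disp) {
    assert((disp->attributes & DISPATCHATTR_NOLISTEN) != 0);
    disp->attributes &= ~DISPATCHATTR_NOLISTEN; /* assumption: now listening */
    disp->recv_pending++;
}

/* New-style behaviour: if the flag is not set, a receive is already being
 * handled, so return immediately instead of aborting the process. */
bool start_recv_tolerant(struct dispatch *disp) {
    if ((disp->attributes & DISPATCHATTR_NOLISTEN) == 0) {
        return false; /* nothing to do; another path won the race */
    }
    disp->attributes &= ~DISPATCHATTR_NOLISTEN; /* assumption: now listening */
    disp->recv_pending++;
    return true;
}
```

In this sketch a second call to start_recv_tolerant on the same dispatch returns false and leaves the state untouched, where start_recv_strict would abort. That is the trade described under [Regression Potential]: callers must cope with the early return.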