[SRU] Multiple intermittent socket failures during name resolutions

Bug #1804542 reported by Tateru Nino
This bug affects 7 people
 Affects | Status | Importance | Assigned to | Milestone
 BIND | Fix Released | Undecided | Unassigned |
 bind9 (Ubuntu) | Fix Released | Medium | Unassigned |
 Bionic | Fix Released | Medium | Lucas Kanashiro |

Bug Description

[Impact]
Socket failures due to uninitialized memory during name resolution.

[Test Case]
There is no known self-contained test case for this bug. However, based on feedback from an affected user [1][2], the proposed patch appears to fix the bug.

[1] https://bugs.launchpad.net/ubuntu/+source/bind9/+bug/1804542/comments/12
[2] https://bugs.launchpad.net/ubuntu/+source/bind9/+bug/1804542/comments/13

[Regression Potential]
Things to watch for include memory-related issues and faults relating to long names or addresses. The patch changes memory management for a data buffer used in socket communication. There could be behavioral differences between the handling of IPv4 and IPv6 addresses.

[Fix]
See comment #7
This will need to be backported for bionic. Other releases either don't need it or already have it:

 bind9 | 1:9.8.1.dfsg.P1-4ubuntu0.22 | precise-updates | ×
 bind9 | 1:9.9.5.dfsg-3ubuntu0.19 | trusty-updates | ×
 bind9 | 1:9.10.3.dfsg.P4-8ubuntu1.14 | xenial-updates | ×
 bind9 | 1:9.11.3+dfsg-1ubuntu1.7 | bionic-updates | [ ] Needs fix
 bind9 | 1:9.11.4+dfsg-3ubuntu5.3 | cosmic-updates | √ has fix
 bind9 | 1:9.11.5.P1+dfsg-1ubuntu2.4 | disco-updates | √ has fix
 bind9 | 1:9.11.5.P4+dfsg-4ubuntu1 | eoan | √ has fix

[Discussion]

[Original Report]
Multiple instances like:

Nov 22 09:36:09 mound named[1510]: ../../../../lib/isc/unix/socket.c:2135: unexpected error:
Nov 22 09:36:09 mound named[1510]: internal_send: 10.0.2.11#54580: Invalid argument
Nov 22 09:36:09 mound named[1510]: client @0x7fba00513820 10.0.2.11#54580 (brave-sync.s3.dualstack.us-west-2.amazonaws.com): error sending response: invalid file

Nov 22 09:51:35 mound named[1510]: ../../../../lib/isc/unix/socket.c:2135: unexpected error:
Nov 22 09:51:35 mound named[1510]: internal_send: ************:88fd:b31b:3e3:7082#60042: Invalid argument
Nov 22 09:51:35 mound named[1510]: client @0x7fb9f8117180 ***********:88fd:b31b:3e3:7082#60042 (cdn.onenote.net): error sending response: invalid file

[public ipv6 address partially elided for privacy]

Nov 22 05:58:24 mound named[1510]: ../../../../lib/isc/unix/socket.c:2135: unexpected error:
Nov 22 05:58:24 mound named[1510]: internal_send: 10.0.2.11#63851: Invalid argument
Nov 22 05:58:24 mound named[1510]: client @0x7fba000c7690 10.0.2.11#63851 (discordapp.com): error sending response: invalid file

Not sure if these represent some genuine failure delivering to the client, or if it is an artifact of some normal condition being overreported.
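
As a rough illustration (not part of the original report), the failure signature in the log excerpts above can be pulled out of a log with a short script. This is a minimal sketch: the function name `find_send_errors` and the regular expression are assumptions made for this example, and the pattern is derived only from the log lines quoted in this report.

```python
import re

# Pattern based on the "internal_send: <client>#<port>: Invalid argument"
# lines quoted in this report; other BIND versions may log differently.
SEND_ERROR = re.compile(
    r"internal_send: (?P<client>\S+)#(?P<port>\d+): Invalid argument"
)


def find_send_errors(lines):
    """Return a (client, port) pair for each internal_send failure line."""
    hits = []
    for line in lines:
        match = SEND_ERROR.search(line)
        if match:
            hits.append((match.group("client"), int(match.group("port"))))
    return hits


sample = [
    "Nov 22 09:36:09 mound named[1510]: ../../../../lib/isc/unix/socket.c:2135: unexpected error:",
    "Nov 22 09:36:09 mound named[1510]: internal_send: 10.0.2.11#54580: Invalid argument",
    "Nov 22 09:36:09 mound named[1510]: client @0x7fba00513820 10.0.2.11#54580 (brave-sync.s3.dualstack.us-west-2.amazonaws.com): error sending response: invalid file",
]
print(find_send_errors(sample))
```

Only the middle line of each three-line group is matched, so each failure is counted once.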

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: bind9 1:9.11.3+dfsg-1ubuntu1.3
ProcVersionSignature: Ubuntu 4.15.0-39.42-generic 4.15.18
Uname: Linux 4.15.0-39-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.5
Architecture: amd64
Date: Thu Nov 22 10:31:02 2018
InstallationDate: Installed on 2017-07-18 (491 days ago)
InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Release amd64 (20151021)
SourcePackage: bind9
UpgradeStatus: Upgraded to bionic on 2018-11-15 (6 days ago)
modified.conffile..etc.bind.named.conf.local: [modified]
modified.conffile..etc.bind.named.conf.options: [modified]
mtime.conffile..etc.bind.named.conf.local: 2018-04-18T18:22:39.428390
mtime.conffile..etc.bind.named.conf.options: 2018-11-21T21:03:19.292677

Revision history for this message
Tateru Nino (tateru-nino) wrote :
Revision history for this message
Robie Basak (racb) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better.

Since there isn't enough information in your report to differentiate between a local configuration problem and a bug in Ubuntu, I'm marking this bug as Incomplete.

If indeed this is a local configuration problem, you can find pointers to get help for this sort of problem here: http://www.ubuntu.com/support/community

Or if you believe that this is really a bug, then you may find it helpful to read "How to report bugs effectively" http://www.chiark.greenend.org.uk/~sgtatham/bugs.html. We'd be grateful if you could then provide steps to reproduce the problem or otherwise explain why you believe this is a bug in Ubuntu rather than a problem specific to your system or network configuration, and then change the bug status back to New.

Changed in bind9 (Ubuntu):
status: New → Incomplete
Revision history for this message
Jon Schewe (jpschewe) wrote :

I too am seeing this problem. This is on a computer that I'm using as a router using NAT. I have a dual stack IPv6 setup. My ISP gives me an IPv6 subnet that I use for my internal network. I have attached my configuration files.

I received no errors with this configuration under Ubuntu 16.04; however, I do see these errors under Ubuntu 18.04. So I believe it's a change in the bind daemon, or perhaps a configuration option that I need to add.

I have tried turning on full debugging and it hasn't been terribly helpful. I can do that again if you have particular debugging options that should be enabled.

What other information can I provide to aid in debugging this?

Revision history for this message
Roberto S. Galende (roberto.s.galende) wrote :

I also have this problem:

23-Jan-2019 10:20:25.869 general: error: ../../../../lib/isc/unix/socket.c:2135: unexpected error:
23-Jan-2019 10:20:25.869 general: error: internal_send: ***.***.***.***#52218: Invalid argument
23-Jan-2019 10:20:25.869 client: warning: client @0x7f08800da560 ***.***.***.***#52218 (d3dsirqaj330mb.cloudfront.net): error sending response: invalid file

Please note that this is a known BIND bug, and I think the fix should probably make its way into the official Ubuntu BIND release:
https://gitlab.isc.org/isc-projects/bind9/issues/180

The main problem is that the patch must be backported to BIND 9.11, because it has been marked as resolved only in BIND 9.13.2:
https://gitlab.isc.org/isc-projects/bind9/milestones/4

Is this patch feasible for the official BIND of the 18.04.1 release?

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Thanks for the pointers. I agree we should try to backport this.

Changed in bind9 (Ubuntu):
status: Incomplete → Triaged
importance: Undecided → Medium
tags: added: server-next
Bryce Harrington (bryce)
summary: - Multiple intermittent socket failures during name resolutions
+ [SRU] Multiple intermittent socket failures during name resolutions
Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Looks like the patch that was applied is https://gitlab.isc.org/isc-projects/bind9/merge_requests/409, but it doesn't apply cleanly, so it's not an immediate candidate for the server-next queue.

tags: removed: server-next
Revision history for this message
Bryce Harrington (bryce) wrote :
Revision history for this message
Bryce Harrington (bryce) wrote :

The upstream commit message is vague in explaining what was going on, but I'm gathering there was uninitialized data in a variable that would trigger the error message? The fix appears to be switching to a static buffer and initializing it to {0}.

description: updated
tags: added: patch
Bryce Harrington (bryce)
Changed in bind9 (Ubuntu Bionic):
status: New → Triaged
importance: Undecided → Medium
Bryce Harrington (bryce)
description: updated
Bryce Harrington (bryce)
Changed in bind9 (Ubuntu):
status: Triaged → Fix Released
Changed in bind:
status: New → Fix Released
Revision history for this message
Bryce Harrington (bryce) wrote :

The originally flagged patch was [2/2] of a 2-patch set (AIUI), and the second doesn't make much sense to include without the first. The two patches also had a small dependence on one piece of an earlier third patch that was a smorgasbord of valgrind memcheck fixes. Applying these three commits (and omitting their modifications to CHANGES or other files) goes cleanly:

trent:~/ubuntu/Bind9/sru.1804542/bind9-gu$ patch -p1 < ../bind9/A-memset-the-remainder-of-sendcmsgbuf-to-0-in-a-attemp.patch
patching file lib/isc/unix/socket.c
Hunk #1 succeeded at 1529 (offset 4 lines).
Hunk #2 succeeded at 1550 (offset 4 lines).
Hunk #3 succeeded at 1636 (offset 4 lines).
trent:~/ubuntu/Bind9/sru.1804542/bind9-gu$ patch -p1 < ../bind9/B-Fix-socket-cmsg-buffer-usage.patch
patching file lib/isc/unix/socket.c
Hunk #1 succeeded at 373 (offset -2 lines).
Hunk #2 succeeded at 483 (offset -2 lines).
Hunk #3 succeeded at 1434 (offset -7 lines).
Hunk #4 succeeded at 1447 (offset -7 lines).
Hunk #5 succeeded at 1528 (offset -7 lines).
Hunk #6 succeeded at 1547 (offset -7 lines).
Hunk #7 succeeded at 1576 (offset -8 lines).
Hunk #8 succeeded at 1607 (offset -8 lines).
Hunk #9 succeeded at 1638 (offset -8 lines).
Hunk #10 succeeded at 1665 (offset -9 lines).
Hunk #11 succeeded at 1768 (offset -4 lines).
Hunk #12 succeeded at 1872 (offset -4 lines).
Hunk #13 succeeded at 2068 (offset -4 lines).
Hunk #14 succeeded at 2309 (offset -4 lines).
Hunk #15 succeeded at 2323 (offset -4 lines).
Hunk #16 succeeded at 2339 (offset -4 lines).
Hunk #17 succeeded at 2386 (offset -4 lines).
Hunk #18 succeeded at 2414 (offset -4 lines).
trent:~/ubuntu/Bind9/sru.1804542/bind9-gu$ patch -p1 < ../bind9/C-Use-completely-static-sized-buffers.patch
patching file lib/isc/unix/socket.c
Hunk #1 succeeded at 317 (offset -2 lines).
Hunk #2 succeeded at 402 (offset -2 lines).
Hunk #3 succeeded at 1474 (offset -7 lines).
Hunk #4 succeeded at 1552 (offset -7 lines).
Hunk #5 succeeded at 1571 (offset -7 lines).
Hunk #6 succeeded at 1602 (offset -8 lines).
Hunk #7 succeeded at 1633 (offset -8 lines).
Hunk #8 succeeded at 1658 with fuzz 1 (offset -8 lines).
Hunk #9 succeeded at 1791 (offset -4 lines).
Hunk #10 succeeded at 1894 (offset -4 lines).
Hunk #11 succeeded at 2090 (offset -4 lines).
Hunk #12 succeeded at 2310 (offset -4 lines).
Hunk #13 succeeded at 2330 (offset -4 lines).
Hunk #14 succeeded at 2775 (offset -4 lines).

The attached patch is what we should land for bionic to resolve this issue.
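
When reviewing a backport like this, hunks that applied with fuzz are usually the ones worth a manual look, since patch had to ignore some context lines to place them. The following is a minimal sketch that scans `patch` output of the form shown above and lists such hunks. The function name `fuzzy_hunks` is made up for this example, and the regular expression assumes the message format seen in this transcript.

```python
import re

# Matches lines such as:
#   Hunk #8 succeeded at 1658 with fuzz 1 (offset -8 lines).
#   Hunk #1 succeeded at 1529 (offset 4 lines).
HUNK = re.compile(
    r"^Hunk #(?P<num>\d+) succeeded at (?P<line>\d+)"
    r"(?: with fuzz (?P<fuzz>\d+))?"
    r"(?: \(offset (?P<offset>-?\d+) lines?\))?\.$"
)


def fuzzy_hunks(patch_output):
    """Return (file, hunk number, fuzz) for every hunk applied with fuzz."""
    current_file = None
    flagged = []
    for raw in patch_output.splitlines():
        line = raw.strip()
        if line.startswith("patching file "):
            current_file = line[len("patching file "):]
            continue
        match = HUNK.match(line)
        if match and match.group("fuzz"):
            flagged.append(
                (current_file, int(match.group("num")), int(match.group("fuzz")))
            )
    return flagged


transcript = """patching file lib/isc/unix/socket.c
Hunk #7 succeeded at 1633 (offset -8 lines).
Hunk #8 succeeded at 1658 with fuzz 1 (offset -8 lines).
Hunk #9 succeeded at 1791 (offset -4 lines).
"""
print(fuzzy_hunks(transcript))
```

Applied to the full output above, this would flag only hunk #8 of the third patch; the remaining hunks applied with line offsets but no fuzz.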

Bryce Harrington (bryce)
tags: added: server-next
Revision history for this message
Bryce Harrington (bryce) wrote :

Can someone help define steps to reproduce this issue? Ideally in a vanilla install or in a container.

The SRU process requires a test case for validating the change, that flags the error before the package installation, and checks the absence of the error after the package installation. So, the next step to move forward with this bug is to get such a test case in hand. Once we have something in hand, I think the SRU for this can proceed normally.
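
In the absence of a reproducer, the before/after check described above could be approximated from logs alone. The sketch below is an assumption-laden illustration, not an SRU tool: the function names and the three result strings are invented for this example, and it only looks for the error text quoted in this report.

```python
def has_send_error(lines):
    """True if any line carries the internal_send 'Invalid argument' error."""
    return any(
        "internal_send:" in line and "Invalid argument" in line for line in lines
    )


def sru_log_check(before_lines, after_lines):
    """Compare logs captured before and after installing the fixed package.

    Returns "passed" if the error was present before and is absent after,
    "failed" if it is still present after, and "inconclusive" if it was
    never observed before the upgrade (nothing to compare against).
    """
    if not has_send_error(before_lines):
        return "inconclusive"
    return "failed" if has_send_error(after_lines) else "passed"


before = [
    "Nov 22 09:36:09 mound named[1510]: internal_send: 10.0.2.11#54580: Invalid argument",
]
after = [
    "Nov 29 09:36:09 mound named[1510]: zone example/IN: loaded serial 1",
]
print(sru_log_check(before, after))
```

Because the error is intermittent, a "passed" result is only meaningful if both logs cover a similar period under similar load.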

Revision history for this message
Bryce Harrington (bryce) wrote :

I've packaged the patch from comment #9 into a PPA for testing purposes:

  https://launchpad.net/~bryce/+archive/ubuntu/bind9-sru-1804542

To test with this, first ensure you can reproduce the issue in a reliably consistent way, then install this package and run it for a similar period with similar conditions.

Revision history for this message
Roberto S. Galende (roberto.s.galende) wrote :

Thanks for the patch!
I've installed it on a server that showed the problem and the error hasn't appeared since.
In the next few days I'll apply it to another server that shows the problem and carries more traffic, and I'll let you know the results.

Revision history for this message
Roberto S. Galende (roberto.s.galende) wrote :

I've tested the patch on two servers with heavy DNS traffic that showed the problem, and it has completely disappeared after installation. I also haven't noticed any issue in the service after applying the patch.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Bryce, I think this can perhaps be proposed and uploaded, given the previous comments. For the actual [test] section of the sru, perhaps point at the comments above.

Changed in bind9 (Ubuntu Bionic):
assignee: nobody → Lucas Kanashiro (lucaskanashiro)
description: updated
Revision history for this message
Bryce Harrington (bryce) wrote :
Changed in bind9 (Ubuntu Bionic):
status: Triaged → In Progress
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Tateru, or anyone else affected,

Accepted bind9 into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/bind9/1:9.11.3+dfsg-1ubuntu1.10 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in bind9 (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Roberto, could you re-verify if the changes fix your bug by using the packages in bionic-proposed? Thanks!

Revision history for this message
Roberto S. Galende (roberto.s.galende) wrote :

Hi Łukasz,
I'm testing the proposed patch (https://launchpad.net/ubuntu/+source/bind9/1:9.11.3+dfsg-1ubuntu1.10/+build/17958751).

As with the previous patch by Bryce (https://launchpad.net/~bryce/+archive/ubuntu/bind9-sru-1804542), everything seems OK so far, with correct operation and no sign of the log messages that prompted this issue.

I'll give it some more days, and after that I'll apply the proposed patch on another server with even more traffic, and I'll let you know the conclusions.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Please remember that in order for this update to be released, the tags need to be updated, as explained in comment #16.

Thanks for testing!

Revision history for this message
Andrei Popa (andrei-popa) wrote :

Hi,

I had the same problem on bionic, and after installing the proposed version from this bug report the error no longer appears.

Revision history for this message
Juri Haberland (haberland) wrote :

Had this problem on one of my servers (the only one running bind on Bionic), installed this proposed-update package on Thursday morning, 31st, and have not seen this error since then. Searching through my logs I saw this error appearing at least every second day - often more than once a day - and have not seen it now for more than four days!

IMO this fixes the issue.

Revision history for this message
Roberto S. Galende (roberto.s.galende) wrote :

I've installed the patch on a second server with heavy traffic, and everything runs as flawlessly as with the previous patch. I think this patch completely fixes the issue and does not introduce any bad behaviour (as far as I've seen).

tags: added: verification-done-bionic
removed: verification-needed-bionic
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package bind9 - 1:9.11.3+dfsg-1ubuntu1.10

---------------
bind9 (1:9.11.3+dfsg-1ubuntu1.10) bionic; urgency=medium

  * d/p/fix-socket-failures-during-name-resolution.patch: fix socket failures
    due to uninitialized memory during name resolution (LP: #1804542)

 -- Lucas Kanashiro <email address hidden> Mon, 30 Sep 2019 15:39:12 -0300

Changed in bind9 (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Robie Basak (racb) wrote : Update Released

The verification of the Stable Release Update for bind9 has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
