[SRU] Multiple intermittent socket failures during name resolutions

Bug #1804542 reported by Tateru Nino on 2018-11-21
This bug affects 7 people
 Affects | Importance | Assigned to
 BIND | Undecided | Unassigned
 bind9 (Ubuntu) | Medium | Unassigned
 bind9 (Ubuntu Bionic) | Medium | Lucas Kanashiro

Bug Description

[Impact]
Socket failures due to uninitialized memory during name resolution.

[Test Case]
There is no known self-contained test case for this bug. However, feedback from the bug reporter [1][2] indicates that the proposed patch fixes the bug.

[1] https://bugs.launchpad.net/ubuntu/+source/bind9/+bug/1804542/comments/12
[2] https://bugs.launchpad.net/ubuntu/+source/bind9/+bug/1804542/comments/13

[Regression Potential]
Things to watch for include memory-related issues and faults relating to long names or addresses. The patch changes memory management for a data buffer used in socket communication. There could be behavioral differences between the handling of IPv4 and IPv6 addresses.

[Fix]
See comment #7
This will need to be backported for bionic. Other releases either don't need it or already have it:

 bind9 | 1:9.8.1.dfsg.P1-4ubuntu0.22 | precise-updates | ×
 bind9 | 1:9.9.5.dfsg-3ubuntu0.19 | trusty-updates | ×
 bind9 | 1:9.10.3.dfsg.P4-8ubuntu1.14 | xenial-updates | ×
 bind9 | 1:9.11.3+dfsg-1ubuntu1.7 | bionic-updates | [ ] Needs fix
 bind9 | 1:9.11.4+dfsg-3ubuntu5.3 | cosmic-updates | √ has fix
 bind9 | 1:9.11.5.P1+dfsg-1ubuntu2.4 | disco-updates | √ has fix
 bind9 | 1:9.11.5.P4+dfsg-4ubuntu1 | eoan | √ has fix

[Discussion]

[Original Report]
Multiple instances like:

Nov 22 09:36:09 mound named[1510]: ../../../../lib/isc/unix/socket.c:2135: unexpected error:
Nov 22 09:36:09 mound named[1510]: internal_send: 10.0.2.11#54580: Invalid argument
Nov 22 09:36:09 mound named[1510]: client @0x7fba00513820 10.0.2.11#54580 (brave-sync.s3.dualstack.us-west-2.amazonaws.com): error sending response: invalid file

Nov 22 09:51:35 mound named[1510]: ../../../../lib/isc/unix/socket.c:2135: unexpected error:
Nov 22 09:51:35 mound named[1510]: internal_send: ************:88fd:b31b:3e3:7082#60042: Invalid argument
Nov 22 09:51:35 mound named[1510]: client @0x7fb9f8117180 ***********:88fd:b31b:3e3:7082#60042 (cdn.onenote.net): error sending response: invalid file

[public ipv6 address partially elided for privacy]

Nov 22 05:58:24 mound named[1510]: ../../../../lib/isc/unix/socket.c:2135: unexpected error:
Nov 22 05:58:24 mound named[1510]: internal_send: 10.0.2.11#63851: Invalid argument
Nov 22 05:58:24 mound named[1510]: client @0x7fba000c7690 10.0.2.11#63851 (discordapp.com): error sending response: invalid file

Not sure if these represent some genuine failure delivering to the client, or if it is an artifact of some normal condition being overreported.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: bind9 1:9.11.3+dfsg-1ubuntu1.3
ProcVersionSignature: Ubuntu 4.15.0-39.42-generic 4.15.18
Uname: Linux 4.15.0-39-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.5
Architecture: amd64
Date: Thu Nov 22 10:31:02 2018
InstallationDate: Installed on 2017-07-18 (491 days ago)
InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Release amd64 (20151021)
SourcePackage: bind9
UpgradeStatus: Upgraded to bionic on 2018-11-15 (6 days ago)
modified.conffile..etc.bind.named.conf.local: [modified]
modified.conffile..etc.bind.named.conf.options: [modified]
mtime.conffile..etc.bind.named.conf.local: 2018-04-18T18:22:39.428390
mtime.conffile..etc.bind.named.conf.options: 2018-11-21T21:03:19.292677


Robie Basak (racb) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better.

Since there isn't enough information in your report to differentiate between a local configuration problem and a bug in Ubuntu, I'm marking this bug as Incomplete.

If indeed this is a local configuration problem, you can find pointers to get help for this sort of problem here: http://www.ubuntu.com/support/community

Or if you believe that this is really a bug, then you may find it helpful to read "How to report bugs effectively" http://www.chiark.greenend.org.uk/~sgtatham/bugs.html. We'd be grateful if you could then provide steps to reproduce the problem or otherwise explain why you believe this is a bug in Ubuntu rather than a problem specific to your system or network configuration, and then change the bug status back to New.

Changed in bind9 (Ubuntu):
status: New → Incomplete
Jon Schewe (jpschewe) wrote :

I too am seeing this problem. This is on a computer that I'm using as a router using NAT. I have a dual stack IPv6 setup. My ISP gives me an IPv6 subnet that I use for my internal network. I have attached my configuration files.

I received no errors with this configuration under Ubuntu 16.04, but I do see these errors under Ubuntu 18.04. So I believe it's a change in the BIND daemon, or perhaps a configuration option that I need to add.

I have tried turning on full debugging and it hasn't been terribly helpful. I can do that again if you have particular debugging options that should be enabled.

What other information can I provide to aid in debugging this?

I do also have this problem:

23-Jan-2019 10:20:25.869 general: error: ../../../../lib/isc/unix/socket.c:2135: unexpected error:
23-Jan-2019 10:20:25.869 general: error: internal_send: ***.***.***.***#52218: Invalid argument
23-Jan-2019 10:20:25.869 client: warning: client @0x7f08800da560 ***.***.***.***#52218 (d3dsirqaj330mb.cloudfront.net): error sending response: invalid file

Please note that this is a known BIND bug, and I think the fix should make its way into the official Ubuntu BIND release:
https://gitlab.isc.org/isc-projects/bind9/issues/180

The main problem is that the patch must be backported to BIND 9.11, because it has been marked as resolved only in BIND 9.13.2:
https://gitlab.isc.org/isc-projects/bind9/milestones/4

Is it feasible to include this patch in the official BIND package for the 18.04.1 release?

Andreas Hasenack (ahasenack) wrote :

Thanks for the pointers. I agree we should try to backport this.

Changed in bind9 (Ubuntu):
status: Incomplete → Triaged
importance: Undecided → Medium
tags: added: server-next
Bryce Harrington (bryce) on 2019-06-03
summary: - Multiple intermittent socket failures during name resolutions
+ [SRU] Multiple intermittent socket failures during name resolutions
Andreas Hasenack (ahasenack) wrote :

Looks like the patch that was applied is https://gitlab.isc.org/isc-projects/bind9/merge_requests/409, but it doesn't apply cleanly, so it's not an immediate candidate for the server-next queue.

tags: removed: server-next
Bryce Harrington (bryce) wrote :

The upstream commit message is vague in explaining what was going on, but I'm gathering there was uninitialized data in a variable that would trigger the error message? The fix appears to be switching to a static buffer and initializing it to {0}.

description: updated
tags: added: patch
Bryce Harrington (bryce) on 2019-06-08
Changed in bind9 (Ubuntu Bionic):
status: New → Triaged
importance: Undecided → Medium
Bryce Harrington (bryce) on 2019-06-08
description: updated
Bryce Harrington (bryce) on 2019-06-08
Changed in bind9 (Ubuntu):
status: Triaged → Fix Released
Changed in bind:
status: New → Fix Released
Bryce Harrington (bryce) wrote :

The originally flagged patch was [2/2] of a 2-patch set (AIUI), and the second doesn't make much sense to include without the first. The two patches also had a tiny dependence on one small piece of an earlier third patch that was a smorgasbord of valgrind memcheck fixes. Applying these three commits (and omitting their modifications to CHANGES or other files) goes cleanly:

trent:~/ubuntu/Bind9/sru.1804542/bind9-gu$ patch -p1 < ../bind9/A-memset-the-remainder-of-sendcmsgbuf-to-0-in-a-attemp.patch
patching file lib/isc/unix/socket.c
Hunk #1 succeeded at 1529 (offset 4 lines).
Hunk #2 succeeded at 1550 (offset 4 lines).
Hunk #3 succeeded at 1636 (offset 4 lines).
trent:~/ubuntu/Bind9/sru.1804542/bind9-gu$ patch -p1 < ../bind9/B-Fix-socket-cmsg-buffer-usage.patch
patching file lib/isc/unix/socket.c
Hunk #1 succeeded at 373 (offset -2 lines).
Hunk #2 succeeded at 483 (offset -2 lines).
Hunk #3 succeeded at 1434 (offset -7 lines).
Hunk #4 succeeded at 1447 (offset -7 lines).
Hunk #5 succeeded at 1528 (offset -7 lines).
Hunk #6 succeeded at 1547 (offset -7 lines).
Hunk #7 succeeded at 1576 (offset -8 lines).
Hunk #8 succeeded at 1607 (offset -8 lines).
Hunk #9 succeeded at 1638 (offset -8 lines).
Hunk #10 succeeded at 1665 (offset -9 lines).
Hunk #11 succeeded at 1768 (offset -4 lines).
Hunk #12 succeeded at 1872 (offset -4 lines).
Hunk #13 succeeded at 2068 (offset -4 lines).
Hunk #14 succeeded at 2309 (offset -4 lines).
Hunk #15 succeeded at 2323 (offset -4 lines).
Hunk #16 succeeded at 2339 (offset -4 lines).
Hunk #17 succeeded at 2386 (offset -4 lines).
Hunk #18 succeeded at 2414 (offset -4 lines).
trent:~/ubuntu/Bind9/sru.1804542/bind9-gu$ patch -p1 < ../bind9/C-Use-completely-static-sized-buffers.patch
patching file lib/isc/unix/socket.c
Hunk #1 succeeded at 317 (offset -2 lines).
Hunk #2 succeeded at 402 (offset -2 lines).
Hunk #3 succeeded at 1474 (offset -7 lines).
Hunk #4 succeeded at 1552 (offset -7 lines).
Hunk #5 succeeded at 1571 (offset -7 lines).
Hunk #6 succeeded at 1602 (offset -8 lines).
Hunk #7 succeeded at 1633 (offset -8 lines).
Hunk #8 succeeded at 1658 with fuzz 1 (offset -8 lines).
Hunk #9 succeeded at 1791 (offset -4 lines).
Hunk #10 succeeded at 1894 (offset -4 lines).
Hunk #11 succeeded at 2090 (offset -4 lines).
Hunk #12 succeeded at 2310 (offset -4 lines).
Hunk #13 succeeded at 2330 (offset -4 lines).
Hunk #14 succeeded at 2775 (offset -4 lines).

The attached patch is what we should land for bionic to resolve this issue.
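
For anyone re-checking a backport like this, a minimal sketch of how the `patch` output above could be scanned for hunks that deserve a closer look (failed hunks, or hunks applied with fuzz). It assumes the standard GNU patch messages shown in this comment; the function name is illustrative and not part of any Ubuntu tooling.

```python
import re

# Matches lines printed by GNU patch such as
#   "Hunk #8 succeeded at 1658 with fuzz 1 (offset -8 lines)."
# or, for a rejected hunk,
#   "Hunk #3 FAILED at 1474."
HUNK_RE = re.compile(
    r"Hunk #(?P<hunk>\d+) (?P<status>succeeded|FAILED) at (?P<line>\d+)"
    r"(?: with fuzz (?P<fuzz>\d+))?"
)


def hunks_needing_review(patch_output):
    """Return (hunk number, reason) pairs for hunks that failed or needed fuzz.

    Hunks that applied with only a line offset are treated as clean.
    """
    flagged = []
    for line in patch_output.splitlines():
        match = HUNK_RE.search(line)
        if not match:
            continue  # e.g. "patching file ..." lines
        hunk = int(match.group("hunk"))
        if match.group("status") == "FAILED":
            flagged.append((hunk, "failed"))
        elif match.group("fuzz"):
            flagged.append((hunk, "fuzz " + match.group("fuzz")))
    return flagged
```

Hunk numbers restart for each patched file, so this is best run on the output of one `patch` invocation at a time. Run against the third invocation above, it would flag only hunk #8, which applied with fuzz 1.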

Bryce Harrington (bryce) on 2019-06-09
tags: added: server-next
Bryce Harrington (bryce) wrote :

Can someone help define steps to reproduce this issue? Ideally in a vanilla install or in a container.

The SRU process requires a test case for validating the change, that flags the error before the package installation, and checks the absence of the error after the package installation. So, the next step to move forward with this bug is to get such a test case in hand. Once we have something in hand, I think the SRU for this can proceed normally.

Bryce Harrington (bryce) wrote :

I've packaged the patch from comment #9 into a PPA for testing purposes:

  https://launchpad.net/~bryce/+archive/ubuntu/bind9-sru-1804542

To test with this, first ensure you can reproduce the issue in a reliably consistent way, then install this package and run it for a similar period with similar conditions.
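
To make the before/after comparison a bit more concrete, here is a rough sketch that counts the `internal_send ... Invalid argument` lines per client address. It assumes named's messages end up in a text log in the format quoted in the original report; the function name is made up for illustration, and the log location will depend on the local logging setup.

```python
import re
from collections import Counter

# Matches the failure line quoted in this report, e.g.
#   "... named[1510]: internal_send: 10.0.2.11#54580: Invalid argument"
# The client address is the text between "internal_send: " and "#<port>".
INTERNAL_SEND_RE = re.compile(
    r"internal_send: (?P<client>\S+)#(?P<port>\d+): Invalid argument"
)


def count_internal_send_errors(lines):
    """Count 'internal_send: ...: Invalid argument' errors per client address.

    `lines` can be any iterable of log lines, such as an open file object.
    """
    counts = Counter()
    for line in lines:
        match = INTERNAL_SEND_RE.search(line)
        if match:
            counts[match.group("client")] += 1
    return counts
```

Passing an open log file (for example `/var/log/syslog`, if that is where named logs on the machine) for a period before and a similar period after installing the PPA package gives two counts to compare. A non-zero count before and zero after would be consistent with the fix working, though it is not proof given how intermittent the failures are.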

Thanks for the patch!
I've installed it on a server that showed the problem and it hasn't appeared anymore.
In the next few days I'll apply it to another server that shows the problem and handles more traffic, and I'll let you know the results.

I've tested the patch on two servers with heavy DNS traffic that showed the problem, and it has disappeared completely since installation. I also haven't noticed any issues with the service after applying the patch.

Andreas Hasenack (ahasenack) wrote :

Bryce, I think this can perhaps be proposed and uploaded, given the previous comments. For the actual [test] section of the sru, perhaps point at the comments above.

Changed in bind9 (Ubuntu Bionic):
assignee: nobody → Lucas Kanashiro (lucaskanashiro)
description: updated
Bryce Harrington (bryce) wrote :
Changed in bind9 (Ubuntu Bionic):
status: Triaged → In Progress