Resolver ignores ndots option

Bug #1674273 reported by Sebastian Unger
This bug affects 3 people
Affects                 Status      Importance  Assigned to  Milestone
glibc (Ubuntu)          Confirmed   Medium      Unassigned
glibc (Ubuntu Xenial)   Confirmed   Medium      Unassigned
linux (Ubuntu)          Invalid     Medium      Unassigned
linux (Ubuntu Xenial)   Invalid     Medium      Unassigned

Bug Description

This is a re-report of https://bugs.launchpad.net/ubuntu/+source/linux/+bug/401202 since that one was apparently closed as no-fix simply because it was too old.

This still occurs in xenial.

Original description:
Regardless of the ndots option in /etc/resolv.conf, when NXDOMAIN is returned from the DNS server, the resolver always makes another attempt with the original name extended by what is in the search option.
For example, if you're looking up very.long.url.nowhere and there is a line "search ubuntu.com" in resolv.conf, you will get the address of the server very.long.url.nowhere.ubuntu.com if such a name exists. This is incorrect; it should occur only for names having fewer than ndots dots.

My system is a standard Ubuntu Xenial desktop amd64 using network manager and the default configured Wired Connection 1 (i.e. DHCP).

To reproduce:
- sudo install /dev/fd/0 /etc/NetworkManager/dnsmasq.d/domain <<<'log-queries=extra'
- sudo killall dnsmasq
- ping some.long.non-existent.name
- Watch /var/log/syslog

In my case:
Mar 20 23:19:22 eragon dnsmasq[27367]: 46 127.0.0.1/40646 query[A] some.long.non-existent.name from 127.0.0.1
Mar 20 23:19:22 eragon dnsmasq[27367]: 46 127.0.0.1/40646 forwarded some.long.non-existent.name to 192.168.5.1
Mar 20 23:19:22 eragon dnsmasq[27367]: 46 127.0.0.1/40646 reply some.long.non-existent.name is NXDOMAIN
Mar 20 23:19:22 eragon dnsmasq[27367]: 47 127.0.0.1/52417 query[A] some.long.non-existent.name.sebunger.dnsalias.org from 127.0.0.1
Mar 20 23:19:22 eragon dnsmasq[27367]: 47 127.0.0.1/52417 forwarded some.long.non-existent.name.sebunger.dnsalias.org to 192.168.5.1
Mar 20 23:19:23 eragon dnsmasq[27367]: 47 127.0.0.1/52417 reply some.long.non-existent.name.sebunger.dnsalias.org is <CNAME>
Mar 20 23:19:23 eragon dnsmasq[27367]: 47 127.0.0.1/52417 reply sebunger.dnsalias.org is 203.173.156.30

My /etc/resolv.conf (which is a symlink to ../run/resolvconf/resolv.conf):
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1
search sebunger.dnsalias.org
options ndots:1

(I added the options ndots with no effect)
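
For reference, the settings relevant to this report (nameserver, search, ndots) can be read back from such a file with a few lines of Python. This is only a minimal sketch and not glibc's parser: the function name parse_resolv_conf is made up for illustration, it handles only these three keywords, and it assumes the documented default of ndots:1 and the cap of 15 from resolv.conf(5).

```python
def parse_resolv_conf(text):
    """Return nameservers, search list and ndots from resolv.conf-style text."""
    config = {"nameservers": [], "search": [], "ndots": 1}  # ndots defaults to 1
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        fields = line.split()
        keyword, args = fields[0], fields[1:]
        if keyword == "nameserver" and args:
            config["nameservers"].append(args[0])
        elif keyword == "search":
            config["search"] = args  # the last search line wins
        elif keyword == "options":
            for opt in args:
                value = opt.partition(":")[2]
                if opt.startswith("ndots:") and value.isdigit():
                    # resolv.conf(5) documents a silent cap of 15
                    config["ndots"] = min(int(value), 15)
    return config


# The file quoted in this report.
REPORTED_RESOLV_CONF = """\
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
nameserver 127.0.1.1
search sebunger.dnsalias.org
options ndots:1
"""

print(parse_resolv_conf(REPORTED_RESOLV_CONF))
```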
---
ApportVersion: 2.20.1-0ubuntu2.5
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-id', '/dev/snd/by-path', '/dev/snd/pcmC1D0c', '/dev/snd/controlC1', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D2c', '/dev/snd/pcmC0D1c', '/dev/snd/pcmC0D1p', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/controlC0', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 16.04
EcryptfsInUse: Yes
IwConfig:
 enp2s0 no wireless extensions.

 lo no wireless extensions.
MachineType: Gigabyte Technology Co., Ltd. GA-MA770-UD3
NonfreeKernelModules: nvidia_uvm nvidia
Package: linux (not installed)
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-66-generic root=/dev/mapper/vg0-root ro rootdelay=120 quiet splash
ProcVersionSignature: Ubuntu 4.4.0-66.87-generic 4.4.44
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-66-generic N/A
 linux-backports-modules-4.4.0-66-generic N/A
 linux-firmware 1.157.8
RfKill:

Tags: xenial xenial
Uname: Linux 4.4.0-66-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy kvm lpadmin plugdev sambashare ssh sudo users video
_MarkForUpload: True
dmi.bios.date: 06/12/2009
dmi.bios.vendor: Award Software International, Inc.
dmi.bios.version: FB
dmi.board.name: GA-MA770-UD3
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.type: 3
dmi.chassis.vendor: Gigabyte Technology Co., Ltd.
dmi.modalias: dmi:bvnAwardSoftwareInternational,Inc.:bvrFB:bd06/12/2009:svnGigabyteTechnologyCo.,Ltd.:pnGA-MA770-UD3:pvr:rvnGigabyteTechnologyCo.,Ltd.:rnGA-MA770-UD3:rvrx.x:cvnGigabyteTechnologyCo.,Ltd.:ct3:cvr:
dmi.product.name: GA-MA770-UD3
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1674273

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.11 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc3

Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu Xenial):
importance: Undecided → Medium
status: New → Incomplete
Revision history for this message
Sebastian Unger (sebunger44) wrote :

I don't think this is a recent regression. I have seen the symptoms for a while and have only gotten around to investigating it yesterday. Also see the linked bug which describes exactly this issue and which was raised in 2009.

However, I am inclined to believe that this is more a glibc rather than a kernel bug. In particular this is likely somewhere in libnss_dns.so, the DNS plugin to the NSS system.

Revision history for this message
Sebastian Unger (sebunger44) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected xenial
description: updated
Revision history for this message
Sebastian Unger (sebunger44) wrote : CRDA.txt

apport information

Revision history for this message
Sebastian Unger (sebunger44) wrote : CurrentDmesg.txt

apport information

Revision history for this message
Sebastian Unger (sebunger44) wrote : JournalErrors.txt

apport information

Revision history for this message
Sebastian Unger (sebunger44) wrote : Lspci.txt

apport information

Revision history for this message
Sebastian Unger (sebunger44) wrote : Lsusb.txt

apport information

Revision history for this message
Sebastian Unger (sebunger44) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
Sebastian Unger (sebunger44) wrote : ProcEnviron.txt

apport information

Revision history for this message
Sebastian Unger (sebunger44) wrote : ProcInterrupts.txt

apport information

Revision history for this message
Sebastian Unger (sebunger44) wrote : ProcModules.txt

apport information

Revision history for this message
Sebastian Unger (sebunger44) wrote : UdevDb.txt

apport information

Revision history for this message
Sebastian Unger (sebunger44) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu Xenial):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
pfoo (pfoo) wrote :

I think you are not hitting any bug at all ?
According to the man page, ndots has nothing to do with whether the search list is added to the query; it only determines whether the absolute name is tried first OR the search list is tried first. The first non-error result is returned to the client.

resolv.conf / ndots man page : sets a threshold for the number of dots which must appear in a name given to res_query(3) (see resolver(3)) before an initial absolute query will be made. The default for n is 1, meaning that if there are any dots in a name, the name will be *tried first* as an absolute name *before any search list elements are appended* to it. The value for this option is silently capped to 15.

Your logs with ndots:1 and a name having at least ndots dots follow the man page:
- the first query is done for the absolute name some.long.non-existent.name: NXDOMAIN. Resolution is not finished and needs to continue.
- the search list is appended, and a match is found and returned
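
The ordering described above can be modelled in a few lines of Python. This is a sketch of the behaviour documented in resolv.conf(5), not glibc's actual code; the function name candidate_names is made up for illustration.

```python
def candidate_names(name, search, ndots=1):
    """List the names tried, in order, under the documented ndots rule."""
    if name.endswith("."):
        return [name]  # an explicit trailing dot means "absolute only"
    with_search = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: absolute name first, search list as a fallback.
        return [name] + with_search
    # Too few dots: search list first, absolute name last.
    return with_search + [name]


print(candidate_names("some.long.non-existent.name",
                      ["sebunger.dnsalias.org"], ndots=1))
```

With the reporter's settings this gives the absolute name followed by the name with sebunger.dnsalias.org appended, which is the order of the two queries in the dnsmasq log.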

As far as I can tell, xenial resolving is not broken, but yakkety is. That's another story.

Revision history for this message
Sebastian Unger (sebunger44) wrote :

Interesting. At the very least, then, the man page is inconsistent.

From man resolv.conf, search option:

Resolver queries having fewer than ndots dots (default is 1) in them will be attempted using each component of the search path in turn until a match is found.

However, I believe the subsequent query with the search list appended is simply bad, whether or not it corresponds to the man page. The problem is that the second lookup may return a bad result (namely when one of the searched domains has a wildcard), which then gets cached. When I connect to a VPN later, the first name isn't even tried again, since we hold a cached result.
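
The wildcard problem can be shown with a toy model in Python. Everything here is illustrative: fake_lookup stands in for a DNS server, the wildcard table is made-up zone data that reuses the address from the log above, and resolve simply tries the absolute name and then the search list.

```python
# Hypothetical zone data: a wildcard under the search domain, as in the report.
WILDCARD_ZONES = {"sebunger.dnsalias.org": "203.173.156.30"}
EXACT_RECORDS = {}  # the queried name itself does not exist anywhere


def fake_lookup(name):
    """Return an address, or None for NXDOMAIN, from the toy data above."""
    if name in EXACT_RECORDS:
        return EXACT_RECORDS[name]
    for zone, address in WILDCARD_ZONES.items():
        if name.endswith("." + zone):
            return address  # the wildcard matches anything under the zone
    return None


def resolve(name, search):
    """Try the absolute name, then each search domain; first answer wins."""
    for candidate in [name] + [f"{name}.{domain}" for domain in search]:
        address = fake_lookup(candidate)
        if address is not None:
            return candidate, address
    return None


print(resolve("some.long.non-existent.name", ["sebunger.dnsalias.org"]))
print(resolve("some.long.non-existent.name", []))
```

In this model a name that does not exist still gets an address as soon as a wildcard search domain is appended, while the same lookup without a search list fails cleanly.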

As far as I can tell, the ndots option is the resolver's way of figuring out whether a name is absolute or relative, given that host names don't usually have the trailing period that marks them as FQDNs. So, I think, it should EITHER use the search list or not, depending on ndots.

Revision history for this message
pfoo (pfoo) wrote :

Yeah, the man page is quite unclear on how the local domain / search list is handled when resolving.

I thought dnsmasq was configured with caching disabled on Ubuntu?

I understand your point of view, but it needs some digging (is Ubuntu even patching glibc?).

Revision history for this message
pfoo (pfoo) wrote :

It seems like the behaviour has changed in yakkety.
- queries with fewer than ndots dots are only tried with the search list appended, never as an FQDN
- queries with ndots or more dots are tried as an FQDN directly and the search list is never tried

However, in yakkety, the ndots option seems to be completely ignored (or forced to 1).

Revision history for this message
pfoo (pfoo) wrote :

Yakkety has either broken ndots or forced it to 1.

ndots:1
ping host => search is appended. Never tried as fqdn.
ping host.name => search is not appended, even if nxdomain

ndots:2
ping host => search is appended. Never tried as fqdn.
ping host.name => search is not appended, even if nxdomain. This is bad.

ndots:3
ping host => search is appended. Never tried as fqdn.
ping host.name => search is not appended, even if nxdomain. This is bad.

Expected behaviour:
For queries with fewer than ndots dots: try to resolve with the search list appended; if that fails (NXDOMAIN), try as an FQDN/absolute name.
For queries with ndots or more dots: only resolve as an FQDN/absolute name.

(Ubuntu GLIBC 2.24-3ubuntu2)
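
The expected behaviour listed above can be written down as a small Python sketch. It models only the rule as stated in this comment, not any glibc release; expected_candidates is a made-up name and example.com is a placeholder search domain.

```python
def expected_candidates(name, search, ndots=1):
    """Names to try, in order, under the 'expected behaviour' rule."""
    if name.count(".") >= ndots:
        return [name]  # enough dots: absolute name only, no search list
    # Too few dots: search list first, absolute name as the last resort.
    return [f"{name}.{domain}" for domain in search] + [name]


for ndots in (1, 2, 3):
    print(ndots, expected_candidates("host.name", ["example.com"], ndots))
```

Under this rule, host.name is tried only as an absolute name with ndots:1, and gets the search list first with ndots:2 or ndots:3.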

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in glibc (Ubuntu Xenial):
status: New → Confirmed
Changed in glibc (Ubuntu):
status: New → Confirmed
Revision history for this message
sirianni (eric-sirianni) wrote :

For some strange reason, nslookup seems to respect ndots whereas ping does not.

With ndots:2

$ ping host.name
ping: host.name: Name or service not known

$ nslookup host.name
Server: 127.0.1.1
Address: 127.0.1.1#53

Name: host.name.mycompany.com
Address: 10.174.2.192

Changed in glibc (Ubuntu Xenial):
importance: Undecided → Medium
Changed in glibc (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu):
status: Confirmed → Invalid
Changed in linux (Ubuntu Xenial):
status: Confirmed → Invalid
Revision history for this message
frigo (rigault-francois) wrote :

This still occurs in Focal.
nslookup "respects the dots" as it calls the systemd resolver, and as per man resolved.conf(5):

Domains=
           A space-separated list of domains. These domains are used as search suffixes when resolving single-label host names (domain names which contain no dot)

To get this behavior with ping, you need ping to switch to the systemd resolver ("resolve" in nsswitch), which is achieved with apt install libnss-resolve.
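
Whether "resolve" is actually consulted for host lookups can be checked by reading the hosts line of /etc/nsswitch.conf. The Python below is a minimal sketch under assumptions: the helper names are made up, and the sample line is a hypothetical example of what the file might contain after installing libnss-resolve, not output taken from a real system.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the NSS service names on the 'hosts:' line, in lookup order."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Drop action items such as [NOTFOUND=return]; keep service names.
            return re.sub(r"\[[^\]]*\]", " ", rest).split()
    return []


def resolve_comes_before_dns(nsswitch_text):
    """True if 'resolve' is listed and is consulted before 'dns' (if any)."""
    sources = hosts_sources(nsswitch_text)
    if "resolve" not in sources:
        return False
    return "dns" not in sources or sources.index("resolve") < sources.index("dns")


# Hypothetical hosts line, for illustration only.
SAMPLE_NSSWITCH = "hosts: files mdns4_minimal [NOTFOUND=return] resolve [!UNAVAIL=return] dns\n"

print(hosts_sources(SAMPLE_NSSWITCH))
print(resolve_comes_before_dns(SAMPLE_NSSWITCH))
```

On a real machine the text would come from open("/etc/nsswitch.conf").read().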
