Network Manager 0.7 doesn't use resolvconf to remove nameserver info if it didn't use resolvconf for adding its nameserver info - wipes /etc/resolv.conf link

Bug #324233 reported by tpurch on 2009-02-02
This bug affects 40 people
Affects: network-manager (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

Binary package hint: network-manager-kde

I use vpnc and have the resolvconf package installed so that vpnc can update DNS. Network Manager is replacing the symbolic link /etc/resolv.conf -> /etc/resolvconf/run/resolv.conf with a new regular (non-symlink) /etc/resolv.conf. This prevents resolvconf from updating DNS settings, because it requires /etc/resolv.conf to be a symbolic link. If I delete the /etc/resolv.conf file and symlink it to /etc/resolvconf/run/resolv.conf, my VPN works fine and updates DNS as expected. However, if I reboot my machine, the symlink is replaced by a new file that isn't linked to /etc/resolvconf/run/resolv.conf.

Network Manager should update the resolv.conf file, not replace it.

Seeing the same behavior. Doesn't require a reboot for me -- if NetworkManager is running, it will destroy my resolvconf symlink and write a new file every couple minutes. If I restore the symlink by hand, my VPN works until NetworkManager breaks it again.

NetworkManager 0.7~~svn20081018t105859-0ubuntu1.8.10.1
resolvconf 1.42ubuntu2
intrepid on x86_64

Frustrating problem.

I found a workaround: create a script file
at /etc/network/if-up.d/zzz_resolvconffix.

zzz_resolvconffix:

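#!/bin/sh
# Do nothing unless resolvconf is installed.
[ -x /sbin/resolvconf ] || exit 0

# Skip IPv6 events. POSIX test uses "=", not "==" (dash rejects "==").
[ "x$ADDRFAM" = "xinet6" ] && exit 0

# Restore the symlink that resolvconf requires.
rm -f /etc/resolv.conf
ln -s /etc/resolvconf/run/resolv.conf /etc/resolv.conf

# Build one "nameserver" line per DHCP-supplied server.
R=""
for NS in $DHCP4_DOMAIN_NAME_SERVERS ; do
    R="${R}nameserver $NS
"
done
# printf replaces "echo -n", which is not portable across sh implementations.
printf '%s' "$R" | /sbin/resolvconf -a "${IFACE}.${ADDRFAM}"
#end of script file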

After creating the script file, run:

chmod 755 /etc/network/if-up.d/zzz_resolvconffix

Then restart the network session.

On Fri, 2009-02-13 at 09:22 +0000, machrider wrote:
> Seeing the same behavior. Doesn't require a reboot for me -- if
> NetworkManager is running, it will destroy my resolvconf symlink and
> write a new file every couple minutes. If I restore the symlink by
> hand, my VPN works until NetworkManager breaks it again.
>
> NetworkManager 0.7~~svn20081018t105859-0ubuntu1.8.10.1
> resolvconf 1.42ubuntu2
> intrepid on x86_64
>
> Frustrating problem.
>

I also have the same problem. Whenever I connect to a VPN it gives me this message:
resolvconf: Error: /etc/resolv.conf must be a symlink
Although functionality is not affected.

Mackenzie Morgan (maco.m) wrote :

Confirming as still existing on Jaunty.

Changed in knetworkmanager (Ubuntu):
status: New → Confirmed
Tomasz 'Zen' Napierala (tzn) wrote :

This is not only really annoying, but also a big problem for people who use VPNs a lot, especially when vpnc support is broken in Jaunty (non-working default route).

Vpnc's default route works fine for me...The DHCP one is set as it should be.
However, when VPNC disconnects, my DNS is not set back to the local DHCP
nameservers.

Can anyone suggest a workaround? This is driving me nuts because I use vpnc a lot.

Stop the NetworkManager service, connect manually, dpkg-reconfigure
resolvconf, and then restart NetworkManager. That's what Alexander Sack
suggested I do, and it helped.

The problem here is that if NetworkManager is already running and hasn't found a resolvconf-managed resolv.conf (or resolvconf isn't installed), and you then install the resolvconf package, NetworkManager will not notice that on shutdown and will just assume it should wipe the resolv.conf file.

So stopping NM before installing (or running dpkg-reconfigure resolvconf to fix this after the fact) does the trick. The bug is that NM doesn't reconsider its decision not to use resolvconf on shutdown. I will check if we can do this better.

Changed in network-manager (Ubuntu):
assignee: nobody → Alexander Sack (asac)
importance: Undecided → Medium
status: Confirmed → Triaged
summary: - kubuntu 8.10 - Network Manager 0.7 - DHCP/resolvconf Problem
+ Network Manager 0.7 doesn't use resolvconf on shutdown if it didn't use
+ it on startup - this causes /etc/resolv.conf link to be wiped on
+ shutdown
summary: - Network Manager 0.7 doesn't use resolvconf on shutdown if it didn't use
- it on startup - this causes /etc/resolv.conf link to be wiped on
- shutdown
+ Network Manager 0.7 doesn't use resolvconf to remove nameserver info if
+ it didn't use it to add that information - this causes /etc/resolv.conf
+ link to be wiped on shutdown and makes migrating a running system to
+ resolvconf tricky
Alexander Sack (asac) wrote :

ok hope the title is now good :-P - sorry for the noise.

summary: Network Manager 0.7 doesn't use resolvconf to remove nameserver info if
- it didn't use it to add that information - this causes /etc/resolv.conf
- link to be wiped on shutdown and makes migrating a running system to
- resolvconf tricky
+ it didn't use resolvconf for adding its nameserver info - wipes
+ /etc/resolv.conf link
Ben Chavet (ben-chavet) wrote :

dpkg-reconfigure resolvconf did not help me with this problem, but I think I have worked it out. Here are the exact steps I took:

sudo /etc/init.d/NetworkManager stop
sudo aptitude remove resolvconf
sudo aptitude purge resolvconf
sudo aptitude install resolvconf
sudo /etc/init.d/NetworkManager start

the remove-then-purge steps may not have been necessary, but that's what I did and now it's working.

tuxinvader (tuxinvader) wrote :

I believe I have found the cause of this bug within nm-named-manager.c.

Should Network Manager fail to update the resolver via the resolvconf script, it will fall through to the default behaviour (i.e. write the details to a temp file and then overwrite /etc/resolv.conf). The code treats /sbin/resolvconf not being installed in the same way as it would treat execution of resolvconf failing, which is why we have this problem of NetworkManager sometimes overwriting the configuration.

I can think of two solutions to this problem.

1. If we find /etc/resolv.conf is a SYMLINK, follow it and update the target file.
2. Ignore resolvconf failures, if /etc/resolv.conf is a SYMLINK

I think option 1 is preferable, as it is least likely to leave you without a functioning resolver.
I have patches for both methods, and I will upload them shortly.

Cheers,
Mark

tuxinvader (tuxinvader) wrote :

Patch -> If /etc/resolv.conf is a SYMLINK, then overwrite the target file, not the SYMLINK.

Possibly fixes another bug: NM checks if tmp_resolv_conf is a SYMLINK, and if so writes to the file pointed to by the symlink. However, when it replaces resolv.conf, it will do so with the symlink, so /etc/resolv.conf becomes a symlink to the tmp file. See here: if (rename (tmp_resolv_conf, RESOLV_CONF) < 0) {
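
For readers who want to see the idea behind this first patch outside of the C source, here is a minimal shell sketch. It is not the patch itself: write_resolv_conf is a hypothetical helper, and it assumes a readlink that supports -f (as GNU coreutils does). The point it illustrates is that the symlink is resolved first, so the final rename replaces the target file and leaves the link alone.

```sh
#!/bin/sh
# write_resolv_conf is a hypothetical helper, not NetworkManager code.
# Usage: write_resolv_conf PATH CONTENT
write_resolv_conf() {
    conf=$1
    content=$2

    target=$conf
    if [ -L "$conf" ]; then
        # Resolve the link so the rename below replaces the real file
        # and leaves the symlink itself in place.
        target=$(readlink -f "$conf") || return 1
    fi

    # Write a temp file beside the target, then rename it over the target.
    tmp="${target}.tmp.$$"
    printf '%s\n' "$content" > "$tmp" || return 1
    mv -f "$tmp" "$target"
}
```

Called as write_resolv_conf /etc/resolv.conf "nameserver 10.0.0.1", this would update /etc/resolvconf/run/resolv.conf when the link is in place. Note that later comments in this report argue against writing to the resolvconf-managed file at all.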

tuxinvader (tuxinvader) wrote :

Patch -> If resolvconf update fails, and /etc/resolv.conf is a SYMLINK then do nothing.

If we find /sbin/resolvconf, but the execution of it fails, then check if /etc/resolv.conf is a SYMLINK. If it is, then report success, and do not overwrite resolv.conf.
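
The second approach can be sketched in shell as well. Again this is only an illustration under assumptions: update_dns is a hypothetical function, and the RESOLVCONF_BIN and RESOLV_CONF variables exist only so the paths can be overridden for testing. The real change would live in nm-named-manager.c.

```sh
#!/bin/sh
# update_dns, RESOLVCONF_BIN and RESOLV_CONF are hypothetical names.
# Usage: update_dns IFACE CONTENT
update_dns() {
    iface=$1
    content=$2
    bin=${RESOLVCONF_BIN:-/sbin/resolvconf}
    conf=${RESOLV_CONF:-/etc/resolv.conf}

    if [ -x "$bin" ]; then
        if printf '%s\n' "$content" | "$bin" -a "$iface"; then
            return 0
        fi
        # resolvconf exists but failed: keep the symlink and report success.
        if [ -L "$conf" ]; then
            return 0
        fi
    fi

    # resolvconf is absent, or it failed and resolv.conf is a plain file.
    printf '%s\n' "$content" > "$conf"
}
```

The trade-off is that a genuine resolvconf failure is hidden from the caller whenever the symlink exists, which is why this option is more likely to leave the system without a working resolver.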

hashstat (hashstat) wrote :

After playing with resolvconf and Network Manager, I noticed that running resolvconf directly always resulted in an error due to a missing postfix configuration file (/etc/postfix/main.cf). Because resolvconf was always returning an error, Network Manager always moved on to the default behavior: replace /etc/resolv.conf. The following commands fixed it for me.

    $ sudo touch /etc/postfix/main.cf
    $ sudo rm /etc/resolv.conf
    $ sudo ln -s /etc/resolvconf/run/resolv.conf /etc/resolv.conf

I'm not sure if this is a problem in the resolvconf package (should there be a test for the existence of /etc/postfix/main.cf or for the results of postconf in /etc/resolvconf/update-libc.d/postfix) or in the postfix package (did the maintainers forget to install the configuration file that postconf depends on).
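
To check whether a given system is hitting this case, it helps to tell "resolvconf is not installed" apart from "resolvconf ran and returned an error". The sketch below does that; resolvconf_status is a hypothetical helper, and it relies on resolvconf -u, which re-runs the update scripts. It normally needs root on a real system.

```sh
#!/bin/sh
# resolvconf_status is a hypothetical helper. It prints one of:
#   missing  (no executable at the given path)
#   ok       (resolvconf -u exited 0)
#   failed   (resolvconf -u exited non-zero, e.g. a hook script error)
resolvconf_status() {
    bin=${1:-/sbin/resolvconf}
    if [ ! -x "$bin" ]; then
        echo missing
        return 0
    fi
    # -u re-runs the update scripts without adding or deleting a record.
    if "$bin" -u >/dev/null 2>&1; then
        echo ok
    else
        echo failed
    fi
}
```

If this prints "failed", running sudo /sbin/resolvconf -u by hand should show which hook script is complaining.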

Confirming the solution provided by hashstat. I was experiencing the issue where /etc/resolv.conf was being set as a file and not a symlink. After adding the dummy postfix conf file, resolv.conf could be set as a symlink and, crucially, stayed a symlink after reboot.

Here are the exact steps taken:

Before:

$ ls -l /etc/resolv.conf
-rw-r--r-- 1 root root 30 2009-08-16 13:38 /etc/resolv.conf

$ sudo mkdir /etc/postfix
$ sudo touch /etc/postfix/main.cf
$ sudo rm /etc/resolv.conf
$ sudo ln -s /etc/resolvconf/run/resolv.conf /etc/resolv.conf
$ ls -l /etc/resolv.conf
lrwxrwxrwx 1 root root 31 2009-08-16 13:42 /etc/resolv.conf -> /etc/resolvconf/run/resolv.conf
$ sudo reboot

After:

$ ls -l /etc/resolv.conf
lrwxrwxrwx 1 root root 31 2009-08-16 13:42 /etc/resolv.conf -> /etc/resolvconf/run/resolv.conf

hashstat (hashstat) wrote :

As added information, I did not install postfix directly; it was installed as a dependency to the Linux Standards Base package lsb: lsb depends on lsb-core, lsb-core depends on postfix.

Mackenzie Morgan (maco.m) wrote :

Depends or Recommends? If Recommends, removing it won't harm anything. I had
it installed accidentally too because some development metapackage pulls it in
as a Recommends.

Alexander Sack (asac) wrote :

ok. i think this was supposed to work already, but the fix was not good enough. preparing a fix.

Changed in network-manager (Ubuntu):
status: Triaged → In Progress
Alexander Sack (asac) wrote :

Here is a first patch that should fix this.

Kevin P. Fleming (k.p.fleming) wrote :

I don't think this is a correct fix; NetworkManager should *not* overwrite the file pointed to by the /etc/resolv.conf symlink.

I've got two systems here running Kubuntu Karmic Koala, with NetworkManager (and kdenetworkmanager), dhclient, dnsmasq, resolvconf and openvpn installed. If I boot in single-user mode, ensure that /etc/resolv.conf is the proper symlink (by running 'dpkg-reconfigure resolvconf' and accepting all the defaults), then boot normally, everything is fine... NetworkManager connects to my wireless network after I log in, but /etc/resolv.conf still points only to 127.0.0.1 (as it should, since I'm running dnsmasq).

However, on a reboot, /etc/resolv.conf becomes a regular file again. So I downloaded the network-manager source, and added a bunch of logging statements in nm-named-manager.c to see what is going on. I can see that everything works properly as the network connection is brought up.

At shutdown, I can see that dispatch_resolvconf() gets called as the connection is brought down, and it calls resolvconf to update the information as it should. However, I can't see what happens after that, because rsyslog gets shut down before NetworkManager finishes shutting down the link... but I suspect what is happening is that dispatch_resolvconf() gets called *twice* at shutdown (as it also does at connection startup), and the second time nm_spawn_process() fails because the system won't allow new processes to be started, or something similar. I didn't take the time to modify NetworkManager to log its messages to a separate file (not via syslog, so it could continue logging right until the filesystem is unmounted), but as best I can tell this is the point where the /etc/resolv.conf symlink gets destroyed.

Really, I'd prefer to have NetworkManager *never* call update_resolv_conf() if the file pointed to by RESOLVCONF_PATH is executable; if it fails, then it fails, but directly writing /etc/resolv.conf is not a safe backup plan, because it will bypass any other name resolution configuration the user has set up. Can we consider just making NetworkManager use resolvconf, *or* the TARGET_SUSE method, *or* directly writing /etc/resolv.conf, but never falling back from one to the other?
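
As a rough illustration of the "pick one method and never fall back" idea, here is a shell sketch. It is only a model of the proposal, not NetworkManager code: push_dns and the RESOLVCONF_BIN and RESOLV_CONF override variables are hypothetical, and the TARGET_SUSE branch is left out.

```sh
#!/bin/sh
# push_dns, RESOLVCONF_BIN and RESOLV_CONF are hypothetical names.
# Usage: push_dns IFACE CONTENT
push_dns() {
    iface=$1
    content=$2
    bin=${RESOLVCONF_BIN:-/sbin/resolvconf}
    conf=${RESOLV_CONF:-/etc/resolv.conf}

    if [ -x "$bin" ]; then
        # resolvconf is the chosen method: return its status as-is and
        # never touch resolv.conf directly, even on failure.
        printf '%s\n' "$content" | "$bin" -a "$iface"
        return $?
    fi

    # No resolvconf at all: direct writing is the only method.
    printf '%s\n' "$content" > "$conf"
}
```

With this shape, a failing resolvconf shows up as a non-zero return code that the caller can log, and /etc/resolv.conf is left exactly as it was.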

Mackenzie Morgan (maco.m) wrote :

That actually sounds like there might be TWO separate cases where NM breaks
resolvconf and only one has been fixed.

Kevin P. Fleming (k.p.fleming) wrote :

In the case I outlined, the only way I could catch it happening was to shut down and then boot the system off of a Live USB installation, so I could mount the root filesystem and inspect it before NetworkManager was started up again (using the GRUB 'recovery console' and 'drop to a root shell' (without networking) still starts NetworkManager). Interestingly, it only happens on one of my two systems, the one that has a 32-bit Kubuntu install on it. The other is a 64-bit install, but I doubt that's relevant; there's probably something else installed that allows NetworkManager to complete its shutdown process before the system stops allowing new process creation.

Per Ångström (autark) wrote :

I'm seeing this on Lucid:
NetworkManager: <info> (eth0): writing resolv.conf to /sbin/resolvconf
resolvconf: Error: /etc/resolv.conf must be a symlink

Kevin P. Fleming (k.p.fleming) wrote :

I've been running with a patched version of nm-named-manager for a few months now; the patch just removes the ability for network-manager to directly write to /etc/resolv.conf completely, and it's been completely stable. Every single time I start up, resolvconf is still in use as expected.

Paul Smith (psmith-gnu) wrote :

I don't think it's correct for NetworkManager to write directly to the file managed by resolvconf. The entire point of resolvconf is that IT'S supposed to manage the resolv.conf file. Resolvconf is very useful if you (like I do!) have one or more VPN solutions (sometimes I have to connect to two or even three at the same time!). Each of these VPN solutions has its own set of DNS servers that we want to use to resolve hostnames in that subdomain (local hosts that are not visible in public DNS servers).

Resolvconf manages this by maintaining a separate resolv.conf file for each INTERFACE then merging them together.
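
To make the merging idea concrete, here is a deliberately simplified sketch. merge_records is a hypothetical function, and it assumes a directory containing one record file per interface; it only handles nameserver lines and ignores the ordering rules and other directives that the real resolvconf takes care of.

```sh
#!/bin/sh
# merge_records is a hypothetical, much-simplified stand-in for what
# resolvconf does. DIR holds one file per interface (eth0, tun0, ...).
# Usage: merge_records DIR
merge_records() {
    dir=$1
    # Concatenate the records in glob order, keep only nameserver lines,
    # and drop repeated addresses while preserving first-seen order.
    cat "$dir"/* 2>/dev/null | awk '$1 == "nameserver" && !seen[$2]++'
}
```

Deleting the tun0 record file and re-running the merge gives back a list without the VPN's servers, which is the property a direct overwrite of /etc/resolv.conf loses.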

If you use this in conjunction with, for example, dnsmasq as a local DNS caching proxy server, then your /etc/resolv.conf should ALWAYS use "nameserver 127.0.0.1", and dnsmasq has some scripts it installs to configure resolvconf to configure dnsmasq to find the "real" upstream DNS servers. This works pretty well... IF AND ONLY IF you let resolvconf manage the contents of resolv.conf.

So, the way it's supposed to work is that when you want to modify resolv.conf based on bringing up a new interface, and resolvconf is available, instead of doing something like:

    echo "$RESOLVCONF" > /etc/resolv.conf

you do something like:

    echo "$RESOLVCONF" | resolvconf -a <interface>

where <interface> is the name of the new interface brought up, like tun0 or whatever. Similarly, when the interface goes down rather than rewriting /etc/resolv.conf with some kind of saved backup, all you have to do is run:

    resolvconf -d <interface>

to undo the changes made for that interface.

Personally I think it's a fundamental mistake to have the manipulation of resolv.conf embedded in code inside NetworkManager. The reality is that, unlike Windows which handles this much better (!!), UNIX/Linux handling of DNS resolving is not very good and people need to customize it. NetworkManager should provide a script that does the resolv.conf manipulation, and invoke the script, rather than doing all internally in code. This allows customization for those who require it. This scripting can be done using the typical .d directory method, etc. so that more advanced behaviors can be installed without modifying NetworkManager files directly.

psylem (subnetjet) wrote :

Thanks for responding and reminding me about this bug. I've recently discovered most of my grief with respect to VPN connections and network-manager is caused by incomplete support in network-manager for vpnc. Or perhaps I should word it that the workarounds could be much more user friendly if network-manager supported more of vpnc's features.

It took me about 12 months to figure out that there are a bunch of settings you can apply using gconf-editor to customise each and every CISCO VPN connection. Details on my blog here... http://ubergeeky.com/blog/259-networkmanager-can-ignore-vpn-dhcp

Martin Pitt (pitti) on 2011-01-07
Changed in network-manager (Ubuntu):
assignee: Alexander Sack (asac) → nobody
status: In Progress → Triaged
Ron Howe (drdhowe) wrote :

natty and kubuntu, no resolvconf

Network Manager broke nameserving when I used it to change the host's ipaddress

In order to configure a couple of access points I needed to change the host's network address to 192.168.2.x for a while
and then revert to a static IP address instead of the DHCP setup configured at system installation.

When the host was back on the local network I could ping the gateway, but the browser could not find a nameserver.

I eventually discovered that /etc/resolv.conf had been updated and now contained only

"# Generated by NetworkManager"

Manually adding "nameserver <gateway-address>" to /etc/resolv.conf restored nameserving

Thomas Hood (jdthood) wrote :

Thanks to Paul Smith for explaining (#27) how resolvconf is supposed to work. For a more detailed explanation please read the README file in the resolvconf package.

When resolvconf and NM are both installed the current behavior is this:

The Ubuntu version of resolvconf immediately returns an error if /etc/resolv.conf is not a symbolic link;
otherwise resolvconf runs and then may or may not return an error depending on how things go.
If resolvconf returns an error (for any reason) then NM writes information directly to /etc/resolv.conf.

This behavior is incorrect for the following reasons.

* Resolvconf should not abort if /etc/resolv.conf is not a symlink. Resolvconf does other useful things besides writing /etc/resolvconf/run/resolv.conf. For example, it writes /var/run/dnsmasq/resolv.conf if the dnsmasq package is installed.
* NM does not distinguish between /etc/resolv.conf failing to be a symlink and resolvconf returning a non-zero exit code for some other reason. If resolvconf is just returning an error condition that was returned by a hook script then it may not be appropriate to stomp on resolv.conf.

The correct behavior is as follows.

NM runs resolvconf if it is present.
Resolvconf runs whether or not /etc/resolv.conf is a symlink.

Furthermore:

If /etc/resolv.conf is not a symlink then NM writes resolver configuration information to /etc/resolv.conf.

Consistently with the preceding description, the resolvconf program in the upcoming release of the Ubuntu resolvconf package, version 1.63ubuntu1, does not abort if /etc/resolv.conf is not a symlink.

Attached is an untested patch indicating how NM should be changed to behave according to the preceding description. Other variants of the patch are possible. With this patch NM writes /etc/resolv.conf if and only if the latter is not a symlink. Another possibility is for NM to write /etc/resolv.conf if and only if the latter is not a symlink, or is a symlink to /run/network-manager/resolv.conf or something like that.
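
The behavior described above can be modelled in a few lines of shell. This is a sketch of the decision logic only, not the attached patch: apply_dns and the RESOLVCONF_BIN and RESOLV_CONF override variables are hypothetical names introduced for illustration.

```sh
#!/bin/sh
# apply_dns, RESOLVCONF_BIN and RESOLV_CONF are hypothetical names.
# Usage: apply_dns IFACE CONTENT
apply_dns() {
    iface=$1
    content=$2
    bin=${RESOLVCONF_BIN:-/sbin/resolvconf}
    conf=${RESOLV_CONF:-/etc/resolv.conf}
    rc=0

    # Run resolvconf whenever it is present.
    if [ -x "$bin" ]; then
        printf '%s\n' "$content" | "$bin" -a "$iface" || rc=$?
    fi

    # Write the file directly if and only if it is not a symlink,
    # regardless of how resolvconf exited.
    if [ ! -L "$conf" ]; then
        printf '%s\n' "$content" > "$conf" || rc=$?
    fi

    return "$rc"
}
```

The key difference from the old behavior is that the direct write depends only on whether /etc/resolv.conf is a symlink, so a failing hook script can no longer cause the link to be replaced.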

I invite interested parties to test a pre-release version of the new Ubuntu resolvconf package which is now available in my PPA: ppa:jdthood/resolvconf (https://launchpad.net/~jdthood/+archive/resolvconf). Please let me know whether or not this package works for you!

Who am I? I am one of the maintainers of resolvconf in Debian. I would prefer to leave this work up to Ubuntu folks, but these issues have been left unresolved for years. Something has to be done.

Thomas Hood (jdthood) wrote :

This has been fixed in Precise.

Changed in network-manager (Ubuntu):
status: Triaged → Fix Released