Comment 4 for bug 926229

Jeff Lane  (bladernr) wrote :

OK, so after talking to roadmr about this, the easiest way to get around this for checkbox's sake is to remove the dhclient call from the networking/multi_nic_* jobs and insist on the user configuring all network devices prior to running checkbox. I think that's a reasonable workaround for this issue.

The root cause seems to be dhclient choking a bit: if it's run more than once on eth0, it goes into this loop, but only on eth0. For eth1, I can run dhclient to bring the device up, shut eth1 down, and use dhclient again, and each time it just spawns a new instance of dhclient for eth1.

This seems to be very easily reproducible on my server here by doing the following:

1: comment out any ethX configuration in /etc/network/interfaces
2: reboot to ensure a clean system
3: run this short script:

for x in 1 2 3; do
    sudo dhclient eth0
    ifconfig eth0
    sleep 1
    sudo ifconfig eth0 down
    ifconfig
done

What you should see happening is that on the first iteration, dhclient successfully brings eth0 up, the script displays the output of 'ifconfig eth0' and then shuts eth0 down, and the second run of ifconfig shows only the 'lo' interface active.

On the second iteration, however, the script will appear to hang. Move to a different console and run 'tail -f /var/log/syslog', and you will see the infinite loop of DHCPREQUEST, DHCPDISCOVER, and DHCPOFFER messages but never an ACK.

Move back to the first console and ctrl-c to stop dhclient, which is now hung. The script will move on to the 'ifconfig eth0' line again, and you'll see that eth0 was actually activated; dhclient just never realized that.
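If the scrolling log is hard to read, a small filter along these lines might make the missing ACK easier to spot. It is only a sketch: dhcp_summary is a made-up helper name, and it assumes the DHCP message names appear verbatim in the syslog lines.

```shell
# dhcp_summary (hypothetical helper): read log lines on stdin and print
# how many times each DHCP message type appears, one "NAME COUNT" per line.
# Assumption: message names such as DHCPDISCOVER appear verbatim in the log.
dhcp_summary() {
    grep -oE 'DHCP(DISCOVER|OFFER|REQUEST|ACK|NAK)' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}
```

It could be fed with something like 'grep dhclient /var/log/syslog | dhcp_summary'. In the hung case described here, the DHCPDISCOVER, DHCPOFFER, and DHCPREQUEST counts would be expected to keep growing while DHCPACK stays absent.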

Then on the third iteration, you'll have to ctrl-c again to kill dhclient one more time.

Now, after that is complete, do a 'ps axf |grep dhclient' and you should see only one instance:

$ ps axf |grep dhclient |grep -v grep
2811 ? Ss 0:00 dhclient eth0

NOW, run that script again, but this time use eth1 (this has to be done on a system with two ethernet devices that are connected to a working LAN).

When this runs against eth1, you'll see that the script completes all three loops successfully without dhclient being hung up at all. And after it's done, redo the ps command:
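To avoid editing the script for each device, the same loop could be wrapped in a function that takes the interface name. This is a rough sketch, not a tested tool: dhcp_cycle and the DRY_RUN switch are hypothetical names added here for illustration, and the commands inside are the same ones used in the script above.

```shell
# dhcp_cycle IFACE [COUNT] (hypothetical helper): repeat the dhclient
# up/down cycle from the reproduction steps, COUNT times (default 3).
# If DRY_RUN is set to a non-empty value, the commands are printed
# instead of being executed.
dhcp_cycle() {
    iface=$1
    count=${2:-3}
    run=${DRY_RUN:+echo}   # expands to "echo" only when DRY_RUN is set
    i=1
    while [ "$i" -le "$count" ]; do
        $run sudo dhclient "$iface"
        $run ifconfig "$iface"
        $run sleep 1
        $run sudo ifconfig "$iface" down
        $run ifconfig
        i=$((i + 1))
    done
}
```

Running 'dhcp_cycle eth0' and then 'dhcp_cycle eth1' would repeat the two cases compared in this comment; with DRY_RUN=1 set first, it only lists what would be run.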

$ ps axf |grep dhclient |grep -v grep
2811 ? Ss 0:00 dhclient eth0
3151 ? Ss 0:00 dhclient eth1
3238 ? Ss 0:00 dhclient eth1
3326 ? Ss 0:00 dhclient eth1
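
For a quicker comparison between the two devices, the instances can be tallied per interface. The sketch below uses a hypothetical helper, count_dhclient, and assumes the interface name is the last word of each dhclient command line, as it is in the ps output above.

```shell
# count_dhclient (hypothetical helper): read dhclient command lines on
# stdin (one per line) and print "IFACE COUNT" for each interface.
# Assumption: the interface name is the last word on each line.
count_dhclient() {
    awk 'NF { n[$NF]++ } END { for (i in n) print i, n[i] }' | sort
}
```

One way to feed it is 'ps -C dhclient -o args= | count_dhclient'. For the process list shown above, that would be expected to report one instance for eth0 and three for eth1.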

So it appears that there is actually a problem with dhclient.