dhcp-all-interfaces script continues to run after os-net-config runs

Bug #1640598 reported by Dan Sneddon
This bug affects 1 person
Affects   Status         Importance   Assigned to    Milestone
tripleo   Fix Released   High         Bob Fournier

Bug Description

There is a functional bug that happens when TripleO deploys new servers and configures networking. When the newly deployed image is booted, the dhcp-all-interfaces script from image-elements runs to contact any available DHCP servers and configure the interfaces. During deployment, os-net-config runs and configures the interfaces, but even after restarting the interfaces the dhclient continues to run. Depending on the configuration, this can cause unintended IP addresses or routes to be installed on the host.

Expected results:
After os-net-config is run, the interfaces should be configured with static routes, and the dhclient should no longer be running and configuring IPs on the interfaces.

Actual results:
[root@controller-2 ~]# ip a s dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP qlen 1000
    link/ether 52:54:00:00:93:18 brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.91/24 brd 172.16.0.255 scope global dynamic eth1
       valid_lft 2298sec preferred_lft 2298sec
    inet6 fe80::5054:ff:fe00:9318/64 scope link
       valid_lft forever preferred_lft forever

This does not match the ifcfg file, which sets BOOTPROTO=none:

cat /etc/sysconfig/network-scripts/ifcfg-eth1
# This file is autogenerated by os-net-config
DEVICE=eth1
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-isolated
BOOTPROTO=none
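
A minimal sketch of how this mismatch could be detected on a host. The function name ifcfg_conflicts_with_dhcp is hypothetical and is not part of os-net-config. It assumes that a DHCP-assigned IPv4 address is flagged "dynamic" in the `ip` output, as in the listing above.

```shell
# Hypothetical helper, not part of os-net-config.
# Succeeds (exit 0) when the ifcfg file sets BOOTPROTO=none but the supplied
# `ip addr show` text still lists an IPv4 address flagged "dynamic".
ifcfg_conflicts_with_dhcp() {
    ifcfg_file=$1
    ip_output=$2
    # Only interfaces that are configured as non-DHCP are of interest.
    grep -q '^BOOTPROTO=none$' "$ifcfg_file" || return 1
    # "inet " (with the trailing space) skips the inet6 lines.
    printf '%s\n' "$ip_output" | grep -Eq '^[[:space:]]*inet .* dynamic( |$)'
}
```

Possible usage on an affected node: ifcfg_conflicts_with_dhcp /etc/sysconfig/network-scripts/ifcfg-eth1 "$(ip a s dev eth1)" && echo "eth1 still holds a DHCP address"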

Output showing bug behavior:
##########

There is a dhcp-interface@eth1.service service running which seems to be setting up the DHCP addresses for the NICs:

systemctl status dhcp-interface@eth1.service
● dhcp-interface@eth1.service - DHCP interface eth1
   Loaded: loaded (/usr/lib/systemd/system/dhcp-interface@.service; disabled; vendor preset: disabled)
   Active: active (exited) since Fri 2016-10-07 07:50:39 UTC; 2h 10min ago
 Main PID: 803 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/system-dhcp\x2dinterface.slice/dhcp-interface@eth1.service
           └─1129 /sbin/dhclient -H localhost -1 -q -lf /var/lib/dhclient/dhclient--eth1.lease -pf /var/run/dhclient-eth1.pid eth1

Oct 07 09:59:25 controller-2.localdomain dhclient[1129]: DHCPREQUEST on eth1 to

journalctl -l -u dhcp-interface@eth1.service | grep bound
Oct 07 08:43:20 controller-2.localdomain dhclient[1129]: bound to 172.16.0.91 -- renewal in 1288 seconds.
Oct 07 09:32:47 controller-2.localdomain dhclient[1129]: bound to 172.16.0.91 -- renewal in 1282 seconds.
##########
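
To check every interface at once rather than one unit at a time, something like the following might help. It is only a sketch: the function name dhclient_interfaces is hypothetical, and it assumes the interface name is the last argument on the dhclient command line, as in the process listing above.

```shell
# Hypothetical helper: print the interfaces that still have a dhclient attached.
# Reads process command lines (for example from `ps -eo args`) on stdin and
# assumes the interface name is the final dhclient argument.
dhclient_interfaces() {
    awk '$1 ~ /(^|\/)dhclient$/ { print $NF }'
}
```

Possible usage: ps -eo args | dhclient_interfaces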

The behavior is most apparent when an interface is put on a bridge, but dhclient continues to run on the original interface. The single-nic-vlans templates will demonstrate this behavior on virt, for instance.

To reproduce the issue where an incorrect default route is installed, you need to have an external DHCP server running on the network where one of the interfaces is attached.

For example:
  Deploy with NIC templates here: http://paste.openstack.org/show/584794/
  (these templates put eth1 onto the "br-isolated" bridge, but eth1 continues to receive DHCP IP and routes on eth1 after os-net-config runs)

Workaround:
Running "sudo systemctl restart network" seems to correct this issue. We could potentially modify the os-apply-config script which runs os-net-config so that it issues this command after os-net-config finishes.
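
A rough sketch of what such a post-os-net-config step could look like. This is an illustration of the workaround only, not the fix that was eventually merged. The function names and the SYSTEMCTL override are hypothetical, and stopping the templated dhcp-interface@ unit is an assumption about how to end the leftover dhclient. Only the network restart is the workaround reported above.

```shell
# Hypothetical cleanup step to run after os-net-config (illustration only).
# SYSTEMCTL can be overridden, e.g. SYSTEMCTL=echo, for a dry run.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Map an interface name to its instance of dhcp-interface@.service.
dhcp_unit_for() {
    printf 'dhcp-interface@%s.service\n' "$1"
}

# Stop the DHCP unit for each named interface (assumed step), then restart
# networking (the workaround described in this report).
cleanup_dhcp_units() {
    for iface in "$@"; do
        "$SYSTEMCTL" stop "$(dhcp_unit_for "$iface")"
    done
    "$SYSTEMCTL" restart network
}
```

Possible usage, as root: cleanup_dhcp_units eth1. With SYSTEMCTL=echo the commands are printed instead of executed.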

Additional information:
This behavior may have first appeared using RHEL 7.3 images. It isn't known if there is something related to that OS version that causes this bug to appear.

There is a patch in review to make diskimage-builder enable only one of the network service or NetworkManager, instead of both. It isn't yet known whether running only one network management service will have an impact on this bug:

https://review.openstack.org/#/c/392170/

Dan Sneddon (dsneddon)
Changed in tripleo:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Dan Sneddon (dsneddon)
milestone: none → ongoing
Dan Sneddon (dsneddon)
Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-net-config 6.0.0.0b2

This issue was fixed in the openstack/os-net-config 6.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-net-config 5.1.0

This issue was fixed in the openstack/os-net-config 5.1.0 release.

Emilien Macchi (emilienm) wrote :

There are no currently open reviews on this bug, changing the status back to the previous state and unassigning. If there are active reviews related to this bug, please include links in comments.

Changed in tripleo:
status: In Progress → Triaged
assignee: Dan Sneddon (dsneddon) → nobody
Bob Fournier (bfournie) wrote :

This is fixed by https://review.openstack.org/#/c/398498/. Closing.

Changed in tripleo:
status: Triaged → Fix Released
assignee: nobody → Bob Fournier (bfournie)