fullstack fails locally after several runs due to shared dhclient lease file

Bug #1934646 reported by LIU Yulong
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

All fullstack test cases share a common lease file path for dhclient;
on CentOS, for instance, it is /var/lib/dhclient/dhclient.leases.
That means every fullstack case uses this one file to store its
fake VM's NIC DHCP lease information.

After the fullstack cases have been run several times, dhclient
fails to set the IP of the test fake VM's port because of the
corrupted entries in this file. The errors are:
"""
# ip netns exec test-f00c713e-97df-440a-9bd0-e88a0bc5ab38 dhclient -4 -sf /opt/stack/neutron/.tox/dsvm-fullstack/bin/fullstack-dhclient-script --no-pid -d port71fc1d
Internet Systems Consortium DHCP Client 4.2.5
Copyright 2004-2013 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Can't allocate interface portd8lease {
  interface .

This version of ISC DHCP is based on the release available
on ftp.isc.org. Features have been added and other changes
have been made to the base software release in order to make
it work better with this distribution.

Please report for this software via the CentOS Bugs Database:
    http://bugs.centos.org/

exiting.
"""

The corrupted entries look like this:

}
lease {
  interface "portb88816 {
  interface "port71fd02";
  fixed-address 20.0.0.115;
...

There is a "{" after the port name.

It looks like there is a race condition among the different test cases, which leaves this file with broken entries.
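
To check whether a local lease file is affected, a small script can flag malformed interface statements. This is only a minimal sketch: the function name and the regular expression are illustrative assumptions, not part of Neutron or dhclient, and the pattern only covers the simple quoted form shown in the excerpt above.

```python
import re

# A well-formed entry is assumed to look like:   interface "port71fd02";
WELL_FORMED = re.compile(r'^\s*interface\s+"[^"{}]+";\s*$')


def find_broken_interface_lines(lease_text):
    """Return (line_number, line) pairs for malformed interface statements."""
    broken = []
    for number, line in enumerate(lease_text.splitlines(), start=1):
        stripped = line.strip()
        # Only inspect interface statements; other lease fields are ignored.
        if stripped.startswith("interface") and not WELL_FORMED.match(line):
            broken.append((number, stripped))
    return broken


# The excerpt from this report, used as sample input.
sample = '''}
lease {
  interface "portb88816 {
  interface "port71fd02";
  fixed-address 20.0.0.115;
'''
print(find_broken_interface_lines(sample))
```

On a real system the text would be read from the shared lease file, for example /var/lib/dhclient/dhclient.leases on CentOS.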

Tags: fullstack
Changed in neutron:
status: New → In Progress
Lajos Katona (lajos-katona) wrote :
tags: added: fullstack
Bence Romsics (bence-romsics) wrote :

Hi Liu,

Is this happening in the gate too? Or do you see it only locally?

It sounds like we may want to create a tempfile and use it as the lease file here:
https://opendev.org/openstack/neutron/src/commit/bf7a6061a061bfa08eb1ca96c0ac1bc2d5e580b9/neutron/tests/fullstack/resources/machine.py#L154-L156

Do you see tests failing that use this fixture?

Bence Romsics (bence-romsics) wrote :

Reading ISC dhclient docs I found suggestive language (though no clear claim) that the leases file can be written by multiple dhclients:

https://kb.isc.org/docs/isc-dhcp-44-manual-pages-dhcpdleases

"""
The lease file is a log-structured file - whenever a lease changes, the contents of that lease are written to the end of the file.
"""

"""
In order to prevent the lease database from growing without bound, the file is rewritten from time to time. First, a temporary lease database is created and all known leases are dumped to it. Then, the old lease database is renamed DBDIR/dhcpd.leases~. Finally, the newly written lease database is moved into place.
"""

So this tells me that either dhclient is buggy or (probably more likely) we have other code editing the leases file. I know we did some lease file editing on the dnsmasq side, however I'm not aware of any editing on the dhcp client side. Do you know any code doing that?
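
The append-then-rewrite behaviour quoted above can be sketched roughly as follows. This is an illustration of the documented procedure only, not ISC's implementation; the function names are hypothetical, and it assumes a single writer per file, which is exactly the assumption the shared lease path breaks.

```python
import os
import tempfile


def append_lease(path, lease_block):
    """Log-structured update: a changed lease is appended to the end of the file."""
    with open(path, "a") as lease_file:
        lease_file.write(lease_block)


def compact_leases(path, current_leases):
    """Rewrite the database: dump known leases to a temp file, keep the old
    file as a '~' backup, then move the new file into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        for block in current_leases:
            tmp.write(block)
    if os.path.exists(path):
        os.replace(path, path + "~")
    os.replace(tmp_path, path)
```

With one process per file, neither step leaves a half-written line behind. Several uncoordinated processes appending to and rewriting the same file have no such guarantee.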

Changed in neutron:
importance: Undecided → Medium
LIU Yulong (dragon889) wrote :

Hi Bence,

If you have time, run "tox -e dsvm-fullstack" locally. If you see cases fail, check the common lease file /var/lib/dhclient/dhclient.leases; the broken entries should be in it, and you will probably hit this issue.

Yes, this is probably mainly a dhclient bug, but Neutron can apply a quick workaround.
I did not investigate the upstream gate status. I run the fullstack cases on CentOS, so dhclient may already be fixed on Ubuntu.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/799438
Committed: https://opendev.org/openstack/neutron/commit/3b46df48476fdfd5a479ad537d190474f39e395b
Submitter: "Zuul (22348)"
Branch: master

commit 3b46df48476fdfd5a479ad537d190474f39e395b
Author: LIU Yulong <email address hidden>
Date: Mon Jul 5 15:19:54 2021 +0800

    Change fullstack dhclient lease file to tmp folder

    Each cases are sharing the common lease path for dhclient,
    for instance, in CentOS it is: /var/lib/dhclient/dhclient.leases.
    That means all fullstack cases will use this file to store
    fake VM's NIC DHCP lease information.

    After run several times of fullstack cases, the dhclient will
    get failed to set the test fake VM port's IP due to the mess
    settings in this file.

    This patch sets each fake VM's NIC lease file path to the
    tmp folder with it's port id.

    This may fix some cases that cannot set the IP addr to the test
    device properly via DHCP.

    Closes-Bug: #1934646
    Change-Id: Ia87fa7c08df473acbcf1600035d99a83ed4b4375
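
The idea of the fix, one lease file per port in the tmp folder, can be sketched as below. This is a hedged illustration rather than the merged patch: the helper name, its parameters, and the lease file naming pattern are assumptions. The dhclient options -4, -sf, --no-pid and -d are taken from the command in the bug description, and -lf is dhclient's option for choosing the lease file.

```python
import os
import tempfile


def build_dhclient_cmd(port_id, device, script_path, lease_dir=None):
    """Build a dhclient command line that uses a per-port lease file."""
    lease_dir = lease_dir or tempfile.gettempdir()
    # Hypothetical naming scheme: one lease file per fake VM port.
    lease_file = os.path.join(
        lease_dir, "fullstack-dhclient-%s.leases" % port_id)
    return ["dhclient", "-4",
            "-lf", lease_file,   # private lease file instead of the shared default
            "-sf", script_path,
            "--no-pid", "-d", device]


cmd = build_dhclient_cmd("71fc1d", "port71fc1d",
                         "/path/to/fullstack-dhclient-script",
                         lease_dir="/tmp")
print(" ".join(cmd))
```

Because each port writes to its own file, concurrent test cases no longer touch the same lease database.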

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.0.0.0rc1

This issue was fixed in the openstack/neutron 19.0.0.0rc1 release candidate.
