network-manager dep8 failure blocks dnsmasq proposed migration

Bug #1805857 reported by Robie Basak on 2018-11-29
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
dnsmasq (Ubuntu)
Status tracked in Disco
Disco
Undecided
Unassigned
network-manager (Ubuntu)
Status tracked in Disco
Disco
High
Iain Lane

Bug Description

dnsmasq 2.80-1 is blocked from migrating to the release pocket because of, amongst other things, a network-manager dep8 regression.

Robie Basak (racb) wrote :
Download full text (4.2 KiB)

I've done some investigation and my best educated guess right now is that there race somewhere, most like in the n-m dep8 tests but possibly in n-m, which is being triggered by an unrelated timing change in the newer dnsmasq and is causing some unwanted result that results in the dep8 test teardown to fail.

I reproduced by:

Creating a VM running Disco and dist-upgrading it and rebooting (without -proposed enabled).

Pulling the network-manager source and installing the test dependencies listed against "nm" in debian/tests/control

Running "sudo ./nm" from debian/tests/ by hand.

This passed, but a subsequent upgrade of dnsmasq-base (only) from disco-proposed reproduced the failure.

I isolated by:

Modifying the dep8 test to allow isolation (https://code.launchpad.net/~racb/network-manager/+git/network-manager/+merge/359837)

Running "sudo python3 -m unittest nm.ColdplugWifi.test_open_b_ip6_raonly_no_pe" from debian/tests/ by hand.

I found that after a test failure a reboot of the VM was necessary to cause the tests to work again. Swapping between the release pocket and proposed pocket version of dnsmasq-base mostly caused the failure to come and go as expected. However I believe that one time I saw the test pass against dnsmasq-base from proposed, which is why I think it's a race condition.

To get to the actual failure reason, I applied the follow patch:

--- a/debian/tests/nm.py
+++ b/debian/tests/nm.py
@@ -532,7 +538,6 @@ wpa_passphrase=12345678
     # connections and such); as it is very brittle and hard to track down
     # all remaining references to any NM* object after a test, we rather
     # run each test in a separate subprocess
- @network_test_base.run_in_subprocess
     def do_test(self, hostapd_conf, ipv6_mode, expected_max_bitrate,
                 secret=None, ip6_privacy=None):
         '''Actual test code, parameterized for the particular test case'''

This gave me the following output:

======================================================================
FAIL: test_open_b_ip6_raonly_no_pe (nm.ColdplugWifi)
Open network, 802.11b, IPv6 with only RA, PE disabled
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/network-manager/debian/tests/nm.py", line 165, in shutdown_connections
    self.assert_iface_down(self.dev_e_client)
  File "/home/ubuntu/network-manager/debian/tests/nm.py", line 195, in assert_iface_down
    self.assertIn('state DOWN', out)
AssertionError: 'state DOWN' not found in '4: eth42@veth42: <NO-CARRIER,BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000\n link/ether 1e:64:0d:9f:da:2e brd ff:ff:ff:ff:ff:ff\n'

----------------------------------------------------------------------
Ran 1 test in 35.639s

FAILED (failures=1)

The following didn't fix the problem:

--- a/debian/tests/nm.py
+++ b/debian/tests/nm.py
@@ -187,12 +187,18 @@ class NetworkManagerTest(network_test_base.NetworkTestBase):
         else:
             self.fail(message or 'timed out waiting for ' + str(condition))

+ @staticmethod
+ def iface_is_down(iface):
+ out = subprocess.check_output(['ip'...

Read more...

Sebastien Bacher (seb128) wrote :

Thanks Robie!

Changed in network-manager (Ubuntu):
importance: Undecided → High
tags: added: rls-dd-incoming
Will Cooke (willcooke) on 2018-12-04
Changed in network-manager (Ubuntu Disco):
assignee: nobody → Iain Lane (laney)
Sebastien Bacher (seb128) wrote :

IRC comment from upstream on #nm in case it's useful

'not that `nmcli device disconnect $DEV` and `nmcli connection down $PROFILE` do not set the interface down (like `ip link set $IF down`). Instead, the ~logically~ disconnect the interface, which essentially means that `nmcli device` reports that the device is disconnected and that there is no IP configuration on the device.

asserting that the interface is down, may just check the wrong thing.'

Iain Lane (laney) wrote :

After looking for a while, I think that the error we were managing to reproduce (LOWERLAYERDOWN) was probably a red herring, and we were reproducing some other (valid) bug. The original cause of the testsuite failure still exists.

Not entirely sure why but this only reproduces for me 'properly' when I use the real autopkgtest cloud. It's quite reliable there, so I ran a bisect and it resulted in this mail:

http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2018q4/012709.html

I freely admit my relative ignorance here, so if anyone can help please feel free to chip in.

Iain Lane (laney) wrote :

I did some more poking this morning and followed up to the thread. A change in 2.80 has made this start happening. I'm hoping that someone upstream will understand what's going on enough to propose a fix.

Jeremy Bicha (jbicha) on 2019-01-08
tags: added: disco
removed: rls-dd-incoming
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package network-manager - 1.12.6-0ubuntu3

---------------
network-manager (1.12.6-0ubuntu3) disco; urgency=medium

  * debian/tests/nm.py: Make assert_iface_down() not check the interface's
    state. We call nmclient.deactivate_connection() to terminate connections
    that the testsuite sets up, and according to upstream this is not
    guaranteed to do anything in particular to the link state. It seems that
    dnsmasq 2.80 somehow alters the previous assumption that it would be
    'state DOWN', so the implementation detail we were checking previously no
    longer holds. The testsuite does still check that the IPs are removed from
    the interface, which is logically what we want anyway. (LP: #1805857)
  * debian/control: Fix Vcs-Git

 -- Iain Lane <email address hidden> Wed, 19 Dec 2018 13:12:58 +0000

Changed in network-manager (Ubuntu Disco):
status: Triaged → Fix Released
Iain Lane (laney) wrote :

Fix Released for dnsmasq (because of the revert), but we should still keep an eye on the upstream thread.

Changed in dnsmasq (Ubuntu Disco):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers