Comment 1 for bug 1805857

Robie Basak (racb) wrote :

I've done some investigation and my best educated guess right now is that there is a race somewhere, most likely in the n-m dep8 tests but possibly in n-m itself, which is being triggered by an unrelated timing change in the newer dnsmasq and is producing some unwanted state that causes the dep8 test teardown to fail.

I reproduced by:

Creating a VM running Disco and dist-upgrading it and rebooting (without -proposed enabled).

Pulling the network-manager source and installing the test dependencies listed against "nm" in debian/tests/control

Running "sudo ./nm" from debian/tests/ by hand.

This passed, but a subsequent upgrade of dnsmasq-base (only) from disco-proposed reproduced the failure.

I isolated by:

Modifying the dep8 test to allow isolation (https://code.launchpad.net/~racb/network-manager/+git/network-manager/+merge/359837)

Running "sudo python3 -m unittest nm.ColdplugWifi.test_open_b_ip6_raonly_no_pe" from debian/tests/ by hand.

I found that after a test failure, a reboot of the VM was necessary before the tests would pass again. Swapping between the release pocket and proposed pocket versions of dnsmasq-base mostly caused the failure to come and go as expected. However, I believe that on one occasion I saw the test pass against dnsmasq-base from proposed, which is why I think it's a race condition.

To get to the actual failure reason, I applied the following patch:

--- a/debian/tests/nm.py
+++ b/debian/tests/nm.py
@@ -532,7 +538,6 @@ wpa_passphrase=12345678
     # connections and such); as it is very brittle and hard to track down
     # all remaining references to any NM* object after a test, we rather
     # run each test in a separate subprocess
-    @network_test_base.run_in_subprocess
     def do_test(self, hostapd_conf, ipv6_mode, expected_max_bitrate,
                 secret=None, ip6_privacy=None):
         '''Actual test code, parameterized for the particular test case'''

This gave me the following output:

======================================================================
FAIL: test_open_b_ip6_raonly_no_pe (nm.ColdplugWifi)
Open network, 802.11b, IPv6 with only RA, PE disabled
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/network-manager/debian/tests/nm.py", line 165, in shutdown_connections
    self.assert_iface_down(self.dev_e_client)
  File "/home/ubuntu/network-manager/debian/tests/nm.py", line 195, in assert_iface_down
    self.assertIn('state DOWN', out)
AssertionError: 'state DOWN' not found in '4: eth42@veth42: <NO-CARRIER,BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000\n link/ether 1e:64:0d:9f:da:2e brd ff:ff:ff:ff:ff:ff\n'

----------------------------------------------------------------------
Ran 1 test in 35.639s

FAILED (failures=1)
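
To make the assertion failure easier to read, here is a minimal sketch that picks apart the `ip` output quoted in the traceback. The helper names `operstate` and `admin_flags` are hypothetical and not part of nm.py; the input string is copied from the failure above. It suggests that the interface is reporting state LOWERLAYERDOWN with the UP flag still set, rather than the plain 'state DOWN' that the teardown expects.

```python
import re

# Output copied from the AssertionError above.
IP_OUTPUT = (
    "4: eth42@veth42: <NO-CARRIER,BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 "
    "qdisc noqueue state LOWERLAYERDOWN group default qlen 1000\n"
    " link/ether 1e:64:0d:9f:da:2e brd ff:ff:ff:ff:ff:ff\n"
)


def operstate(ip_output):
    """Return the word following 'state' in the ip output, or None."""
    match = re.search(r"\bstate (\S+)", ip_output)
    return match.group(1) if match else None


def admin_flags(ip_output):
    """Return the set of flags listed between '<' and '>'."""
    match = re.search(r"<([^>]*)>", ip_output)
    return set(match.group(1).split(",")) if match else set()


print(operstate(IP_OUTPUT))             # state reported by the kernel
print("UP" in admin_flags(IP_OUTPUT))   # is the UP flag still set?
print("state DOWN" in IP_OUTPUT)        # the check the test performs
```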

The following didn't fix the problem:

--- a/debian/tests/nm.py
+++ b/debian/tests/nm.py
@@ -187,12 +187,18 @@ class NetworkManagerTest(network_test_base.NetworkTestBase):
         else:
             self.fail(message or 'timed out waiting for ' + str(condition))

+    @staticmethod
+    def iface_is_down(iface):
+        out = subprocess.check_output(['ip', 'a', 'show', 'dev', iface],
+                                      universal_newlines=True)
+        return 'state DOWN' in out
+
     def assert_iface_down(self, iface):
         '''Assert that client interface is down'''

+        self.assertEventually(lambda: self.iface_is_down(iface))
         out = subprocess.check_output(['ip', 'a', 'show', 'dev', iface],
                                       universal_newlines=True)
-        self.assertIn('state DOWN', out)
         self.assertNotIn('inet 192', out)
         self.assertNotIn('inet6 2600', out)

Since waiting for the interface to reach 'state DOWN' doesn't help, I think the race is causing some change prior to this point.
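
The reasoning here can be illustrated with a standalone polling sketch. This is not the assertEventually implementation from nm.py; `wait_until` and the simulated interfaces are hypothetical names used only for illustration. Polling helps when the state is merely late in arriving, but it cannot help if the interface is stuck in a different state, which is consistent with the patch above making no difference.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll condition() until it returns true or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def make_iface(states):
    """Simulate an interface that steps through states, then stays put."""
    remaining = list(states)

    def is_down():
        # Consume states one at a time; keep reporting the last one forever.
        state = remaining.pop(0) if len(remaining) > 1 else remaining[0]
        return state == "DOWN"

    return is_down


# Slow teardown: the interface eventually reaches DOWN, so polling succeeds.
slow = make_iface(["UP", "LOWERLAYERDOWN", "DOWN"])
print(wait_until(slow, timeout=1.0, interval=0.01))

# Stuck interface: it never reaches DOWN, so polling only delays the failure.
stuck = make_iface(["UP", "LOWERLAYERDOWN"])
print(wait_until(stuck, timeout=0.1, interval=0.01))
```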

I can't rule out a dnsmasq regression for certain, but an issue in n-m or its tests seems more likely to me. I don't think it would be useful for me to learn n-m internals with my server team hat on in order to fix this, since the server team doesn't get much involved in n-m, so I'm leaving this to the desktop team for now. Please let me know, though, if you disagree with my assessment or if it starts looking like a dnsmasq regression, and I'll be happy to look again.