linuxbridge/ovs jobs fail randomly as test VMs do not get an IP from the DHCP agent

Bug #2073251 reported by yatin
Affects: neutron
Status: Triaged
Importance: Critical
Assigned to: yatin
Milestone: (none)

Bug Description

The test fails with:
Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 84, in wrapper
    return func(*func_args, **func_kwargs)
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/neutron_tempest_plugin/scenario/test_dhcp.py", line 89, in test_extra_dhcp_opts
    vm_resolv_conf = ssh_client.exec_command(
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/tenacity/__init__.py", line 330, in wrapped_f
    return self(f, *args, **kw)
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/tenacity/__init__.py", line 467, in __call__
    do = self.iter(retry_state=retry_state)
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/tenacity/__init__.py", line 368, in iter
    result = action(retry_state)
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/tenacity/__init__.py", line 390, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/tenacity/__init__.py", line 470, in __call__
    result = fn(*args, **kwargs)
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/neutron_tempest_plugin/common/ssh.py", line 172, in exec_command
    return super(Client, self).exec_command(cmd=cmd, encoding=encoding)
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 187, in exec_command
    ssh = self._get_ssh_connection()
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 155, in _get_ssh_connection
    raise exceptions.SSHTimeout(host=self.host,
tempest.lib.exceptions.SSHTimeout: Connection to the 172.24.5.36 via SSH timed out.
User: cirros, Password: None

The guest console log shows eth0 with only a link-local IPv4 address, which suggests no DHCP lease was obtained:
### ifconfig -a
eth0 Link encap:Ethernet HWaddr FA:16:3E:24:0A:F6
          inet addr:169.254.208.255 Bcast:169.254.255.255 Mask:255.255.0.0
          inet6 addr: fe80::f816:3eff:fe24:af6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1380 Metric:1
          RX packets:16 errors:0 dropped:0 overruns:0 frame:0
          TX packets:63 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1276 (1.2 KiB) TX bytes:4194 (4.0 KiB)
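
A minimal sketch for flagging this symptom in a saved console log, assuming the busybox-style `inet addr:` format shown above. The helper name `link_local_addrs` is hypothetical and is not part of tempest or neutron; it only uses the Python standard library.

```python
import ipaddress
import re

# Matches the "inet addr:<IPv4>" field as printed by the ifconfig output above.
INET_ADDR_RE = re.compile(r"inet addr:(\d+\.\d+\.\d+\.\d+)")


def link_local_addrs(console_log: str) -> list[str]:
    """Return the IPv4 interface addresses in the log that are link-local (169.254.0.0/16)."""
    found = INET_ADDR_RE.findall(console_log)
    return [addr for addr in found if ipaddress.ip_address(addr).is_link_local]


sample = """eth0 Link encap:Ethernet HWaddr FA:16:3E:24:0A:F6
          inet addr:169.254.208.255 Bcast:169.254.255.255 Mask:255.255.0.0
"""
print(link_local_addrs(sample))
```

A non-empty result for a test VM's console log would be consistent with the failure described here, though it does not by itself show why the DHCP request went unanswered.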

Example failures:
- https://c0aee8855dbd0037f598-9d06690275593a686ed77270ace0c4ac.ssl.cf1.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-linuxbridge-tempest-plugin-nftables/575af76/testr_results.html
- https://zuul.opendev.org/t/openstack/build/cc4b44f0b09746e6b43f159b9f24c686
- https://zuul.opendev.org/t/openstack/build/791bb2725c9f4aa69f537f033b3c3ba2

Linux bridge jobs are failing more frequently:
- https://zuul.openstack.org/builds?job_name=neutron-tempest-plugin-linuxbridge&job_name=neutron-linuxbridge-tempest-plugin-nftables&project=openstack%2Fneutron&branch=master&skip=0

This seems to be triggered by https://review.opendev.org/c/openstack/neutron/+/923625/1/neutron/agent/dhcp/agent.py#371

yatin (yatinkarel)
Changed in neutron:
status: New → Triaged
importance: Undecided → Critical
tags: added: gate-failure linuxbridge ovs
Changed in neutron:
assignee: nobody → yatin (yatinkarel)
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/924213

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/924213
Committed: https://opendev.org/openstack/neutron/commit/eb09fe5c924724bbe173a01d7a9170a830c58709
Submitter: "Zuul (22348)"
Branch: master

commit eb09fe5c924724bbe173a01d7a9170a830c58709
Author: yatin <email address hidden>
Date: Tue Jul 16 11:10:31 2024 +0000

    Revert "[DHCP] Lock the execution of ``_dhcp_ready_ports_loop``"

    This reverts commit 928f41f1feac6511b4bb67e6211b4f06a9b7ca56.

    Reason for revert: Jobs failing randomly as mentioned in lp#2073251

    Change-Id: Ib4ea8a31f785cd52407c1aa241501046e5e295e2
    Related-Bug: #2070376
    Related-Bug: #2073251
