Random Tempest test failures (SSH failure) in openvswitch jobs

Bug #1959564 reported by yatin
This bug affects 2 people
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Oleg Bondarev

Bug Description

Seen multiple similar occurrences in stable/wallaby patches, where Tempest tests fail with SSH-to-VM timeouts. Some examples:
  - https://cfaa2d1e4f6a936642aa-ae5561c9d080274a217713c4553af257.ssl.cf5.rackcdn.com/824022/2/check/neutron-tempest-plugin-scenario-openvswitch-wallaby/a7c128e/testr_results.html
  - https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_803/824022/2/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid-wallaby/803c276/testr_results.html
  - https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e19/824022/2/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid-wallaby/e19d9a7/testr_results.html
  - https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_6b0/826830/1/check/neutron-tempest-plugin-scenario-openvswitch-wallaby/6b05e5b/testr_results.html
  - https://f529caf5e8a0adc3d959-479aba3a0d5645603ea5f6db22bcd24f.ssl.cf5.rackcdn.com/826830/1/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid-wallaby/9ad9f43/testr_results.html

In some failures the metadata request failed and thus SSH failed; in others the metadata requests passed but SSH still failed, so there may be multiple issues.

Builds:- https://zuul.openstack.org/builds?job_name=neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid-wallaby&job_name=neutron-tempest-plugin-scenario-openvswitch-wallaby&project=openstack%2Fneutron&branch=stable%2Fwallaby&skip=0

Lajos Katona (lajos-katona) wrote :

Hmm, but it is really much more frequent on wallaby:
Builds with matching logs 13/100:
+----------------------------------+---------------------+----------+-----------------------------------+----------------+---------------------------------------------------------------------+
| uuid | finished | pipeline | review | branch | job |
+----------------------------------+---------------------+----------+-----------------------------------+----------------+---------------------------------------------------------------------+
| 6b05e5b8c191404c83ead94eb7fcb4cb | 2022-01-31T06:49:49 | check | https://review.opendev.org/826830 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-wallaby |
| bd7a8b63c76148ef8d8bd9ea0b1efc8e | 2022-01-29T06:26:45 | gate | https://review.opendev.org/821190 | master | neutron-ovs-tempest-multinode-full |
| 9ad9f43713704bceb5fc6fe0b9c23d4e | 2022-01-28T18:47:55 | check | https://review.opendev.org/826830 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid-wallaby |
| 31410dcda3e948828e1649e7c261f761 | 2022-01-28T19:33:39 | check | https://review.opendev.org/826103 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid-wallaby |
| 21e1db3628d345c1a3aa6ca610bfcda9 | 2022-01-28T17:08:57 | check | https://review.opendev.org/826830 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-wallaby |
| b0ed1a5eecf842ab886ef9d8f03a9f88 | 2022-01-28T12:44:18 | check | https://review.opendev.org/826103 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid-wallaby |
| 5425f9362d604dc2ac111f5ed8ac31e6 | 2022-01-28T12:04:15 | check | https://review.opendev.org/826103 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-wallaby |
| 823b2b7354494201b447fcb751526b9d | 2022-01-28T10:52:09 | gate | https://review.opendev.org/826449 | master | neutron-ovs-tempest-multinode-full |
| bd0e6d7fd6d14dc5b78c4755ae475f96 | 2022-01-28T06:58:24 | check | https://review.opendev.org/826830 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid-wallaby |
| 29a79d67b07c4f12822bc48576db6e49 | 2022-01-28T07:01:01 | check | https://review.opendev.org/826830 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-wallaby |
| 2772ef0b211b454888f8561c9e89bd5f | 2022-01-27T17:38:02 | check | https://review.opendev.org/826103 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-wallaby |
| fbbc43511800455b886268acea48d87f | 2022-01-27T17:29:14 | check | https://review.opendev.org/826103 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid-wallaby |
| 5073c00f0964452785045d32058625fc | 2022-01-27T13:56:58 | check | https://review.opendev.org/825077 | stable/wallaby | neutron-tempest-plugin-scenario-openvswitch-wallaby |
+---------------------------------...
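
The "much more frequent on wallaby" observation can be checked against the rows that are visible above. A small sketch that tallies those rows by branch; the (branch, job) pairs are copied by hand from the table, the constant names are only for brevity, and the truncated part of the output is not represented:

```python
from collections import Counter

# Job names as they appear in the visible rows of the table.
OVS = "neutron-tempest-plugin-scenario-openvswitch-wallaby"
HYBRID = "neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid-wallaby"
MULTINODE = "neutron-ovs-tempest-multinode-full"

# (branch, job) for each visible row, in table order.
rows = [
    ("stable/wallaby", OVS),
    ("master", MULTINODE),
    ("stable/wallaby", HYBRID),
    ("stable/wallaby", HYBRID),
    ("stable/wallaby", OVS),
    ("stable/wallaby", HYBRID),
    ("stable/wallaby", OVS),
    ("master", MULTINODE),
    ("stable/wallaby", HYBRID),
    ("stable/wallaby", OVS),
    ("stable/wallaby", OVS),
    ("stable/wallaby", HYBRID),
    ("stable/wallaby", OVS),
]

by_branch = Counter(branch for branch, _ in rows)
by_job = Counter(job for _, job in rows)

for branch, count in by_branch.most_common():
    print(f"{branch}: {count}")  # stable/wallaby: 11, then master: 2
```

Of the 13 visible matches, 11 are on stable/wallaby, split across the two openvswitch scenario jobs.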


Changed in neutron:
status: New → Triaged
yatin (yatinkarel) wrote :

https://review.opendev.org/q/Ib6b70114efb140cf1393b57ebc350fea4b0a2443 looks suspicious considering when the failures started in wallaby; a similar failure is seen in the stable/victoria cherry-pick, which is not merged yet.

yatin (yatinkarel) wrote :
Changed in neutron:
importance: Undecided → High
assignee: nobody → Oleg Bondarev (obondarev)
Changed in neutron:
status: Triaged → In Progress
tags: added: wallabu-backport-potential xena-backport-potential
tags: added: wallaby-backport-potential
removed: wallabu-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/827315
Committed: https://opendev.org/openstack/neutron/commit/0ddca284542aed89df4a22607a2da03f193f083c
Submitter: "Zuul (22348)"
Branch: master

commit 0ddca284542aed89df4a22607a2da03f193f083c
Author: Oleg Bondarev <email address hidden>
Date: Tue Feb 1 18:56:02 2022 +0300

    Make sure "dead vlan" ports cannot transmit packets

    https://review.opendev.org/c/openstack/neutron/+/820897 added
    a dead vlan flow that pushes the dead vlan tag onto frames
    belonging to dead ports before these ports are reassigned to
    their proper vlans. However, add_flow and delete_flows race:
    delete_flows may run before add_flow, in which case it deletes 0 flows
    without giving us a chance to detect this, since it neither throws
    an error nor returns the number of deleted flows.
    This leaves the port inaccessible forever and hence
    breaks the corresponding DHCP or router.

    The current patch suggests another approach to make sure no packets are
    leaked from newly plugged ports: setting their "vlan_mode" attribute
    to "trunk" and "trunks"=[4095] (along with assigning the dead VLAN tag).
    With this, the OVS normal pipeline will allow only packets tagged with 4095
    from such ports [1], which normally does not happen, but even if it does,
    the default rule in br-int will drop them anyway.
    Thus untagged packets from such ports will also be dropped until
    the ovs agent sets the proper VLAN tag and resets vlan_mode to the default
    ("access").

    This approach avoids the race between the dhcp/l3 and ovs agents because
    the dhcp/l3 agents no longer modify the flow table.

    This partially reverts commit 7aae31c9f9ed938760ca0be3c461826b598c7004

    [1] https://docs.openvswitch.org/en/latest/ref/ovs-actions.7/?highlight=ovs-actions#the-ovs-normal-pipeline

    Closes-Bug: #1930414
    Closes-Bug: #1959564
    Change-Id: I0391dd24224f8656a09ddb002e7dae8783ba37a4
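
The ordering problem described in the commit message can be illustrated with a toy model. This is only a sketch: FlowTable, port_is_blocked, run and the port name tap0 are hypothetical stand-ins, not the neutron or Open vSwitch API. The one property taken from the commit message is that delete_flows neither raises nor reports how many flows it removed.

```python
"""Toy model of the add_flow / delete_flows race (hypothetical names)."""


class FlowTable:
    """Minimal stand-in for a bridge flow table; not the real OVS API."""

    def __init__(self):
        self.flows = set()

    def add_flow(self, match):
        self.flows.add(match)

    def delete_flows(self, match):
        # Mirrors the behaviour described in the commit message: no error
        # and no count of deleted flows, even when nothing matched.
        self.flows.discard(match)


def port_is_blocked(table, port):
    """True while the (hypothetical) dead-vlan flow for the port exists."""
    return ("dead_vlan", port) in table.flows


def run(order):
    """Apply 'add' and 'delete' in the given order; return blocked state."""
    table = FlowTable()
    ops = {
        "add": lambda: table.add_flow(("dead_vlan", "tap0")),
        "delete": lambda: table.delete_flows(("dead_vlan", "tap0")),
    }
    for name in order:
        ops[name]()
    return port_is_blocked(table, "tap0")


if __name__ == "__main__":
    print(run(["add", "delete"]))  # intended order: False, port is usable
    print(run(["delete", "add"]))  # racy order: True, port stays blocked
```

In the racy order the delete is a silent no-op, so in this model the caller gets no signal that the flow it meant to remove was added afterwards.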

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/neutron/+/828230

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/828231

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/828230
Committed: https://opendev.org/openstack/neutron/commit/78c63d4ec6a94ba7bf9efb576850f7b38f1f8722
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 78c63d4ec6a94ba7bf9efb576850f7b38f1f8722
Author: Oleg Bondarev <email address hidden>
Date: Tue Feb 1 18:56:02 2022 +0300

    Make sure "dead vlan" ports cannot transmit packets

    (cherry picked from commit 0ddca284542aed89df4a22607a2da03f193f083c)
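
The replacement approach in the commit message relies on how the OVS normal pipeline admits packets depending on a port's vlan_mode. The following is a simplified model, assuming only the "access" and "trunk" modes and the behaviour described in the OVS documentation linked as [1]. The function name admitted_vlan and the example VLAN IDs 5 and 7 are made up for illustration, and DEAD_VLAN_TAG = 4095 is inferred from the commit message rather than imported from neutron.

```python
DEAD_VLAN_TAG = 4095  # assumption: the "4095" used in the commit message


def admitted_vlan(vlan_mode, tag, trunks, packet_vlan):
    """Return the VLAN an ingress packet is admitted into, or None if dropped.

    packet_vlan is the 802.1Q VLAN ID of the frame, or None if untagged.
    Only the "access" and "trunk" modes are modelled here.
    """
    if vlan_mode == "access":
        # Untagged (or VLAN 0) frames join the port's VLAN; frames with
        # any nonzero VLAN ID are dropped on an access port.
        if packet_vlan in (None, 0):
            return tag
        return None
    if vlan_mode == "trunk":
        # Untagged frames count as VLAN 0 on a trunk port.
        vlan = 0 if packet_vlan is None else packet_vlan
        # An empty trunks list means the port trunks all VLANs.
        if not trunks or vlan in trunks:
            return vlan
        return None
    raise ValueError(f"unsupported vlan_mode: {vlan_mode}")


# A newly plugged port as configured by the fix.
dead_port = dict(vlan_mode="trunk", tag=DEAD_VLAN_TAG, trunks=[DEAD_VLAN_TAG])
# The same port after the ovs agent has assigned a (made-up) local VLAN.
live_port = dict(vlan_mode="access", tag=5, trunks=[])

if __name__ == "__main__":
    print(admitted_vlan(**dead_port, packet_vlan=None))           # None
    print(admitted_vlan(**dead_port, packet_vlan=DEAD_VLAN_TAG))  # 4095
    print(admitted_vlan(**live_port, packet_vlan=None))           # 5
    print(admitted_vlan(**live_port, packet_vlan=7))              # None
```

In this model the only traffic a dead port can inject is already tagged 4095, which the commit message says the default rule in br-int drops.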

tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/828231
Committed: https://opendev.org/openstack/neutron/commit/9d5cea0e2bb85b3b6ea27eb71279c57c419b0102
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 9d5cea0e2bb85b3b6ea27eb71279c57c419b0102
Author: Oleg Bondarev <email address hidden>
Date: Tue Feb 1 18:56:02 2022 +0300

    Make sure "dead vlan" ports cannot transmit packets

    (cherry picked from commit 0ddca284542aed89df4a22607a2da03f193f083c)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.0.0.0rc1

This issue was fixed in the openstack/neutron 20.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.3.0

This issue was fixed in the openstack/neutron 18.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.2.0

This issue was fixed in the openstack/neutron 19.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/874658

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/874658
Committed: https://opendev.org/openstack/neutron/commit/f5dc708e1a8138aa79eff07db68ff59d7b5b6a94
Submitter: "Zuul (22348)"
Branch: master

commit f5dc708e1a8138aa79eff07db68ff59d7b5b6a94
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Fri Feb 17 10:09:54 2023 +0100

    Check port.tag is not DEAD_VLAN_TAG in ``DHCPAgentOVSTestFramework``

    Check that the port added has no tag DEAD_VLAN_TAG.

    Related-Bug: #2007992
    Related-Bug: #1959564
    Change-Id: I68760a1833d32201a63d20c8696916a8bde621a9
