neutron-sanity-check command fails if netdev datapath is used

Bug #1842517 reported by Deepak Tiwari
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

When ovs-dpdk is used in a containerized OpenStack deployment, restarting the neutron pods sometimes causes the neutron-sanity-check command to fail. The script tries to create several bridges with the 'system' datapath_type, even though the core neutron_ovs_agent process has already created bridges with the 'netdev' datapath_type.

steps:
------
1. Deploy ovs-dpdk in a containerized environment (for example, using openstack-helm).
2. Deploy the neutron pods.
3. The first time, the neutron-ovs-agent pods come up fine and create OVS bridges with the netdev datapath_type.
4. Restart the neutron pods repeatedly until the following issue is observed.
5. The init containers run neutron-sanity-check, which tries to create OVS bridges with the 'system' datapath_type; this fails intermittently.
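
To confirm the mismatch before restarting the pods, one can compare the datapath_type of the bridges the agent already created with the 'system' type that the sanity check requests. The following is a minimal sketch, assuming ovs-vsctl can reach the same ovsdb-server from wherever the script runs. The helper names are illustrative and not part of neutron, and the treatment of an empty value as 'system' follows the Open vSwitch default.

```python
import subprocess


def get_datapath_cmd(bridge):
    """Build the ovs-vsctl argv that reads a bridge's datapath_type column."""
    return ["ovs-vsctl", "get", "Bridge", bridge, "datapath_type"]


def normalize_datapath_type(raw):
    """Strip whitespace and quotes; an empty value is treated as 'system'."""
    value = raw.strip().strip('"')
    return value or "system"


def bridge_datapath_type(bridge):
    """Query a live bridge (needs a reachable ovsdb-server) and return its type."""
    out = subprocess.check_output(get_datapath_cmd(bridge), text=True)
    return normalize_datapath_type(out)
```

For example, calling bridge_datapath_type("br-phy-bond0") on an affected node would be expected to report netdev, while the failing AddBridgeCommand in the logs below requests system.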

Logs:
------
+ OVS_SOCKET=/run/openvswitch/db.sock
+ chown neutron: /run/openvswitch/db.sock
+ DPDK_CONFIG_FILE=/tmp/dpdk.conf
+ DPDK_CONFIG=
+ '[' '!' -f /tmp/dpdk.conf ']'
++ cat /tmp/dpdk.conf
+ DPDK_CONFIG='{"bonds":[{"bridge":"br-phy-bond0","migrate_ip":false,"mtu":9000,"n_rxq":4,"n_rxq_size":1024,"n_txq_size":1024,"name":"dpdkbond0","nics":[{"name":"dpdk_b0s0","pci_id":"0000:5e:00.0","vf_index":0},{"name":"dpdk_b0s1","pci_id":"0000:87:00.1","vf_index":0}],"ovs_options":"bond_mode=active-backup"}],"bridges":[{"name":"br-phy-bond0"}],"driver":"vfio-pci","enabled":true,"nics":[]}'
+ neutron-sanity-check --version
+ timeout 3m neutron-sanity-check --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/openvswitch_agent.ini --ovsdb_native --nokeepalived_ipv6_support
Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2019-08-31 04:07:15.696 1161 INFO neutron.common.config [-] Logging enabled!
2019-08-31 04:07:15.696 1161 INFO neutron.common.config [-] /var/lib/openstack/bin/neutron-sanity-check version 10.0.8.dev105
2019-08-31 04:07:16.572 1161 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connecting...
2019-08-31 04:07:16.572 1161 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connected
2019-08-31 04:07:26.612 1161 CRITICAL neutron [-] TimeoutException: Commands [AddBridgeCommand(datapath_type=system, may_exist=True, name=patchtest-4e35d), DbAddCommand(column=protocols, record=patchtest-4e35d, values=('OpenFlow10',), table=Bridge), DbSetCommand(table=Bridge, col_values=(('other_config', {'mac-table-size': '50000'}),), record=patchtest-4e35d)] exceeded timeout 10 seconds
2019-08-31 04:07:26.612 1161 ERROR neutron Traceback (most recent call last):
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/bin/neutron-sanity-check", line 10, in <module>
2019-08-31 04:07:26.612 1161 ERROR neutron sys.exit(main())
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/cmd/sanity_check.py", line 394, in main
2019-08-31 04:07:26.612 1161 ERROR neutron return 0 if all_tests_passed() else 1
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/cmd/sanity_check.py", line 381, in all_tests_passed
2019-08-31 04:07:26.612 1161 ERROR neutron return all(opt.callback() for opt in OPTS if cfg.CONF.get(opt.name))
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/cmd/sanity_check.py", line 381, in <genexpr>
2019-08-31 04:07:26.612 1161 ERROR neutron return all(opt.callback() for opt in OPTS if cfg.CONF.get(opt.name))
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/cmd/sanity_check.py", line 79, in check_ovs_patch
2019-08-31 04:07:26.612 1161 ERROR neutron result = checks.patch_supported()
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/cmd/sanity/checks.py", line 74, in patch_supported
2019-08-31 04:07:26.612 1161 ERROR neutron with ovs_lib.OVSBridge(name) as br:
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/agent/common/ovs_lib.py", line 692, in __enter__
2019-08-31 04:07:26.612 1161 ERROR neutron self.create()
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/agent/common/ovs_lib.py", line 258, in create
2019-08-31 04:07:26.612 1161 ERROR neutron FAILMODE_SECURE))
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/agent/ovsdb/api.py", line 79, in __exit__
2019-08-31 04:07:26.612 1161 ERROR neutron self.result = self.commit()
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_idl.py", line 73, in commit
2019-08-31 04:07:26.612 1161 ERROR neutron 'timeout': self.timeout})
2019-08-31 04:07:26.612 1161 ERROR neutron TimeoutException: Commands [AddBridgeCommand(datapath_type=system, may_exist=True, name=patchtest-4e35d), DbAddCommand(column=protocols, record=patchtest-4e35d, values=('OpenFlow10',), table=Bridge), DbSetCommand(table=Bridge, col_values=(('other_config', {'mac-table-size': '50000'}),), record=patchtest-4e35d)] exceeded timeout 10 seconds

Changed in neutron:
assignee: nobody → Deepak Tiwari (deepak.tiwari)
Changed in neutron:
status: New → In Progress
Nate Johnston (nate-johnston) wrote :

Marking this as critical, since it would be impactful for operators who restart neutron and find that it doesn't come back.

Changed in neutron:
importance: Undecided → Critical
Slawek Kaplonski (slaweq) wrote : auto-abandon-script

This bug has had a related patch abandoned and has been automatically un-assigned due to inactivity. Please re-assign yourself if you are continuing work or adjust the state as appropriate if it is no longer valid.

Changed in neutron:
assignee: Deepak Tiwari (deepak.tiwari) → nobody
status: In Progress → New
tags: added: timeout-abandon
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Slawek Kaplonski (<email address hidden>) on branch: master
Review: https://review.opendev.org/679808
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Changed in neutron:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/679808
Committed: https://opendev.org/openstack/neutron/commit/02030f037ad754b9b8ba3f3a657022d66c32c71e
Submitter: "Zuul (22348)"
Branch: master

commit 02030f037ad754b9b8ba3f3a657022d66c32c71e
Author: Deepak Tiwari <email address hidden>
Date: Tue Sep 3 10:18:28 2019 -0500

    ovs-dpdk support in neutron-sanity-check

    While creating bridges, pass the optional argument 'datapath_type'.
    This parameter is read from openvswitch.ini conf file.

    Closes-Bug: #1842517

    Change-Id: I05f0484636e4da6290c750a1eabd5f9d09588008
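
The idea in the commit message can be sketched roughly as follows. This is not the merged patch: FakeBridge is a hypothetical stand-in for neutron's ovs_lib.OVSBridge, the function names are made up for illustration, and the sketch assumes the option is named datapath_type, lives in the [ovs] section of the agent's ini file, and defaults to 'system'.

```python
import configparser

# Sample agent configuration; a real deployment would read the ini file
# passed to neutron-sanity-check via --config-file.
SAMPLE_INI = """
[ovs]
datapath_type = netdev
"""


def read_datapath_type(ini_text, default="system"):
    """Return [ovs]/datapath_type from ini text, or the default if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get("ovs", "datapath_type", fallback=default)


class FakeBridge:
    """Hypothetical stand-in for ovs_lib.OVSBridge; only records arguments."""

    def __init__(self, name, datapath_type="system"):
        self.name = name
        self.datapath_type = datapath_type


def make_test_bridge(name, ini_text):
    """Create the temporary test bridge with the configured datapath_type."""
    return FakeBridge(name, datapath_type=read_datapath_type(ini_text))
```

With the configured value passed through, a temporary bridge such as patchtest-4e35d would be requested with the netdev datapath_type instead of the hard-coded system default.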

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.0.0.0rc1

This issue was fixed in the openstack/neutron 22.0.0.0rc1 release candidate.
