neutron-sanity-check command fails if netdev datapath is used

Bug #1842517 reported by Deepak Tiwari on 2019-09-03
This bug affects 1 person
Affects: neutron
Status: In Progress
Importance: Critical
Assigned to: Deepak Tiwari

Bug Description

When ovs-dpdk is used in a containerized OpenStack deployment, restarting the neutron pods sometimes causes the neutron-sanity-check command to fail. The script tries to create several bridges with the 'system' datapath_type, even though the core neutron_ovs_agent process has already created bridges with the 'netdev' datapath_type.

Steps:
------
1. Deploy ovs-dpdk in a containerized environment (for example, using openstack-helm).
2. Deploy the neutron pods.
3. The first time, the neutron-ovs-agent pods come up fine and create OVS bridges with the netdev datapath_type.
4. Restart the neutron pods repeatedly until the following issue is observed.
5. The init containers run neutron-sanity-check, which tries to create OVS bridges with the 'system' datapath_type and fails intermittently.
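
The mismatch in step 5 can be illustrated with a minimal sketch. It assumes the agent's datapath is configured through the `datapath_type` option in the `[ovs]` section of openvswitch_agent.ini. The helper names `read_datapath_type` and `build_add_bridge_cmd` are hypothetical and not part of neutron, and the sketch only builds an `ovs-vsctl` command line without running it.

```python
import configparser

# Assumption: 'system' is used when the option is not set.
DEFAULT_DATAPATH_TYPE = "system"


def read_datapath_type(ini_text):
    """Return [ovs] datapath_type from agent config text, or the default."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get("ovs", "datapath_type", fallback=DEFAULT_DATAPATH_TYPE)


def build_add_bridge_cmd(name, datapath_type):
    """Build (but do not run) an ovs-vsctl command that creates a bridge.

    Hypothetical helper: it passes the datapath type explicitly instead of
    relying on the 'system' default.
    """
    return [
        "ovs-vsctl", "--may-exist", "add-br", name,
        "--", "set", "Bridge", name, "datapath_type=%s" % datapath_type,
    ]


if __name__ == "__main__":
    ini = "[ovs]\ndatapath_type = netdev\n"
    print(" ".join(build_add_bridge_cmd("patchtest-example",
                                        read_datapath_type(ini))))
```

With a configuration that sets `datapath_type = netdev`, a test bridge built this way would use the same datapath type as the bridges the agent already created, rather than 'system'.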

Logs:
------
+ OVS_SOCKET=/run/openvswitch/db.sock
+ chown neutron: /run/openvswitch/db.sock
+ DPDK_CONFIG_FILE=/tmp/dpdk.conf
+ DPDK_CONFIG=
+ '[' '!' -f /tmp/dpdk.conf ']'
++ cat /tmp/dpdk.conf
+ DPDK_CONFIG='{"bonds":[{"bridge":"br-phy-bond0","migrate_ip":false,"mtu":9000,"n_rxq":4,"n_rxq_size":1024,"n_txq_size":1024,"name":"dpdkbond0","nics":[{"name":"dpdk_b0s0","pci_id":"0000:5e:00.0","vf_index":0},{"name":"dpdk_b0s1","pci_id":"0000:87:00.1","vf_index":0}],"ovs_options":"bond_mode=active-backup"}],"bridges":[{"name":"br-phy-bond0"}],"driver":"vfio-pci","enabled":true,"nics":[]}'
+ neutron-sanity-check --version
+ timeout 3m neutron-sanity-check --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/openvswitch_agent.ini --ovsdb_native --nokeepalived_ipv6_support
Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2019-08-31 04:07:15.696 1161 INFO neutron.common.config [-] Logging enabled!
2019-08-31 04:07:15.696 1161 INFO neutron.common.config [-] /var/lib/openstack/bin/neutron-sanity-check version 10.0.8.dev105
2019-08-31 04:07:16.572 1161 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connecting...
2019-08-31 04:07:16.572 1161 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connected
2019-08-31 04:07:26.612 1161 CRITICAL neutron [-] TimeoutException: Commands [AddBridgeCommand(datapath_type=system, may_exist=True, name=patchtest-4e35d), DbAddCommand(column=protocols, record=patchtest-4e35d, values=('OpenFlow10',), table=Bridge), DbSetCommand(table=Bridge, col_values=(('other_config', {'mac-table-size': '50000'}),), record=patchtest-4e35d)] exceeded timeout 10 seconds
2019-08-31 04:07:26.612 1161 ERROR neutron Traceback (most recent call last):
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/bin/neutron-sanity-check", line 10, in <module>
2019-08-31 04:07:26.612 1161 ERROR neutron sys.exit(main())
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/cmd/sanity_check.py", line 394, in main
2019-08-31 04:07:26.612 1161 ERROR neutron return 0 if all_tests_passed() else 1
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/cmd/sanity_check.py", line 381, in all_tests_passed
2019-08-31 04:07:26.612 1161 ERROR neutron return all(opt.callback() for opt in OPTS if cfg.CONF.get(opt.name))
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/cmd/sanity_check.py", line 381, in <genexpr>
2019-08-31 04:07:26.612 1161 ERROR neutron return all(opt.callback() for opt in OPTS if cfg.CONF.get(opt.name))
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/cmd/sanity_check.py", line 79, in check_ovs_patch
2019-08-31 04:07:26.612 1161 ERROR neutron result = checks.patch_supported()
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/cmd/sanity/checks.py", line 74, in patch_supported
2019-08-31 04:07:26.612 1161 ERROR neutron with ovs_lib.OVSBridge(name) as br:
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/agent/common/ovs_lib.py", line 692, in __enter__
2019-08-31 04:07:26.612 1161 ERROR neutron self.create()
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/agent/common/ovs_lib.py", line 258, in create
2019-08-31 04:07:26.612 1161 ERROR neutron FAILMODE_SECURE))
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/agent/ovsdb/api.py", line 79, in __exit__
2019-08-31 04:07:26.612 1161 ERROR neutron self.result = self.commit()
2019-08-31 04:07:26.612 1161 ERROR neutron File "/var/lib/openstack/local/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_idl.py", line 73, in commit
2019-08-31 04:07:26.612 1161 ERROR neutron 'timeout': self.timeout})
2019-08-31 04:07:26.612 1161 ERROR neutron TimeoutException: Commands [AddBridgeCommand(datapath_type=system, may_exist=True, name=patchtest-4e35d), DbAddCommand(column=protocols, record=patchtest-4e35d, values=('OpenFlow10',), table=Bridge), DbSetCommand(table=Bridge, col_values=(('other_config', {'mac-table-size': '50000'}),), record=patchtest-4e35d)] exceeded timeout 10 seconds
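
When triaging logs like the one above, the datapath type that the sanity check requested can be read from the `AddBridgeCommand(...)` text. The following is a minimal sketch, assuming the message keeps the `key=value` layout seen in this log; `parse_add_bridge` is a hypothetical helper, not a neutron function.

```python
import re

# Matches the argument list of the first AddBridgeCommand(...) in a line.
ADD_BRIDGE_RE = re.compile(r"AddBridgeCommand\(([^)]*)\)")


def parse_add_bridge(log_line):
    """Return the AddBridgeCommand arguments as a dict, or None if absent."""
    match = ADD_BRIDGE_RE.search(log_line)
    if match is None:
        return None
    args = {}
    for item in match.group(1).split(","):
        if "=" in item:
            key, value = item.split("=", 1)
            args[key.strip()] = value.strip()
    return args


if __name__ == "__main__":
    line = ("TimeoutException: Commands [AddBridgeCommand("
            "datapath_type=system, may_exist=True, name=patchtest-4e35d)] "
            "exceeded timeout 10 seconds")
    print(parse_add_bridge(line))
```

For the log line in this report, the parsed `datapath_type` value is `system`, which is the mismatch described above.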

Changed in neutron:
assignee: nobody → Deepak Tiwari (deepak.tiwari)
Changed in neutron:
status: New → In Progress
Nate Johnston (nate-johnston) wrote :

Marking this as critical as it would be impactful for operators who restart neutron and it doesn't come back.

Changed in neutron:
importance: Undecided → Critical