Traceback in unlock host when configuring pci-passthrough interfaces and sriovdp label enabled

Bug #1856587 reported by Wendy Mitchell
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Thomas Gao

Bug Description

Unlocking a host fails with a traceback after the sriovdp label is added.

Severity
--------
Standard

Steps to Reproduce
------------------
1. Configure both controllers with pci-passthrough interfaces.
2. Unlock the controllers and confirm they are unlocked, enabled and available before adding the label.
   Confirm that system application-list reports applied / completed.
3. Lock controller-1.
4. Assign the sriovdp label:
   $ system host-label-assign controller-1 sriovdp=enabled
5. Attempt to unlock controller-1.

Expected Behavior
------------------
The host with the sriovdp label should unlock successfully.

Actual Behavior
----------------
Attempting to unlock the node with the label sriovdp=enabled assigned raises a traceback, and the unlock fails.

[sysadmin@controller-0 log(keystone_admin)]$ system host-label-list controller-1; date
+--------------+-----------+-------------+
| hostname     | label key | label value |
+--------------+-----------+-------------+
| controller-1 | sriovdp   | enabled     |
+--------------+-----------+-------------+

Branch/Pull Time/Commit
-----------------------
2019-12-10_20-00-00

Last Pass
---------
This config was not tested previously

Timestamp/Logs
--------------
see inline

Test Activity
-------------
Regression
----------------------------------
Fri Dec 13 00:58:05 UTC 2019
[sysadmin@controller-0 log(keystone_admin)]$ system application-list
+---------------------+---------+-------------------------------+---------------+---------+-----------+
| application         | version | manifest name                 | manifest file | status  | progress  |
+---------------------+---------+-------------------------------+---------------+---------+-----------+
| platform-integ-apps | 1.0-10  | platform-integration-manifest | manifest.yaml | applied | completed |
+---------------------+---------+-------------------------------+---------------+---------+-----------+
[sysadmin@controller-0 log(keystone_admin)]$ system host-unlock controller-1

Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/rpc/amqp.py", line 437, in _process_data
    **args)
  File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/rpc/dispatcher.py", line 172, in dispatch
    result = getattr(proxyobj, method)(ctxt, **kwargs)
  File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 1654, in configure_ihost
    self._configure_controller_host(context, host)
  File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 1319, in _configure_controller_host
    self._puppet.update_host_config(host)
  File "/usr/lib64/python2.7/site-packages/sysinv/puppet/puppet.py", line 30, in _wrapper
    func(self, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/sysinv/puppet/puppet.py", line 147, in update_host_config
    config.update(puppet_plugin.obj.get_host_config(host))
  File "/usr/lib64/python2.7/site-packages/sysinv/puppet/kubernetes.py", line 101, in get_host_config
    config.update(self._get_host_pcidp_config(host))
  File "/usr/lib64/python2.7/site-packages/sysinv/puppet/kubernetes.py", line 318, in _get_host_pcidp_config
    self._get_pcidp_network_resources(),
  File "/usr/lib64/python2.7/site-packages/sysinv/puppet/kubernetes.py", line 433, in _get_pcidp_network_resources
    constants.INTERFACE_CLASS_PCI_PASSTHROUGH)
  File "/usr/lib64/python2.7/site-packages/sysinv/puppet/kubernetes.py", line 369, in _get_pcidp_network_resources_by_ifclass
    port = interface.get_sriov_interface_port(self.context, iface)
  File "/usr/lib64/python2.7/site-packages/sysinv/puppet/interface.py", line 980, in get_sriov_interface_port
    return interface.get_sriov_interface_port(context, iface)
  File "/usr/lib64/python2.7/site-packages/sysinv/common/interface.py", line 123, in get_sriov_interface_port
    assert iface['ifclass'] == constants.INTERFACE_CLASS_PCI_SRIOV
AssertionError
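
The last frame suggests why the unlock fails: the port lookup asserts that the interface class is SR-IOV, yet the device plugin code reaches it while handling PCI-passthrough interfaces. The standalone sketch below reproduces only that failure mode. The constant values, the simplified one-argument helper, and the try_lookup wrapper are assumptions made for illustration and are not the sysinv code.

```python
# Stand-ins for the sysinv constants; the string values are assumptions.
INTERFACE_CLASS_PCI_SRIOV = 'pci-sriov'
INTERFACE_CLASS_PCI_PASSTHROUGH = 'pci-passthrough'


def get_sriov_interface_port(iface):
    """Simplified stand-in for the helper in the last traceback frame."""
    # Same guard as the source line quoted in the traceback.
    assert iface['ifclass'] == INTERFACE_CLASS_PCI_SRIOV
    return iface['port']


def try_lookup(iface):
    """Return the port, or the name of the exception raised by the guard."""
    try:
        return get_sriov_interface_port(iface)
    except AssertionError as exc:
        return type(exc).__name__


# Hypothetical interface records, reduced to the two fields used here.
sriov_iface = {'ifclass': INTERFACE_CLASS_PCI_SRIOV, 'port': 'port-a'}
pthru_iface = {'ifclass': INTERFACE_CLASS_PCI_PASSTHROUGH, 'port': 'port-b'}
```

With these records, the SR-IOV interface passes the guard while the PCI-passthrough interface trips it, which matches the bare AssertionError at the end of the traceback.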

Reproducibility
---------------
yes

System Configuration
--------------------
HW
(R720 1-2 sriov interface configured + sriov device plugin discovers pci-passthrough interfaces as well)

tags: added: stx.retestneeded
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Steven Webster (swebster-wr)
summary: - Traceback in unlock host attempt after sriovdp label added
+ Traceback in unlock host when configuring pci-passthrough interfaces and
+ sriovdp label added
summary: Traceback in unlock host when configuring pci-passthrough interfaces and
- sriovdp label added
+ sriovdp label enabled
tags: added: stx.networking
Ghada Khalil (gkhalil)
description: updated
description: updated
Ghada Khalil (gkhalil) wrote :

stx.4.0 / medium priority - currently sriovdp labels are not supported on pci-passthrough interfaces, but should be simple to fix

tags: added: stx.4.0
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: Steven Webster (swebster-wr) → Thomas Gao (tgao)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/707701

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/707701
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=227ddec6189fdabdc75d45162fc22b9af7118982
Submitter: Zuul
Branch: master

commit 227ddec6189fdabdc75d45162fc22b9af7118982
Author: Thomas Gao <email address hidden>
Date: Thu Feb 13 10:47:55 2020 -0500

    Fix device plugin port handling for pci-passthrough

    While generating the SR-IOV device plugin configuration data,
    it is necessary to get the underlying port information.
    For SR-IOV ports there is special handling required to deal
    with the case of a 'VF' subinterface. For PCI-Passthrough,
    the port can and should be accessed directly.

    Closes-Bug: 1856587

    Co-Authored-By: Steven Webster <email address hidden>

    Change-Id: I70f315669776a591e23e69c6653098e720815b99
    Signed-off-by: Thomas Gao <email address hidden>
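
The approach described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged change: the function name, the dictionary-based data model (name, ifclass, uses), and the constant values are all hypothetical.

```python
# Stand-ins for the sysinv constants; the string values are assumptions.
INTERFACE_CLASS_PCI_SRIOV = 'pci-sriov'
INTERFACE_CLASS_PCI_PASSTHROUGH = 'pci-passthrough'


def get_port_for_device_plugin(iface, ifaces_by_name, ports_by_ifname):
    """Pick the port backing an interface (hypothetical helper).

    iface           -- dict with 'name', 'ifclass' and optional 'uses' list
    ifaces_by_name  -- all interfaces on the host, keyed by name
    ports_by_ifname -- port name keyed by the interface that owns it
    """
    if iface['ifclass'] == INTERFACE_CLASS_PCI_SRIOV:
        # A 'VF' subinterface has no port of its own, so walk down to the
        # lower interface until one that owns a port is found.
        current = iface
        while current['name'] not in ports_by_ifname and current.get('uses'):
            current = ifaces_by_name[current['uses'][0]]
        return ports_by_ifname.get(current['name'])
    # PCI-passthrough: the port is attached to the interface itself.
    return ports_by_ifname.get(iface['name'])
```

The point of the branch is that only the SR-IOV path needs the subinterface walk; a PCI-passthrough interface never enters code that assumes the SR-IOV class.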

Changed in starlingx:
status: In Progress → Fix Released
Wendy Mitchell (wmitchellwr) wrote :

verified unlock was successful
WP 3-6

tags: removed: stx.retestneeded
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/716137

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (f/centos8)

Reviewed: https://review.opendev.org/716137
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=cb4cf4299c2ec10fb2eb03cdee3f6d78a6413089
Submitter: Zuul
Branch: f/centos8

commit 16477935845e1c27b4c9d31743e359b0aa94a948
Author: Steven Webster <email address hidden>
Date: Sat Mar 28 17:19:30 2020 -0400

    Fix SR-IOV runtime manifest apply

    When an SR-IOV interface is configured, the platform's
    network runtime manifest is applied in order to apply the virtual
    function (VF) config and restart the interface. This results in
    sysinv being able to determine and populate the puppet hieradata
    with the virtual function PCI addresses.

    A side effect of the network manifest apply is that potentially
    all platform interfaces may be brought down/up if it is determined
    that their configuration has changed. This will likely be the case
    for a system which configures SR-IOV interfaces before initial
    unlock.

    A few issues have been encountered because of this, with some
    services not behaving well when the interface they are communicating
    over suddenly goes down.

    This commit makes the SR-IOV VF configuration much more targeted
    so that only the operation of setting the desired number of VFs
    is performed.

    Closes-Bug: #1868584
    Depends-On: https://review.opendev.org/715669
    Change-Id: Ie162380d3732eb1b6e9c553362fe68cbc313ae2b
    Signed-off-by: Steven Webster <email address hidden>
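
As a rough illustration of what a "targeted" VF operation can look like, the sketch below sets only the VF count through the Linux sysfs attribute sriov_numvfs and touches nothing else on the interface. It is not the puppet change from this commit; the function name and the sysfs_root parameter (added so the logic can be exercised against a scratch directory) are assumptions.

```python
import os


def set_num_vfs(ifname, num_vfs, sysfs_root='/sys/class/net'):
    """Set the VF count for one device; return True if a write was needed."""
    path = os.path.join(sysfs_root, ifname, 'device', 'sriov_numvfs')
    with open(path) as f:
        current = int(f.read().strip())
    if current == num_vfs:
        # Already at the desired count: leave the device alone.
        return False
    if current != 0 and num_vfs != 0:
        # The kernel expects the count to be reset to 0 before it is
        # changed from one non-zero value to another.
        with open(path, 'w') as f:
            f.write('0')
    with open(path, 'w') as f:
        f.write(str(num_vfs))
    return True
```

Returning early when the count already matches is what keeps the operation from disturbing an interface whose configuration has not changed.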

commit 45c9fe2d3571574b9e0503af108fe7c1567007db
Author: Zhipeng Liu <email address hidden>
Date: Thu Mar 26 01:58:34 2020 +0800

    Add ipv6 support for novncproxy_base_url.

    For ipv6 address, we need url with below format
    [ip]:port

    Partial-Bug: 1859641

    Change-Id: I01a5cd92deb9e88c2d31bd1e16e5bce1e849fcc7
    Signed-off-by: Zhipeng Liu <email address hidden>
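
The [ip]:port format mentioned above can be produced with a small helper like the one below. This is a sketch only: the function name is hypothetical, and it is not the code from the referenced change.

```python
import ipaddress


def format_host_port(host, port):
    """Join host and port, bracketing the host when it is an IPv6 literal."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        # Not an IP literal (for example a hostname): no brackets needed.
        is_v6 = False
    if is_v6:
        return '[%s]:%d' % (host, port)
    return '%s:%d' % (host, port)
```

The brackets are needed because an IPv6 literal contains colons itself, so without them the port separator would be ambiguous.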

commit d119336b3a3b24d924e000277a37ab0b5f93aae1
Author: Andy Ning <email address hidden>
Date: Mon Mar 23 16:26:21 2020 -0400

    Fix timeout waiting for CA cert install during ansible replay

    During ansible bootstrap replay, the ssl_ca_complete_flag file is
    removed. It expects puppet platform::config::runtime manifest apply
    during system CA certificate install to re-generate it. So this commit
    updated conductor manager to run that puppet manifest even if the CA cert
    has already installed so that the ssl_ca_complete_flag file is created
    and makes ansible replay to continue.

    Change-Id: Ic9051fba9afe5d5a189e2be8c8c2960bdb0d20a4
    Closes-Bug: 1868585
    Signed-off-by: Andy Ning <email address hidden>

commit 24a533d800b2c57b84f1086593fe5f04f95fe906
Author: Zhipeng Liu <email address hidden>
Date: Fri Mar 20 23:10:31 2020 +0800

    Fix rabbitmq could not bind port to ipv6 address issue

    When we use Armada to deploy openstack service for ipv6, rabbitmq
    pod could not start listen on [::]:5672 and [::]:15672.
    For ipv6, we need an override for configuration file.

    Upstream patch link is:
    https://review.opendev.org/#/c/714027/

    Test pass for deploying rabbitmq service on both ipv...

tags: added: in-f-centos8