Port status goes BUILD when migrating non-sriov instance in sriov setting.

Bug #2072154 reported by Seyeong Kim
This bug affects 1 person
Affects            Status        Importance   Assigned to   Milestone
neutron            Fix Released  High         Seyeong Kim
neutron (Ubuntu)   Status tracked in Oracular
  Focal            New           Undecided    Seyeong Kim
  Jammy            In Progress   Undecided    Seyeong Kim
  Noble            In Progress   Undecided    Seyeong Kim
  Oracular         In Progress   Undecided    Seyeong Kim

Bug Description

[ Impact ]

Port status goes BUILD when migrating non-sriov instance in sriov setting.

[ Test Plan ]

1. Deploy OpenStack using Juju & Charms (upstream has the same code)
2. Enable SR-IOV
3. Create a VM named "test" without an SR-IOV NIC
4. Migrate it to another host
- openstack server migrate --live-migration --os-compute-api-version 2.30 --host node-04.maas test
5. Check the port status
- https://paste.ubuntu.com/p/RKGnP76MvB/
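
Step 5 can also be checked with a small script. The following is a minimal sketch, assuming the port list was exported with `openstack port list --server test -f json` and that each record has "ID" and "Status" keys. The helper name ports_stuck_in_build and the sample records are illustrative only and are not taken from the paste above.

```python
import json


def ports_stuck_in_build(port_list_json):
    """Return the IDs of ports whose Status is BUILD.

    port_list_json is assumed to be the JSON text printed by
    `openstack port list --server <name> -f json`.
    """
    ports = json.loads(port_list_json)
    return [p["ID"] for p in ports if p.get("Status") == "BUILD"]


# Illustrative sample data, not real output from the affected deployment.
sample = json.dumps([
    {"ID": "port-a", "Status": "ACTIVE"},
    {"ID": "port-b", "Status": "BUILD"},
])
print(ports_stuck_in_build(sample))
```

Because the bug is intermittent, a check like this may need to be repeated over several migrations before a port is seen in BUILD.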

[ Where problems could occur ]

This patch touches the SR-IOV agent: it adds a check for whether a port is an SR-IOV port. A regression here could therefore cause SR-IOV ports to be handled improperly.

[ Other Info ]

The nova-compute nodes run both neutron-sriov-nic-agent and neutron-ovn-metadata-agent.

So far, I have found that ovn_monitor changes the port status to ACTIVE, but sriov-nic-agent changes it back to BUILD by calling _get_new_status:

./plugins/ml2/drivers/mech_sriov/agent/sriov_nic_agent.py
binding_activate
- get_device_details_from_port_id
- get_device_details
- _get_new_status < this makes status BUILD.

Because the order in which these two run is not fixed, the port sometimes ends up ACTIVE and sometimes BUILD.
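
The ordering problem described above can be modelled as a last-writer-wins race. This is only a toy sketch: the two handler functions are hypothetical stand-ins for the ovn_monitor update and the sriov-nic-agent call chain, and none of it is neutron code.

```python
import itertools

ACTIVE, BUILD = "ACTIVE", "BUILD"


def ovn_monitor_sets_active(port):
    # Stand-in for the OVN side marking the port ACTIVE.
    port["status"] = ACTIVE


def sriov_agent_sets_build(port):
    # Stand-in for binding_activate -> ... -> _get_new_status.
    port["status"] = BUILD


def final_status(handlers):
    """Apply the handlers in the given order and return the last status."""
    port = {"status": BUILD}
    for handler in handlers:
        handler(port)
    return port["status"]


# Try both possible orderings of the two updates.
outcomes = {
    tuple(h.__name__ for h in order): final_status(order)
    for order in itertools.permutations(
        [ovn_monitor_sets_active, sriov_agent_sets_build])
}
for order, status in outcomes.items():
    print(" then ".join(order), "->", status)
```

Whichever update lands last decides the final status, which matches the intermittent behaviour reported here.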

Seyeong Kim (seyeongkim)
tags: added: sts
Changed in neutron:
status: New → In Progress
Seyeong Kim (seyeongkim)
Changed in neutron:
assignee: nobody → Seyeong Kim (seyeongkim)
tags: added: ovn sriov-pci-pt
Lajos Katona (lajos-katona) wrote :

Thanks for reporting. Could you please give some extra details?
I suppose you have read the suggested documentation on the limitations of SR-IOV with OVN (https://docs.openstack.org/neutron/latest/admin/ovn/sriov.html).

If I understand correctly, you have 2 hosts where SR-IOV is enabled and sriov-agent is running, am I right?

Changed in neutron:
importance: Undecided → High
Seyeong Kim (seyeongkim) wrote :

Yes.

I uploaded a tentative patch:

https://review.opendev.org/c/openstack/neutron/+/923467

It should make the current situation easier to understand.

Thanks!

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923467
Committed: https://opendev.org/openstack/neutron/commit/a311606fcdae488e76c29e0e5e4035f8da621a34
Submitter: "Zuul (22348)"
Branch: master

commit a311606fcdae488e76c29e0e5e4035f8da621a34
Author: Seyeong Kim <email address hidden>
Date: Thu Jul 4 06:23:59 2024 +0000

    Checking pci_slot to avoid changing staus to BUILD forever

    Currently when sriov agent is enabled and migrating a non-sriov
    instance, non-sriov port status is frequently set to BUILD
    instead of ACTIVE.
    This is because the 'binding_activate' function in sriov-nic-agent sets it
    BUILD with get_device_details_from_port_id(as it calls _get_new_status).

    This patch checks network_ports in binding_activate and
    skip binding port if it is not sriov port

    Closes-Bug: #2072154
    Change-Id: I2d7702e17c75c96ca2f29749dccab77cb2f4bcf4
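
The guard described in the commit message above can be illustrated roughly as follows. This is a simplified sketch, not the merged change: FakeSriovAgent is a hypothetical stand-in, network_ports is modelled as a plain set of port IDs (an assumption), and the real method signatures in sriov_nic_agent.py differ.

```python
class FakeSriovAgent:
    """Hypothetical stand-in for the SR-IOV NIC agent; not neutron code."""

    def __init__(self, sriov_port_ids):
        # Assumption: the set of port IDs the agent manages as SR-IOV ports.
        self.network_ports = set(sriov_port_ids)
        self.status_updates = []

    def get_device_details_from_port_id(self, port_id):
        # Per the report, this call path ends in _get_new_status,
        # which moves the port to BUILD.
        self.status_updates.append((port_id, "BUILD"))
        return {"port_id": port_id}

    def binding_activate(self, port_id):
        # The guard: skip ports that are not SR-IOV ports, so their
        # status is never touched by this agent.
        if port_id not in self.network_ports:
            return None
        return self.get_device_details_from_port_id(port_id)


agent = FakeSriovAgent(["sriov-port"])
agent.binding_activate("non-sriov-port")  # skipped, no status change
agent.binding_activate("sriov-port")      # processed as before
print(agent.status_updates)
```

With the guard in place, only the SR-IOV port is processed, so the non-SR-IOV port keeps the status set by the OVN side.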

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2024.1)

Fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/neutron/+/923719

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/neutron/+/923773

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/neutron/+/923774

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2024.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923719
Committed: https://opendev.org/openstack/neutron/commit/59bc8e476f17dee323973a0e6ea4cd4c343f77b6
Submitter: "Zuul (22348)"
Branch: stable/2024.1

commit 59bc8e476f17dee323973a0e6ea4cd4c343f77b6
Author: Seyeong Kim <email address hidden>
Date: Thu Jul 4 06:23:59 2024 +0000

    Checking pci_slot to avoid changing staus to BUILD forever

    Currently when sriov agent is enabled and migrating a non-sriov
    instance, non-sriov port status is frequently set to BUILD
    instead of ACTIVE.
    This is because the 'binding_activate' function in sriov-nic-agent sets it
    BUILD with get_device_details_from_port_id(as it calls _get_new_status).

    This patch checks network_ports in binding_activate and
    skip binding port if it is not sriov port

    Closes-Bug: #2072154
    Change-Id: I2d7702e17c75c96ca2f29749dccab77cb2f4bcf4
    (cherry picked from commit a311606fcdae488e76c29e0e5e4035f8da621a34)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923773
Committed: https://opendev.org/openstack/neutron/commit/1da102e4024a1c7398179b29bd63e2ba42b19000
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit 1da102e4024a1c7398179b29bd63e2ba42b19000
Author: Seyeong Kim <email address hidden>
Date: Thu Jul 4 06:23:59 2024 +0000

    Checking pci_slot to avoid changing staus to BUILD forever

    Currently when sriov agent is enabled and migrating a non-sriov
    instance, non-sriov port status is frequently set to BUILD
    instead of ACTIVE.
    This is because the 'binding_activate' function in sriov-nic-agent sets it
    BUILD with get_device_details_from_port_id(as it calls _get_new_status).

    This patch checks network_ports in binding_activate and
    skip binding port if it is not sriov port

    Closes-Bug: #2072154
    Change-Id: I2d7702e17c75c96ca2f29749dccab77cb2f4bcf4
    (cherry picked from commit a311606fcdae488e76c29e0e5e4035f8da621a34)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923774
Committed: https://opendev.org/openstack/neutron/commit/27eee0b9e85ec23c54c4c907e2c80fe2d4609221
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 27eee0b9e85ec23c54c4c907e2c80fe2d4609221
Author: Seyeong Kim <email address hidden>
Date: Thu Jul 4 06:23:59 2024 +0000

    Checking pci_slot to avoid changing staus to BUILD forever

    Currently when sriov agent is enabled and migrating a non-sriov
    instance, non-sriov port status is frequently set to BUILD
    instead of ACTIVE.
    This is because the 'binding_activate' function in sriov-nic-agent sets it
    BUILD with get_device_details_from_port_id(as it calls _get_new_status).

    This patch checks network_ports in binding_activate and
    skip binding port if it is not sriov port

    Closes-Bug: #2072154
    Change-Id: I2d7702e17c75c96ca2f29749dccab77cb2f4bcf4
    (cherry picked from commit a311606fcdae488e76c29e0e5e4035f8da621a34)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (unmaintained/zed)

Fix proposed to branch: unmaintained/zed
Review: https://review.opendev.org/c/openstack/neutron/+/923838

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 24.0.1

This issue was fixed in the openstack/neutron 24.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.2.0

This issue was fixed in the openstack/neutron 22.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 23.2.0

This issue was fixed in the openstack/neutron 23.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (unmaintained/zed)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923838
Committed: https://opendev.org/openstack/neutron/commit/aa68e695e1b9cca8a45e772edab471cd28212f42
Submitter: "Zuul (22348)"
Branch: unmaintained/zed

commit aa68e695e1b9cca8a45e772edab471cd28212f42
Author: Seyeong Kim <email address hidden>
Date: Thu Jul 4 06:23:59 2024 +0000

    Checking pci_slot to avoid changing staus to BUILD forever

    Currently when sriov agent is enabled and migrating a non-sriov
    instance, non-sriov port status is frequently set to BUILD
    instead of ACTIVE.
    This is because the 'binding_activate' function in sriov-nic-agent sets it
    BUILD with get_device_details_from_port_id(as it calls _get_new_status).

    This patch checks network_ports in binding_activate and
    skip binding port if it is not sriov port

    Closes-Bug: #2072154
    Change-Id: I2d7702e17c75c96ca2f29749dccab77cb2f4bcf4
    (cherry picked from commit a311606fcdae488e76c29e0e5e4035f8da621a34)
    (cherry picked from commit 27eee0b9e85ec23c54c4c907e2c80fe2d4609221)

tags: added: in-unmaintained-zed
Seyeong Kim (seyeongkim)
description: updated
Seyeong Kim (seyeongkim)
Changed in neutron (Ubuntu Oracular):
status: New → In Progress
assignee: nobody → Seyeong Kim (seyeongkim)
Changed in neutron (Ubuntu Focal):
assignee: nobody → Seyeong Kim (seyeongkim)
Changed in neutron (Ubuntu Jammy):
assignee: nobody → Seyeong Kim (seyeongkim)
Changed in neutron (Ubuntu Noble):
assignee: nobody → Seyeong Kim (seyeongkim)
Seyeong Kim (seyeongkim)
Changed in neutron (Ubuntu Noble):
status: New → In Progress
Changed in neutron (Ubuntu Jammy):
status: New → In Progress
James Page (james-page)
description: updated