Revert resize (on different host) + ovs network backend with iptables security group firewall driver (aka hybrid plug) is broken

Bug #1952003 reported by Artom Lifshitz
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

$subject

First noticed in internal Red Hat CI of OSP 17 (based on stable/wallaby) [1], then reproduced in the upstream DNM patch [2].

tl;dr: Nova waits for a "bind-time" external event from Neutron when it updates the port binding back to the original host during the resize revert, but Neutron never sends it. Nova times out, and the resize revert fails.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2011433
[2] https://review.opendev.org/c/openstack/nova/+/817303
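
To illustrate the failure mode described above, here is a minimal Python sketch. It is not Nova's actual code: `ExternalEventWaiter`, `EventTimeout`, `revert_resize`, and the `update_binding` callback are hypothetical names invented for the illustration. It registers interest in an event, triggers a binding update, and then waits with a timeout, which is roughly the pattern that breaks when the event is never sent.

```python
import threading


class EventTimeout(Exception):
    """Raised when an expected external event does not arrive in time."""


class ExternalEventWaiter:
    """Toy stand-in for a compute service waiting on network events."""

    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def prepare(self, name, port_id):
        """Register interest in an event before triggering the action."""
        with self._lock:
            self._events[(name, port_id)] = threading.Event()

    def deliver(self, name, port_id):
        """Called when the networking service reports the event."""
        with self._lock:
            event = self._events.get((name, port_id))
        if event is not None:
            event.set()

    def wait(self, name, port_id, timeout):
        """Block until the event arrives or the timeout expires."""
        with self._lock:
            event = self._events[(name, port_id)]
        if not event.wait(timeout):
            raise EventTimeout(f"{name} for port {port_id} never arrived")


def revert_resize(waiter, port_id, update_binding, timeout=0.1):
    """Update the port binding, then wait for the expected event."""
    waiter.prepare("network-vif-plugged", port_id)
    update_binding(port_id)  # hypothetical hook standing in for the binding update
    waiter.wait("network-vif-plugged", port_id, timeout)
    return "reverted"


if __name__ == "__main__":
    waiter = ExternalEventWaiter()
    # A binding update that triggers the event: the revert succeeds.
    revert_resize(
        waiter, "port-1",
        lambda port: waiter.deliver("network-vif-plugged", port))
    # A binding update that triggers nothing: the wait times out.
    try:
        revert_resize(waiter, "port-2", lambda port: None)
    except EventTimeout as exc:
        print("revert failed:", exc)
```

In the second call nothing ever delivers the event, so the wait expires and the revert is reported as failed, which mirrors the symptom in this report.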

Bogdan Dobrelya (bogdando) wrote :

There is an upstream ML thread [0] (without technical details, though) that looks related to Nova<->Neutron external event problems, but with the iptables firewall, and OVN as well.

[0] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025892.html

Bogdan Dobrelya (bogdando) wrote :

The logic in [0] may have been impacted by [1] (though that is only Sean's suggestion).

More background can be found in [2] and [3].

[0] https://review.opendev.org/c/openstack/nova/+/667177
[1] https://review.opendev.org/c/openstack/neutron/766277
[2] https://review.opendev.org/c/openstack/nova/+/667177
[3] https://review.opendev.org/c/openstack/nova/+/595069

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/819494

Changed in nova:
status: New → In Progress

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/817303
Committed: https://opendev.org/openstack/nova/commit/ded6168ad729e747fc976ca3cdb8baf971fbc31a
Submitter: "Zuul (22348)"
Branch: master

commit ded6168ad729e747fc976ca3cdb8baf971fbc31a
Author: Artom Lifshitz <email address hidden>
Date: Tue Nov 9 15:14:57 2021 -0500

    Add nova-ovs-hybrid-plug job

    We have a gap in our testing of the external events interaction between
    Nova and Neutron. The nova-next job tests with the OVS network
    backend, and Neutron has jobs that test the OVN network backend, but
    nothing tests OVS + the iptables security group firewall driver, aka
    "hybrid plug". Add a job to test that.

    Related-bug: 1952003
    Change-Id: Ie42eaa2a39ef097b0eb69b8863bb342bae007fff

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/819494
Committed: https://opendev.org/openstack/nova/commit/0b0f40d1b308b29da537859b72080488560c23d4
Submitter: "Zuul (22348)"
Branch: master

commit 0b0f40d1b308b29da537859b72080488560c23d4
Author: Artom Lifshitz <email address hidden>
Date: Fri Nov 26 14:36:09 2021 -0500

    Revert "Revert resize: wait for events according to hybrid plug"

    This reverts commit 7a7a223602ca5aa0aca8f65a6ab143f1d8f8ec1b.

    That commit was added because - tl;dr - upon revert resize, Neutron
    with the OVS backend and the iptables security group driver would send
    us the network-vif-plugged event as soon as we updated the port
    binding.

    That behaviour has changed with commit 66c7f00e1d9. With that commit,
    we started unplugging the vifs on the source compute host when doing a
    resize. When reverting the resize, the vifs had to be re-plugged again,
    regardless of the networking backend in use. This renders commit
    7a7a223602ca5aa0aca8f65a6ab143f1d8f8ec1b pointless, and it can be
    reverted.

    Conflicts - most have to do with context around this commit's code:

    nova/compute/manager.py

        a2984b647a4 added provider_mappings to
        _finish_revert_resize_network_migrate_finish()'s signature

        750aef54b19 started using
        _finish_revert_resize_network_migrate_finish() in
        _finish_revert_snapshot_based_resize_at_source()

    nova/network/model.py

        8b33ac06445 added get_live_migration_plug_time_events() and
        has_live_migration_plug_time_event()

        7da94440db1 added has_port_with_allocation()

    nova/objects/migration.py

        f203da38387 added is_resize() and is_live_migration()

    nova/tests/unit/compute/test_compute.py

        a0e60feb3ec added request_spec to the test

    nova/tests/unit/compute/test_compute_mgr.py

        be278006a58 added unit tests below ours

    nova/tests/unit/network/test_network_info.py

        7da94440db1 (again) added tests for has_port_with_allocation()

    nova/tests/unit/virt/libvirt/test_driver.py and
    nova/virt/libvirt/driver.py are different in that attempting to
    identify individual conflicts is a pointless exercise, as so much has
    changed (mdev, vtpm, the recent wait for events during hard reboot
    workaround config option, etc). They can be treated as
    manual removal of any code that had to do with the bind-time events
    logic (though guided by the conflict markers in git).

    TODO(artom) There was a follow up commit,
    78a08d44ea68b31e27ce344f452756886ad309bd, that added the migration
    parameter to finish_revert_migration(). This is no longer needed, as
    the migration was only used to obtain plug-time events. We'll have to
    undo that as well.

    Closes-bug: 1952003
    Change-Id: I3cb39a9ec2c260f422b3c48122b9db512cdd799b
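
The change in expectations described in this commit message can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code that was reverted: the `Vif` class, both function names, and the `same_host` flag are hypothetical, and the "different host" condition is taken from the bug title. The first function splits the expected `network-vif-plugged` events into bind-time and plug-time groups depending on hybrid plug; the second shows the expectation after the revert, where every vif is re-plugged and so every event is awaited at plug time.

```python
from dataclasses import dataclass


@dataclass
class Vif:
    port_id: str
    hybrid_plug: bool  # True for OVS + iptables firewall ("hybrid plug")


def events_with_bind_time_logic(vifs, same_host):
    """Split expected events the way the reverted logic did (simplified).

    Assumption: hybrid-plug vifs moving between hosts were expected to
    report network-vif-plugged at bind time, all others at plug time.
    """
    bind_time, plug_time = [], []
    for vif in vifs:
        event = ("network-vif-plugged", vif.port_id)
        if vif.hybrid_plug and not same_host:
            bind_time.append(event)
        else:
            plug_time.append(event)
    return bind_time, plug_time


def events_after_revert(vifs):
    """After the revert, every vif is expected to report at plug time."""
    return [], [("network-vif-plugged", vif.port_id) for vif in vifs]


if __name__ == "__main__":
    vifs = [Vif("port-a", hybrid_plug=True), Vif("port-b", hybrid_plug=False)]
    print(events_with_bind_time_logic(vifs, same_host=False))
    print(events_after_revert(vifs))
```

With the old split, a hybrid-plug port on a cross-host revert lands in the bind-time list, which is the event this bug reports as never arriving.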

Changed in nova:
status: In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/xena)

Related fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/828413

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/828414

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/828418

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/828419

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 25.0.0.0rc1

This issue was fixed in the openstack/nova 25.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/828413
Committed: https://opendev.org/openstack/nova/commit/b414fe18f1a9a688a0291e97eb00e3ce3b2f4a52
Submitter: "Zuul (22348)"
Branch: stable/xena

commit b414fe18f1a9a688a0291e97eb00e3ce3b2f4a52
Author: Artom Lifshitz <email address hidden>
Date: Tue Nov 9 15:14:57 2021 -0500

    Add nova-ovs-hybrid-plug job

    We have a gap in our testing of the external events interaction between
    Nova and Neutron. The nova-next job tests with the OVS network
    backend, and Neutron has jobs that test the OVN network backend, but
    nothing tests OVS + the iptables security group firewall driver, aka
    "hybrid plug". Add a job to test that.

    Related-bug: 1952003
    Change-Id: Ie42eaa2a39ef097b0eb69b8863bb342bae007fff
    (cherry picked from commit ded6168ad729e747fc976ca3cdb8baf971fbc31a)

tags: added: in-stable-xena

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/828414
Committed: https://opendev.org/openstack/nova/commit/c3ebe0f39e82cf1df6886cfd03fd8de62548fb26
Submitter: "Zuul (22348)"
Branch: stable/xena

commit c3ebe0f39e82cf1df6886cfd03fd8de62548fb26
Author: Artom Lifshitz <email address hidden>
Date: Fri Nov 26 14:36:09 2021 -0500

    Revert "Revert resize: wait for events according to hybrid plug"

    This reverts commit 7a7a223602ca5aa0aca8f65a6ab143f1d8f8ec1b.

    That commit was added because - tl;dr - upon revert resize, Neutron
    with the OVS backend and the iptables security group driver would send
    us the network-vif-plugged event as soon as we updated the port
    binding.

    That behaviour has changed with commit 66c7f00e1d9. With that commit,
    we started unplugging the vifs on the source compute host when doing a
    resize. When reverting the resize, the vifs had to be re-plugged again,
    regardless of the networking backend in use. This renders commit
    7a7a223602ca5aa0aca8f65a6ab143f1d8f8ec1b pointless, and it can be
    reverted.

    Backport is clean from master, and the TODO that was present in the
    commit message on master is removed, as it's a driver interface change
    and can only be done on master.

    Closes-bug: 1952003
    Change-Id: I3cb39a9ec2c260f422b3c48122b9db512cdd799b
    (cherry picked from commit 0b0f40d1b308b29da537859b72080488560c23d4)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 24.1.1

This issue was fixed in the openstack/nova 24.1.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/857423

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/857427

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/857877

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/nova/+/828418
Committed: https://opendev.org/openstack/nova/commit/1b40a52d1795db18c98a634166ee0cf5dd642b6e
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 1b40a52d1795db18c98a634166ee0cf5dd642b6e
Author: Artom Lifshitz <email address hidden>
Date: Tue Nov 9 15:14:57 2021 -0500

    Add nova-ovs-hybrid-plug job

    We have a gap in our testing of the external events interaction between
    Nova and Neutron. The nova-next job tests with the OVS network
    backend, and Neutron has jobs that test the OVN network backend, but
    nothing tests OVS + the iptables security group firewall driver, aka
    "hybrid plug". Add a job to test that.

    Conflicts in .zuul.yaml due to efd28166191 which added the
    nova-live-migration-ceph job roughly where this patch adds
    nova-ovs-hybrid-plug.

    Also had to change the nova-base-irrelevant-files reference to
    dsvm-irrelevant-files, as the former hasn't been introduced in
    wallaby.

    Related-bug: 1952003
    Change-Id: Ie42eaa2a39ef097b0eb69b8863bb342bae007fff
    (cherry picked from commit ded6168ad729e747fc976ca3cdb8baf971fbc31a)
    (cherry picked from commit b414fe18f1a9a688a0291e97eb00e3ce3b2f4a52)

tags: added: in-stable-wallaby

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/nova/+/828419
Committed: https://opendev.org/openstack/nova/commit/36378de1bdfe451d683ed1028ffd5f6c7130c6ee
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 36378de1bdfe451d683ed1028ffd5f6c7130c6ee
Author: Artom Lifshitz <email address hidden>
Date: Fri Nov 26 14:36:09 2021 -0500

    Revert "Revert resize: wait for events according to hybrid plug"

    This reverts commit 7a7a223602ca5aa0aca8f65a6ab143f1d8f8ec1b.

    That commit was added because - tl;dr - upon revert resize, Neutron
    with the OVS backend and the iptables security group driver would send
    us the network-vif-plugged event as soon as we updated the port
    binding.

    That behaviour has changed with commit 66c7f00e1d9. With that commit,
    we started unplugging the vifs on the source compute host when doing a
    resize. When reverting the resize, the vifs had to be re-plugged again,
    regardless of the networking backend in use. This renders commit
    7a7a223602ca5aa0aca8f65a6ab143f1d8f8ec1b pointless, and it can be
    reverted.

    Backport is clean from master, and the TODO that was present in the
    commit message on master is removed, as it's a driver interface change
    and can only be done on master.

    Closes-bug: 1952003
    Change-Id: I3cb39a9ec2c260f422b3c48122b9db512cdd799b
    (cherry picked from commit 0b0f40d1b308b29da537859b72080488560c23d4)
    (cherry picked from commit c3ebe0f39e82cf1df6886cfd03fd8de62548fb26)

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/train)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/857877
Reason: The stable/train branch of the nova projects has been tagged as End of Life. All open patches have to be abandoned so that the branch can be deleted.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/ussuri)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/857427
Reason: stable/ussuri branch of openstack/nova transitioned to End of Life and is about to be deleted. To be able to do that, all open patches need to be abandoned.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova wallaby-eom

This issue was fixed in the openstack/nova wallaby-eom release.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/victoria)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/857423
Reason: stable/victoria branch of openstack/nova is about to be deleted. To be able to do that, all open patches need to be abandoned. Please cherry pick the patch to unmaintained/victoria if you want to further work on this patch.
