delete router with error of failed unplugging ha interface

Bug #1629159 reported by Perry
This bug affects 2 people
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Kevin Benton

Bug Description

When deleting a router, the L3 agent logs ERROR messages about failing to unplug the HA interface. This happens in an environment running stable/mitaka. Note that the router is still deleted successfully after the ERROR.

Steps to reproduce:
neutron router-create test
neutron router-delete test
Watch neutron-l3-agent.log for the ERROR lines.

This problem is different from the existing defects. Some of them address a loop when deleting a router, and some address a race between router sync and router deletion. Others, such as bug 1606801, show a similar symptom but in a different place.

2016-09-29 06:57:11.744 6287 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', '/usr/local/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'delete', 'qrouter-74c4a209-2f42-4f45-b409-082939df0962'] create_process /opt/bbc/openstack-2016.1-bbc234/neutron/local/lib/python2.7/site-packages/neutron/agent/linux/utils.py:84
2016-09-29 06:57:11.835 6287 DEBUG neutron.agent.linux.utils [-] Exit code: 0 execute /opt/bbc/openstack-2016.1-bbc234/neutron/local/lib/python2.7/site-packages/neutron/agent/linux/utils.py:142
2016-09-29 06:57:11.836 6287 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', '/usr/local/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'kill', '-9', '10728'] create_process /opt/bbc/openstack-2016.1-bbc234/neutron/local/lib/python2.7/site-packages/neutron/agent/linux/utils.py:84
2016-09-29 06:57:11.897 6287 DEBUG neutron.agent.linux.utils [-] Exit code: 0 execute /opt/bbc/openstack-2016.1-bbc234/neutron/local/lib/python2.7/site-packages/neutron/agent/linux/utils.py:142
2016-09-29 06:57:11.898 6287 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', '/usr/local/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-74c4a209-2f42-4f45-b409-082939df0962', 'ip', 'link', 'delete', 'ha-e210e603-0c'] create_process /opt/bbc/openstack-2016.1-bbc234/neutron/local/lib/python2.7/site-packages/neutron/agent/linux/utils.py:84
2016-09-29 06:57:11.961 6287 ERROR neutron.agent.linux.utils [-] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot open network namespace "qrouter-74c4a209-2f42-4f45-b409-082939df0962": No such file or directory

2016-09-29 06:57:11.962 6287 ERROR neutron.agent.linux.interface [-] Failed unplugging interface 'ha-e210e603-0c'
2016-09-29 06:57:11.962 6287 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', '/usr/local/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'kill', '-15', '10910'] create_process /opt/bbc/openstack-2016.1-bbc234/neutron/local/lib/python2.7/site-packages/neutron/agent/linux/utils.py:84

Perry (panxia6679)
Changed in neutron:
assignee: nobody → Perry (panxia6679)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/379908

Changed in neutron:
status: New → In Progress
tags: added: l3-ha
Changed in neutron:
importance: Undecided → Medium
Revision history for this message
Assaf Muller (amuller) wrote :

Note that the TRACE only happens with ovs_use_veth == True. From the bug report:

'ip', 'netns', 'exec', 'qrouter-74c4a209-2f42-4f45-b409-082939df0962', 'ip', 'link', 'delete', 'ha-e210e603-0c'

https://github.com/openstack/neutron/blob/master/neutron/agent/linux/interface.py#L373

That error message is only reachable if you try to delete the veth pair link. Regardless, the patch is correct: HA-router-specific resources should be deleted before we call the superclass, that is, in the reverse order of creation.
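
The ordering point can be illustrated with a minimal, self-contained sketch. All names below (`FakeHost`, `BaseRouter`, `HaRouterBroken`, `HaRouterFixed`, `run`, the device name `ha-port`) are hypothetical stand-ins and not the actual neutron classes. The sketch models a namespace as a dictionary entry, only to make the effect of the call order visible.

```python
class FakeHost:
    """Toy model of a host: maps namespace name -> set of interface names."""

    def __init__(self):
        self.namespaces = {}
        self.errors = []

    def add_namespace(self, ns):
        self.namespaces[ns] = set()

    def delete_namespace(self, ns):
        self.namespaces.pop(ns, None)

    def plug(self, ns, dev):
        self.namespaces[ns].add(dev)

    def unplug(self, ns, dev):
        # Mirrors the failure in the log: the namespace is already gone.
        if ns not in self.namespaces:
            self.errors.append("Failed unplugging interface '%s'" % dev)
            return
        self.namespaces[ns].discard(dev)


class BaseRouter:
    """Hypothetical base class: owns only the namespace."""

    def __init__(self, host, ns):
        self.host = host
        self.ns = ns

    def initialize(self):
        self.host.add_namespace(self.ns)

    def delete(self):
        self.host.delete_namespace(self.ns)


class _HaRouterCommon(BaseRouter):
    """Adds the HA interface after the base class created the namespace."""

    ha_dev = "ha-port"

    def initialize(self):
        super().initialize()
        self.host.plug(self.ns, self.ha_dev)


class HaRouterBroken(_HaRouterCommon):
    def delete(self):
        # Namespace is removed first, so the unplug below cannot find it.
        super().delete()
        self.host.unplug(self.ns, self.ha_dev)


class HaRouterFixed(_HaRouterCommon):
    def delete(self):
        # Reverse order of creation: HA resources first, namespace last.
        self.host.unplug(self.ns, self.ha_dev)
        super().delete()


def run(router_cls):
    """Create and delete one router; return the errors that were recorded."""
    host = FakeHost()
    router = router_cls(host, "qrouter-test")
    router.initialize()
    router.delete()
    return host.errors
```

Calling `run(HaRouterBroken)` records one "Failed unplugging" error, while `run(HaRouterFixed)` records none. The real fix is the merged change referenced in this report; this sketch only shows why the order matters.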

Revision history for this message
Perry (panxia6679) wrote :

Thanks for the comments, Assaf. This bug was introduced by the fix for bug 1488730. However, I could not find any explanation of why that fix moved the superclass call ahead of the HA cleanup. The superclass call should come at the end of the deletion process. Subscribed <email address hidden> for comments.

Revision history for this message
Kevin Benton (kevinbenton) wrote :

Upgrading to high. This is really bad in the linux bridge case because the keepalived process can hang around transmitting VRRP packets with a VRID that is supposed to be free. If a new router gets this same VRID, none of its agents will transition to master.
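
A toy model of the VRID problem described above, as a rough sketch only. It assumes all instances use equal priority without preemption, and that a backup becomes master only once it stops hearing advertisements for its VRID. The function name `settle` and the agent names are hypothetical, and which instance wins when no stale advertiser exists is chosen arbitrarily here.

```python
def settle(new_instances, stale_master_advertising):
    """Return the final VRRP state of each new instance sharing one VRID.

    Toy rule (an assumption, not keepalived's full state machine):
    as long as someone keeps advertising for the VRID, every new
    instance stays in backup.
    """
    states = {name: "backup" for name in new_instances}
    if stale_master_advertising:
        # The leftover keepalived keeps sending advertisements, so no
        # new instance ever sees the master as down.
        return states
    if new_instances:
        # Arbitrary pick for illustration: the first instance takes over.
        states[new_instances[0]] = "master"
    return states
```

With a leftover advertiser, `settle(["agent-1", "agent-2"], True)` leaves both agents in backup. Without it, exactly one of them becomes master.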

Changed in neutron:
importance: Medium → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/396815

Changed in neutron:
assignee: Perry (panxia6679) → Kevin Benton (kevinbenton)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/379908
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=bc03048134f12df47e1e619d21ba394db9c52dc1
Submitter: Jenkins
Branch: master

commit bc03048134f12df47e1e619d21ba394db9c52dc1
Author: Perry Zou <email address hidden>
Date: Fri Sep 30 02:42:56 2016 +0000

    Fix "failed unplugging ha interface" error when deleting router

    Deleting router namespaces happens before deleting router ha interface.
    So it will fail when deleting router ha interface. The change
    is to remove router ha interface before deleting router namespace.

    Change-Id: I3d936701c9dac7671f12e1966449662988a0f26a
    Closes-Bug: #1629159
    Related-Bug: #1488730

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/396815
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=16ae4190a7cce9b7da610bc15ca0a6378ddb736d
Submitter: Jenkins
Branch: master

commit 16ae4190a7cce9b7da610bc15ca0a6378ddb736d
Author: Kevin Benton <email address hidden>
Date: Fri Nov 11 17:39:37 2016 -0800

    Add L3 HA test with linux bridge

    Adds an HA test case using the linux bridge interface and
    a test to recreate the router to ensure all cleanup was
    done appropriately on teardown.

    Related-Bug: #1629159
    Change-Id: I80b70b848ea64d5f996055edc4bfb0ec1f4ae548

Revision history for this message
Perry (panxia6679) wrote :

Just saw the progress. Thanks, Kevin, for your help.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/398165

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/398166

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/newton)

Related fix proposed to branch: stable/newton
Review: https://review.openstack.org/398167

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 10.0.0.0b1

This issue was fixed in the openstack/neutron 10.0.0.0b1 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/newton)

Reviewed: https://review.openstack.org/398165
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=bd982c721f039cd07c505a27717941cc27d5366f
Submitter: Jenkins
Branch: stable/newton

commit bd982c721f039cd07c505a27717941cc27d5366f
Author: Perry Zou <email address hidden>
Date: Fri Sep 30 02:42:56 2016 +0000

    Fix "failed unplugging ha interface" error when deleting router

    Deleting router namespaces happens before deleting router ha interface.
    So it will fail when deleting router ha interface. The change
    is to remove router ha interface before deleting router namespace.

    Change-Id: I3d936701c9dac7671f12e1966449662988a0f26a
    Closes-Bug: #1629159
    Related-Bug: #1488730
    (cherry picked from commit bc03048134f12df47e1e619d21ba394db9c52dc1)

tags: added: in-stable-newton
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/newton)

Reviewed: https://review.openstack.org/398167
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=79c62e05d916b6f2916bea059bbcc8ebe7b3fb8a
Submitter: Jenkins
Branch: stable/newton

commit 79c62e05d916b6f2916bea059bbcc8ebe7b3fb8a
Author: Kevin Benton <email address hidden>
Date: Fri Nov 11 17:39:37 2016 -0800

    Add L3 HA test with linux bridge

    Adds an HA test case using the linux bridge interface and
    a test to recreate the router to ensure all cleanup was
    done appropriately on teardown.

    Related-Bug: #1629159
    Change-Id: I80b70b848ea64d5f996055edc4bfb0ec1f4ae548
    (cherry picked from commit 16ae4190a7cce9b7da610bc15ca0a6378ddb736d)

tags: added: in-stable-mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/398166
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2ee6e58307d10e3794f3bad6eadf3f9da70dbb66
Submitter: Jenkins
Branch: stable/mitaka

commit 2ee6e58307d10e3794f3bad6eadf3f9da70dbb66
Author: Perry Zou <email address hidden>
Date: Fri Sep 30 02:42:56 2016 +0000

    Fix "failed unplugging ha interface" error when deleting router

    Deleting router namespaces happens before deleting router ha interface.
    So it will fail when deleting router ha interface. The change
    is to remove router ha interface before deleting router namespace.

    Change-Id: I3d936701c9dac7671f12e1966449662988a0f26a
    Closes-Bug: #1629159
    Related-Bug: #1488730
    (cherry picked from commit bc03048134f12df47e1e619d21ba394db9c52dc1)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 9.1.1

This issue was fixed in the openstack/neutron 9.1.1 release.

tags: added: neutron-proactive-backport-potential
tags: removed: neutron-proactive-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/mitaka)

Related fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/421978

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/421978
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=284d32f4f26a29b1210c799298f2e596292001b0
Submitter: Jenkins
Branch: stable/mitaka

commit 284d32f4f26a29b1210c799298f2e596292001b0
Author: Kevin Benton <email address hidden>
Date: Fri Nov 11 17:39:37 2016 -0800

    Add L3 HA test with linux bridge

    Adds an HA test case using the linux bridge interface and
    a test to recreate the router to ensure all cleanup was
    done appropriately on teardown.

    Conflicts:
     neutron/tests/functional/agent/l3/test_ha_router.py

    Related-Bug: #1629159
    Change-Id: I80b70b848ea64d5f996055edc4bfb0ec1f4ae548
    (cherry picked from commit 16ae4190a7cce9b7da610bc15ca0a6378ddb736d)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 8.4.0

This issue was fixed in the openstack/neutron 8.4.0 release.
