stable/ussuri py38 support for keepalived-state-change monitor

Bug #1929832 reported by Edward Hope-Morley
This bug affects 1 person
Affects                 Status         Importance   Assigned to          Milestone
Ubuntu Cloud Archive    Invalid        Undecided    Unassigned
  Ussuri                Fix Released   High         Unassigned
neutron                 Fix Released   High         Edward Hope-Morley
neutron (Ubuntu)        Invalid        Undecided    Unassigned
  Focal                 Fix Released   High         Unassigned

Bug Description

[Impact]
Please see the original bug description below. Without this fix, the neutron-l3-agent is unable to tear down an HA router and leaves it partially configured on every node it was running on.

[Test Plan]
* deploy OpenStack Ussuri on Ubuntu Focal
* enable L3 HA
* create a router and a VM on a network attached to the router
* disable or delete the router and check for errors like the one below
* ensure that the following line exists in /etc/neutron/rootwrap.d/l3.filters:

kill_keepalived_monitor_py38: KillFilter, root, python3.8, -15, -9
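
The last step of the test plan can also be scripted. Below is a minimal sketch, assuming the filters file is INI-style with a [Filters] section (as rootwrap filter files normally are). The function name has_py38_kill_filter is hypothetical and not part of neutron or oslo.rootwrap.

```python
import configparser

# Filter name and definition exactly as given in the test plan.
EXPECTED_NAME = "kill_keepalived_monitor_py38"
EXPECTED_VALUE = ["KillFilter", "root", "python3.8", "-15", "-9"]


def has_py38_kill_filter(filters_text):
    """Return True if the [Filters] section defines the expected py38 KillFilter.

    Hypothetical helper for illustration only.
    """
    # strict=False tolerates duplicate option names in the file.
    parser = configparser.RawConfigParser(strict=False)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(filters_text)
    if not parser.has_option("Filters", EXPECTED_NAME):
        return False
    raw = parser.get("Filters", EXPECTED_NAME)
    return [part.strip() for part in raw.split(",")] == EXPECTED_VALUE
```

On a node this could be called as has_py38_kill_filter(open("/etc/neutron/rootwrap.d/l3.filters").read()).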

-------------------------------------------------------------------------

The Victoria release of OpenStack received patch [1], which allows the neutron-l3-agent to send SIGKILL or SIGTERM to the keepalived-state-change monitor when running under py38. Users running Ussuri with py38 need this patch too, so we need to backport it.

The consequence of not having this is that you get the following when you delete or disable a router:

2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent [req-8c69af29-8f9c-4721-9cba-81ff4e9be92c - 9320f5ac55a04fb280d9ceb0b1106a6e - - -] Error while deleting router ab63ccd8-1197-48d0-815e-31adc40e5193: neutron_lib.exceptions.ProcessExecutionError: Exit code: 99; Stdin: ; Stdout: ; Stderr: /usr/bin/neutron-rootwrap: Unauthorized command: kill -15 2516433 (no filter matched)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 512, in _safe_router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self._router_removed(ri, router_id)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 548, in _router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self.router_info[router_id] = ri
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self.force_reraise()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent raise value
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 545, in _router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent ri.delete()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/dvr_edge_router.py", line 236, in delete
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent super(DvrEdgeRouter, self).delete()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/ha_router.py", line 492, in delete
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent self.destroy_state_change_monitor(self.process_monitor)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/l3/ha_router.py", line 438, in destroy_state_change_monitor
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent pm.disable(sig=str(int(signal.SIGTERM)))
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/linux/external_process.py", line 113, in disable
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent utils.execute(cmd, run_as_root=self.run_as_root)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent File "/usr/lib/python3/dist-packages/neutron/agent/linux/utils.py", line 147, in execute
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent raise exceptions.ProcessExecutionError(msg,
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent neutron_lib.exceptions.ProcessExecutionError: Exit code: 99; Stdin: ; Stdout: ; Stderr: /usr/bin/neutron-rootwrap: Unauthorized command: kill -15 2516433 (no filter matched)

This results in the router being deleted from neutron but not from the node. In my case I had both a qrouter and a snat namespace left with IPs still configured, and my fip ip rule allocation was still present in /var/lib/neutron/fip-priorities.
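
A rough way to spot the leftover namespaces on a node is sketched below. It assumes the namespaces are named qrouter-<router id> and snat-<router id>, and that the input is the text printed by "ip netns list". The function name leftover_namespaces is hypothetical.

```python
def leftover_namespaces(netns_output, router_id):
    """Return router namespaces for router_id that still exist on the node.

    netns_output is the text printed by `ip netns list`; each line starts
    with the namespace name, optionally followed by "(id: N)".
    Assumes the qrouter-<id> / snat-<id> naming convention.
    """
    wanted = {"qrouter-" + router_id, "snat-" + router_id}
    found = []
    for line in netns_output.splitlines():
        fields = line.split()
        if fields and fields[0] in wanted:
            found.append(fields[0])
    return sorted(found)
```

An empty list for a router that has been deleted from neutron means its namespaces were cleaned up. It does not cover the entry left in /var/lib/neutron/fip-priorities.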

[1] https://github.com/openstack/neutron/commit/4fb505891ee32ae41247f1d7a48b7455b342840e

[Regression Potential]
This change is backported from the stable/victoria release to authorize cleaning up keepalived-state-change via rootwrap [1] when running under python3.8. Where things can go wrong with l3.filters is in filter mistakes that allow or disallow running the intended command. In this case the code is picked straight from what is in stable/victoria and above and has already been tested by Ed, so it appears to have very low regression potential.
[1] https://wiki.openstack.org/wiki/Rootwrap
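
To illustrate how a filter can allow or disallow the command, here is a simplified model of a KillFilter decision. It is a sketch and not the oslo.rootwrap implementation. The target executable is passed in as a parameter instead of being read from /proc/<pid>/exe, and the py36/py37 filter definitions are assumed to mirror the py38 line. All function names are hypothetical.

```python
import os


def kill_filter_matches(filter_args, userargs, target_exe):
    """Simplified model of a rootwrap KillFilter check (illustration only).

    filter_args: filter definition after the user, e.g. ["python3.8", "-15", "-9"]
    userargs:    requested command, e.g. ["kill", "-15", "2516433"]
    target_exe:  executable of the target PID, supplied by the caller here
    """
    if not userargs or userargs[0] != "kill":
        return False
    allowed_exe, allowed_signals = filter_args[0], filter_args[1:]
    if len(userargs) == 3:
        # A signal was requested; it must be in the filter's list.
        if userargs[1] not in allowed_signals:
            return False
    elif len(userargs) == 2:
        # No signal requested, but the filter requires specific signals.
        if allowed_signals:
            return False
    else:
        return False
    # Compare the target executable with the one named in the filter.
    if os.path.isabs(allowed_exe):
        return target_exe == allowed_exe
    return os.path.basename(target_exe) == allowed_exe


def authorized(filters, userargs, target_exe):
    """True if any filter in the list matches the requested kill command."""
    return any(kill_filter_matches(f, userargs, target_exe) for f in filters)


# Assumed filter sets before and after the backport.
BEFORE = [["python3.6", "-15", "-9"], ["python3.7", "-15", "-9"]]
AFTER = BEFORE + [["python3.8", "-15", "-9"]]
```

In this model, authorized(BEFORE, ["kill", "-15", "2516433"], "/usr/bin/python3.8") is False, which corresponds to the "no filter matched" error, while the same call with AFTER is True. A signal outside -15 and -9 is still rejected.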

Changed in neutron:
status: New → In Progress
assignee: nobody → Edward Hope-Morley (hopem)
Changed in neutron (Ubuntu):
status: New → Invalid
Changed in neutron (Ubuntu Focal):
status: New → Triaged
importance: Undecided → High
Changed in cloud-archive:
status: New → Invalid
Edward Hope-Morley (hopem) wrote :
Changed in neutron:
importance: Undecided → High
Corey Bryant (corey.bryant) wrote :

Uploaded to focal unapproved queue.

description: updated
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Edward, or anyone else affected,

Accepted neutron into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/neutron/2:16.3.2-0ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in neutron (Ubuntu Focal):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-focal
Corey Bryant (corey.bryant) wrote :

Hello Edward, or anyone else affected,

Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:ussuri-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-ussuri-needed
Edward Hope-Morley (hopem) wrote :

focal-proposed verified using [Test Plan] and with the following output:

# apt-cache policy neutron-common
neutron-common:
  Installed: 2:16.3.2-0ubuntu2
  Candidate: 2:16.3.2-0ubuntu2
  Version table:
 *** 2:16.3.2-0ubuntu2 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu focal-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2:16.3.1-0ubuntu1.1 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
     2:16.0.0~b3~git2020041516.5f42488a9a-0ubuntu2 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu focal/main amd64 Packages

$ grep kill_keepalived_monitor_py38 /etc/neutron/rootwrap.d/l3.filters
kill_keepalived_monitor_py38: KillFilter, root, python3.8, -15, -9

description: updated
description: updated
tags: added: verification-done-focal
removed: verification-needed-focal
Edward Hope-Morley (hopem) wrote :

bionic-ussuri-proposed verified using [Test Plan] and with the following output:

root@juju-9c4cdb-lp1929832-verify-6:~# apt-cache policy neutron-common
neutron-common:
  Installed: 2:16.3.2-0ubuntu2~cloud0
  Candidate: 2:16.3.2-0ubuntu2~cloud0
  Version table:
 *** 2:16.3.2-0ubuntu2~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-proposed/ussuri/main amd64 Packages
        100 /var/lib/dpkg/status
     2:12.1.1-0ubuntu7 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
     2:12.0.1-0ubuntu1 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

root@juju-9c4cdb-lp1929832-verify-6:~# grep py38 /etc/neutron/rootwrap.d/l3.filters
kill_keepalived_monitor_py38: KillFilter, root, python3.8, -15, -9

tags: added: verification-done verification-ussuri-done
removed: verification-needed verification-ussuri-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package neutron - 2:16.3.2-0ubuntu2

---------------
neutron (2:16.3.2-0ubuntu2) focal; urgency=medium

  * d/p/updates-for-python3.8.patch: Cherry-picked from
    https://review.opendev.org/c/openstack/neutron/+/793417
    to ensure py38 keepalived-state-change cleanup (LP: #1929832).
  * d/p/d/p/0001-L3-Check-agent-gateway-port-robustly.patch: Dropped.
    Fixed in 16.3.2 stable point release.

neutron (2:16.3.2-0ubuntu1) focal; urgency=medium

  * New stable point release for OpenStack Ussuri (LP: #1927976).

 -- Corey Bryant <email address hidden> Thu, 27 May 2021 15:54:50 -0400

Changed in neutron (Ubuntu Focal):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for neutron has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Edward Hope-Morley (hopem) wrote :

This has been released to the ussuri cloud archive (which is currently on 2:16.3.2-0ubuntu3~cloud0) so marking Fix Released.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/793417
Committed: https://opendev.org/openstack/neutron/commit/a6fdf35027749ebf8388bd4cd645f1ff06c3891a
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit a6fdf35027749ebf8388bd4cd645f1ff06c3891a
Author: Edward Hope-Morley <email address hidden>
Date: Thu May 27 17:31:43 2021 +0100

    Updates for python3.8

    Added a keepalived py38 KillFilter line to match the py36
    and py37 ones.

    NOTE: this is a partial backport of commit 4fb5058.

    Closes-Bug: #1929832
    Change-Id: Ief793b54d53c3239cfb24278e88e4f4189bbc2c2

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 16.4.0

This issue was fixed in the openstack/neutron 16.4.0 release.

Changed in neutron:
status: In Progress → Fix Released