os_neutron ns-metadata-proxy cleanup error attempting to kill PID

Bug #1627185 reported by Evan Callicoat
This bug affects 1 person
Affects: OpenStack-Ansible
Status: Fix Released
Importance: Undecided
Assigned to: Kevin Carter
Milestone: (none)

Bug Description

When running the ns-metadata-proxy cleanup tasks, we sometimes get the following error:

13:35:48 failed: [jrpcaioiad-3a8_neutron_agents_container-101ec306] => {"changed": true, "cmd": "for ns_pid in $(pgrep neutron-ns-meta); do\n echo $(readlink -f \"/proc/$ns_pid/exe\") | grep -qv \"13.3.4\"\n if [ $? -eq 0 ]; then\n (echo \"old metadata proxy pid found running clean up on $ns_pid\"; kill -9 \"$ns_pid\")\n fi\n done", "delta": "0:00:00.049977", "end": "2016-09-23 13:35:48.950575", "rc": 1, "start": "2016-09-23 13:35:48.900598", "warnings": []}
13:35:48 stderr: /bin/sh: 4: kill: No such process
13:35:48
13:35:48 /bin/sh: 4: kill: No such process
13:35:48 stdout: old metadata proxy pid found running clean up on 1337
13:35:48 old metadata proxy pid found running clean up on 2781

It appears that the processes found by pgrep sometimes exit before the kill runs. Either that, or the filtering has a subtle bug that produces output that is not a valid PID or list of PIDs.

I suggest that we instead reverse the order of tasks in this handler, replacing the complex pgrep/grep/kill logic with a simple pkill of all the proxies, followed by the service restart. I'll submit a patch for consideration soon.
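To illustrate the race described above, here is a minimal sketch of a kill step that treats "No such process" as a non-error. The helper name `kill_if_running` is hypothetical and is not taken from the role; it only shows how the loop body could avoid returning a non-zero status when a PID has already exited.

```shell
#!/bin/sh
# Hypothetical helper (not from the os_neutron role): SIGKILL a PID and
# report it, but treat an already-exited process as success.
kill_if_running() {
    # kill fails with "No such process" if the PID vanished after pgrep
    # listed it; stderr is silenced and the failure is ignored.
    if kill -9 "$1" 2>/dev/null; then
        echo "old metadata proxy pid found running clean up on $1"
    fi
    # Always succeed so a lost race does not fail the surrounding task.
    return 0
}
```

Inside the existing `for ns_pid in $(pgrep neutron-ns-meta)` loop, calling `kill_if_running "$ns_pid"` in place of the bare `kill -9` would leave the loop's exit status at 0 even when a process disappears between the pgrep and the kill.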

Revision history for this message
Jean-Philippe Evrard (jean-philippe-evrard) wrote :

In which version does this happen? Latest mitaka I guess?

Could you confirm the processes were running fine before the neutron playbook run?

Changed in openstack-ansible:
assignee: nobody → Jean-Philippe Evrard (jean-philippe-evrard)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-os_neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/377924

Changed in openstack-ansible:
assignee: Jean-Philippe Evrard (jean-philippe-evrard) → Kevin Carter (kevin-carter)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-os_neutron (master)

Reviewed: https://review.openstack.org/377924
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_neutron/commit/?id=cfb341a368b277d43b30da2e3b9a3431e1ad9c9a
Submitter: Jenkins
Branch: master

commit cfb341a368b277d43b30da2e3b9a3431e1ad9c9a
Author: Kevin Carter <email address hidden>
Date: Tue Sep 27 11:53:17 2016 -0500

    Add conditional around the pid clean up process

    The NS metadata proxy pid cleanup process hunts for and removes
    PIDs executing old code by using version tags. Under certain
    conditions it's possible for an old PID to have expired before
    the cleanup action has run. This change simply wraps the
    `pkill` command with a test to ensure the task isn't failing.
    Should a PID actually be cleaned up the task will print to stdout
    and log using the logger command.

    Closes-Bug: #1627185
    Change-Id: I8c012feb399f8ca65172e9404b859c8f6111de35
    Signed-off-by: Kevin Carter <email address hidden>
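
The commit message describes guarding the kill with a test and logging via `logger`. The following is a rough sketch of that idea, not the merged patch: `CURRENT_TAG`, `exe_is_stale`, and `cleanup_old_proxies` are hypothetical names, and the tag value is simply the one visible in the failing task output. One detail worth noting is that an empty `readlink` result (the process is already gone) would pass an inverted grep such as `grep -qv "13.3.4"`, so the sketch treats an empty path as "nothing to clean up".

```shell
#!/bin/sh
# Assumption: stands in for the version tag the role templates into the task.
CURRENT_TAG="13.3.4"

# Succeeds only when the executable path is non-empty and lacks the tag.
exe_is_stale() {
    case "$1" in
        "") return 1 ;;                # process already exited
        *"$CURRENT_TAG"*) return 1 ;;  # running current code
        *) return 0 ;;                 # running old code
    esac
}

# Hypothetical cleanup loop: kill stale proxies, never fail on a lost race.
cleanup_old_proxies() {
    for ns_pid in $(pgrep neutron-ns-meta); do
        exe=$(readlink -f "/proc/$ns_pid/exe" 2>/dev/null)
        if exe_is_stale "$exe" && kill -9 "$ns_pid" 2>/dev/null; then
            msg="old metadata proxy pid found running clean up on $ns_pid"
            echo "$msg"      # visible in the task's stdout
            logger "$msg"    # recorded in syslog
        fi
    done
    return 0
}
```

Only a successful kill prints and logs a message, so the task output lists the PIDs that were actually cleaned up.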

Changed in openstack-ansible:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-os_neutron (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/378535

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-os_neutron (stable/mitaka)

Reviewed: https://review.openstack.org/378535
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_neutron/commit/?id=ae7af49d1670f5b37e64a0f0bb1590e20557d980
Submitter: Jenkins
Branch: stable/mitaka

commit ae7af49d1670f5b37e64a0f0bb1590e20557d980
Author: Kevin Carter <email address hidden>
Date: Tue Sep 27 11:53:17 2016 -0500

    Add conditional around the pid clean up process

    The NS metadata proxy pid cleanup process hunts for and removes
    PIDs executing old code by using version tags. Under certain
    conditions it's possible for an old PID to have expired before
    the cleanup action has run. This change simply wraps the
    `pkill` command with a test to ensure the task isn't failing.
    Should a PID actually be cleaned up the task will print to stdout
    and log using the logger command.

    Closes-Bug: #1627185
    Change-Id: I8c012feb399f8ca65172e9404b859c8f6111de35
    Signed-off-by: Kevin Carter <email address hidden>
    (cherry picked from commit cfb341a368b277d43b30da2e3b9a3431e1ad9c9a)

tags: added: in-stable-mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-os_neutron (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/380153

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-os_neutron (stable/newton)

Reviewed: https://review.openstack.org/380153
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_neutron/commit/?id=be355476417219407443d8de9a6e974fe888604a
Submitter: Jenkins
Branch: stable/newton

commit be355476417219407443d8de9a6e974fe888604a
Author: Kevin Carter <email address hidden>
Date: Tue Sep 27 11:53:17 2016 -0500

    Add conditional around the pid clean up process

    The NS metadata proxy pid cleanup process hunts for and removes
    PIDs executing old code by using version tags. Under certain
    conditions it's possible for an old PID to have expired before
    the cleanup action has run. This change simply wraps the
    `pkill` command with a test to ensure the task isn't failing.
    Should a PID actually be cleaned up the task will print to stdout
    and log using the logger command.

    Closes-Bug: #1627185
    Change-Id: I8c012feb399f8ca65172e9404b859c8f6111de35
    Signed-off-by: Kevin Carter <email address hidden>
    (cherry picked from commit cfb341a368b277d43b30da2e3b9a3431e1ad9c9a)

tags: added: in-stable-newton
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-os_neutron 14.0.0.0rc3

This issue was fixed in the openstack/openstack-ansible-os_neutron 14.0.0.0rc3 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-os_neutron 13.3.5

This issue was fixed in the openstack/openstack-ansible-os_neutron 13.3.5 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-os_neutron 15.0.0.0b1

This issue was fixed in the openstack/openstack-ansible-os_neutron 15.0.0.0b1 development milestone.
