neutron-keepalived-state-change doesn't reap zombies if rootwrap-daemon is used

Bug #1822591 reported by Fabian Zimmermann
This bug affects 6 people

Bug Description

I just tested the current stable/pike branch of neutron.

After some time, our monitoring alerted us to an increasing number of zombie processes. After taking a look, I found

commit 91c26f56586e88a75e942cd06c8f4539acfb4963

which seems to introduce some rootwrap-helper changes. I reverted the commit, but the problem remains, so the commit may not be the cause; it may only make the problem more likely to trigger.

Steps to reproduce:

* configure neutron-rootwrap-daemon
* start l3-agents
* wait for (ha-)routers to be set up
* kill a rootwrap-daemon
* sudo is now a zombie which is not reaped by neutron-keepalived-state-change.
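
The mechanism behind the last step can be shown in isolation. This is a minimal sketch, not neutron code: it assumes a Linux host, uses a plain Python sleeper as a stand-in for the sudo/rootwrap-daemon child, and `kill_and_reap` is a hypothetical name. A killed child stays a zombie until its parent collects the exit status.

```python
import os
import signal
import subprocess
import sys


def kill_and_reap():
    """Kill a child, confirm it is waiting to be reaped, then reap it."""
    # Stand-in for the long-running child (assumption: any sleeper will do).
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    os.kill(child.pid, signal.SIGKILL)

    # WNOWAIT blocks until the child has exited but leaves it unreaped.
    # At this point the child is a zombie, like sudo in the steps above.
    info = os.waitid(os.P_PID, child.pid, os.WEXITED | os.WNOWAIT)
    was_zombie = info is not None and info.si_pid == child.pid

    # Collecting the exit status is what removes the zombie.
    returncode = child.wait()

    # A further wait fails because there is nothing left to reap.
    try:
        os.waitid(os.P_PID, child.pid, os.WEXITED | os.WNOHANG)
        reaped = False
    except ChildProcessError:
        reaped = True
    return was_zombie, returncode, reaped
```

If the parent never performs the `child.wait()` step, the zombie persists for as long as the parent is alive, which matches what is seen here.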

Tags: l3-ha
Bernard Cafarelli (bcafarel) wrote :

I can reproduce this on current devstack:
stack 20640 2.1 0.6 466476 108520 ? Ss 10:23 7:32 /usr/bin/python /usr/bin/neutron-dhcp-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/dhcp_agent.ini
root 23941 0.0 0.0 272332 5632 ? S 10:23 0:00 \_ sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf
root 23944 0.0 0.1 467628 22496 ? Sl 10:23 0:01 \_ /usr/bin/python /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

After killing process 23944:

stack 20640 2.1 0.6 466476 108520 ? Ss 10:23 7:32 /usr/bin/python /usr/bin/neutron-dhcp-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/dhcp_agent.ini
root 23941 0.0 0.0 0 0 ? Z 10:23 0:00 \_ [sudo] <defunct>
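
For monitoring, the defunct entry can be picked out of such a listing mechanically. A minimal sketch, assuming the column layout of `ps aux` (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND); `find_defunct` is a hypothetical helper, and the sample lines are abridged from the listing above.

```python
def find_defunct(ps_lines):
    """Return (pid, command) for every line whose STAT column starts with Z."""
    zombies = []
    for line in ps_lines:
        # Assumed columns:
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header or malformed line
        if fields[7].startswith("Z"):
            zombies.append((int(fields[1]), fields[10]))
    return zombies


listing = [
    "USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND",
    r"stack 20640 2.1 0.6 466476 108520 ? Ss 10:23 7:32 /usr/bin/python /usr/bin/neutron-dhcp-agent",
    r"root 23941 0.0 0.0 0 0 ? Z 10:23 0:00 \_ [sudo] <defunct>",
]
print(find_defunct(listing))
```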

Was this working before? I am trying to work out when it might have broken.

Changed in neutron:
status: New → Confirmed
importance: Undecided → Medium
Fabian Zimmermann (dev-faz) wrote :

I switched back from the stable/pike branch to tag 11.0.6 and I no longer see any zombie processes after running the complete Tempest test suite.

I still get the zombies if I kill the neutron-rootwrap-daemon manually, but I think the daemon normally shouldn't terminate.

So in my opinion the question is: Why is neutron-rootwrap-daemon terminating (more often) in versions >11.0.6?

Miguel Lavalle (minsel)
tags: added: l3-ha
tags: added: l3-dvr-backlog
tags: removed: l3-dvr-backlog