fullstack runs leave neutron-server forked processes running

Bug #1502987 reported by Assaf Muller
This bug affects 1 person
Affects   Status    Importance   Assigned to   Milestone
neutron   Triaged   Low          Unassigned

Bug Description

Because api_workers defaults to the number of CPUs on the machine, neutron-server forks worker processes when it starts. Those child processes are left behind after both successful and failed fullstack runs. This is possibly related to bug https://bugs.launchpad.net/neutron/+bug/1487548.

Tags: fullstack
Armando Migliaccio (armando-migliaccio) wrote :

Can this pose a stability issue to the gate job? Would consecutive runs interfere with one another on a local box?

Changed in neutron:
importance: Undecided → Low
Armando Migliaccio (armando-migliaccio) wrote :

Is there a simple check that one can do to confirm the issue? Do you intend to work on it yourself? If not, it would be nice to provide more info so that the triaging would be faster.

Would the execution of:

tox -edsvm-fullstack ; ps -aux | grep neutron

suffice, to demonstrate the leaking?

Changed in neutron:
status: New → Incomplete
Assaf Muller (amuller) wrote :

> Can this pose a stability issue to the gate job?

I don't know yet. The fullstack job is extremely unstable right now, and I haven't determined why.

> Would consecutive runs interfere with one another on a local box?

Absolutely. Given enough runs, you eventually run out of DB connections (since the child neutron-server processes aren't killed, they keep their DB connections open), and all subsequent test runs fail.

> Do you intend to work on it yourself?

Not at this time. I was hoping this bug report would elicit a discussion so that the correct solution could surface.

> If not, it would be nice to provide more info so that the triaging would be faster.

ps aux | grep neutron-server shows an empty list
tox -e dsvm-fullstack <path.to.a.single.test>
ps aux | grep neutron-server shows only the children of the parent neutron-server; the parent itself was killed
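
The manual check above could be scripted roughly as follows. This is only a sketch: find_orphans and current_orphans are hypothetical helper names, and the test "parent PID is 1" assumes orphaned processes are re-parented to init, which may not hold on systems that use a subreaper.

```python
import subprocess


def find_orphans(ps_output, pattern="neutron-server"):
    """Return (pid, args) for matching processes whose parent PID is 1.

    Expects text in the layout produced by `ps -eo pid,ppid,args`.
    Assumption: an orphaned worker has been re-parented to PID 1.
    """
    orphans = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        # Skip the header line and anything that is not "pid ppid args".
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        pid, ppid, args = fields
        if pattern in args and ppid == "1":
            orphans.append((int(pid), args))
    return orphans


def current_orphans(pattern="neutron-server"):
    """Run ps on the local machine and report matching orphans."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,args"],
        capture_output=True, text=True, check=True,
    )
    return find_orphans(result.stdout, pattern)
```

Calling current_orphans() before and after a fullstack run should show whether workers were left behind, under the assumptions stated above.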

We clean up resources (including the neutron-server) with a kill -9. That leaves the children behind, and they're never cleaned up. The issue is easily reproducible; I'm just not sure what the correct solution is.
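
The effect can be shown in isolation with a toy forking process. The sketch below is not neutron code: PARENT_SRC is a made-up stand-in for a server that forks workers, and orphaned_workers_after_sigkill is a hypothetical helper. It assumes a POSIX system where os.fork is available.

```python
import os
import signal
import subprocess
import sys

# Toy stand-in for a forking server: fork N idle workers, print their PIDs,
# then idle in the parent.
PARENT_SRC = r"""
import os, sys, time
pids = []
for _ in range(int(sys.argv[1])):
    pid = os.fork()
    if pid == 0:
        time.sleep(60)   # worker: idle until killed
        os._exit(0)
    pids.append(pid)
print(" ".join(map(str, pids)), flush=True)
time.sleep(60)
"""


def is_alive(pid):
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True


def orphaned_workers_after_sigkill(workers=2):
    """SIGKILL the parent only and report which workers survive."""
    proc = subprocess.Popen(
        [sys.executable, "-c", PARENT_SRC, str(workers)],
        stdout=subprocess.PIPE, text=True,
    )
    child_pids = [int(p) for p in proc.stdout.readline().split()]
    proc.send_signal(signal.SIGKILL)  # same as kill -9 on the parent
    proc.wait()
    survivors = [pid for pid in child_pids if is_alive(pid)]
    for pid in survivors:  # clean up the orphans ourselves
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass
    proc.stdout.close()
    return child_pids, survivors
```

Because SIGKILL cannot be caught, the parent never gets to run any cleanup, so every worker PID is expected to show up in the survivors list.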

Changed in neutron:
status: Incomplete → Triaged
Jakub Libosvar (libosvar) wrote :

Isn't this related to the fact that we still use SIGKILL to terminate neutron-server? With SIGKILL, the parent neutron-server gets no chance to terminate its child processes, so they are orphaned. If neutron-server receives SIGTERM instead, it stops its child processes.
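
A minimal sketch of the SIGTERM behaviour described here, again with a toy process rather than neutron-server. The handler in PARENT_SRC and the helper workers_left_after_sigterm are illustrative assumptions, not how neutron-server is actually implemented. POSIX only.

```python
import os
import signal
import subprocess
import sys

# Toy parent that, on SIGTERM, terminates and reaps its workers before exiting.
PARENT_SRC = r"""
import os, signal, sys, time
pids = []

def on_term(signum, frame):
    for pid in pids:
        os.kill(pid, signal.SIGTERM)   # ask each worker to stop
    for pid in pids:
        os.waitpid(pid, 0)             # reap it so no zombie remains
    sys.exit(0)

for _ in range(int(sys.argv[1])):
    pid = os.fork()
    if pid == 0:
        time.sleep(60)   # worker keeps the default SIGTERM action
        os._exit(0)
    pids.append(pid)

signal.signal(signal.SIGTERM, on_term)
print(" ".join(map(str, pids)), flush=True)
time.sleep(60)
"""


def is_alive(pid):
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True


def workers_left_after_sigterm(workers=2):
    """SIGTERM the parent and report which workers are still running."""
    proc = subprocess.Popen(
        [sys.executable, "-c", PARENT_SRC, str(workers)],
        stdout=subprocess.PIPE, text=True,
    )
    child_pids = [int(p) for p in proc.stdout.readline().split()]
    proc.send_signal(signal.SIGTERM)  # catchable, unlike SIGKILL
    proc.wait()
    proc.stdout.close()
    return child_pids, [pid for pid in child_pids if is_alive(pid)]
```

Since the parent's handler runs before it exits, the list of remaining workers should be empty, in contrast to the kill -9 case.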

Jakub Libosvar (libosvar) wrote :

I just read the last paragraph of Assaf's comment and now I feel dumb... I should read first and then write. I apologize.

Assaf Muller (amuller) wrote :

The other bug wasn't tagged with 'fullstack' so I missed it.
