Comment 12 for bug 1895652

Dmitrii Shcherbakov (dmitriis) wrote :

Following up on #11.

Although it looked like an unrelated issue at first, after extensive rpdb tracing I found that the privsep daemon was hanging while trying to log a debug message for its reply:

neutron-openvswitch-agent -> request -> privsep daemon -> iproute command
                           hung here <- privsep daemon <-

(Pdb) n
> /usr/lib/python3/dist-packages/oslo_privsep/daemon.py(474)_call_back()
-> LOG.debug('privsep: reply[%(msgid)s]: %(reply)s',
(Pdb) n
> /usr/lib/python3/dist-packages/oslo_privsep/daemon.py(475)_call_back()
-> {'msgid': msgid, 'reply': reply})
<hung here>

In fact, any attempt to inspect the contents of the logger object from the debugger hung as well:
https://paste.ubuntu.com/p/2DFstDMCtP/

As a result, neutron-openvswitch-agent starts and begins to initialize its state, but it never gets as far as any RPC interaction with neutron-server. Narrowing it down shows that the agent hangs waiting for a response from the privsep daemon, and the privsep daemon in turn hangs while trying to do debug logging.
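
To make the chain above easier to picture, here is a toy model of it. This is only a sketch and not oslo.privsep code: BlockingHandler, run_demo and the queue-based request/reply channel are all hypothetical stand-ins, and the handler blocks on purpose because I am not claiming to know what the real logger was blocked on. It only shows that a responder stuck in a debug log call never sends its reply, while the same code with debug disabled replies normally.

```python
import logging
import queue
import threading


class BlockingHandler(logging.Handler):
    """Hypothetical handler whose emit() stalls until a gate is opened."""

    def __init__(self, gate):
        super().__init__()
        self.gate = gate

    def emit(self, record):
        # Stand-in for whatever the real logger was blocked on.
        self.gate.wait()


def run_demo(debug_enabled, timeout=0.5):
    """Return the reply seen by the caller, or None if it timed out."""
    gate = threading.Event()
    log = logging.getLogger("demo.privsep.%s" % debug_enabled)
    log.propagate = False
    log.handlers = [BlockingHandler(gate)]
    log.setLevel(logging.DEBUG if debug_enabled else logging.INFO)

    requests, replies = queue.Queue(), queue.Queue()

    def responder():
        # Plays the role of the privsep daemon: run the command, log, reply.
        msgid, cmd = requests.get()
        reply = "ran %s" % cmd
        # With DEBUG enabled this call never returns until the gate opens,
        # so the reply below is never sent in time.
        log.debug("reply[%(msgid)s]: %(reply)s",
                  {"msgid": msgid, "reply": reply})
        replies.put((msgid, reply))

    worker = threading.Thread(target=responder, daemon=True)
    worker.start()

    # Plays the role of the agent: send a request and wait for the reply.
    requests.put((1, "ip link show"))
    try:
        result = replies.get(timeout=timeout)
    except queue.Empty:
        result = None
    finally:
        gate.set()      # unblock the handler so the thread can exit
        worker.join()
    return result


if __name__ == "__main__":
    print("debug off:", run_demo(False))
    print("debug on: ", run_demo(True))
```

The timeout exists only so that the demo terminates; in the real case the agent simply waits forever.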

I confirmed that with debug logging disabled the agents are able to start and get a response from the privsep daemon (which obviously does not help in debugging the originally posted issue).