Bad initialization sequence in Neutron agents (and maybe somewhere else)

Bug #1594439 reported by Yuriy Taraday
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Won't Fix
Medium
Unassigned

Bug Description

TL;DR: running threads that modify global state (which can be any RPC calls, or any other library calls) in background while forking happens may cause weird hard-to-debug errors, so background jobs should be initialized after forking (or even better in separate process)

Currently at least metadata and l3 agents start background threads that periodically report state before forking and running main loop in child threads. Those threads can modify global state so that it is inconsistent in the time of forking which will lead to bad service behavior. In the ideal world main process shouldn't do anything except managing child processes to avoid any global state leakage into child processes (open fds, locked locks, etc).

This bug was uncovered during investigation into weird seemingly unrelated error in grenade job in Ann's CR that was changing to new oslo.db EngineFacade [0]. Symptoms were: SSHTimeout while instance's cloud-init is busy trying to get metadata and failing. In q-meta logs we noticed that there's no INFO messages about incoming HTTP requests, but there are some DEBUG ones, which means that requests are coming to the agent but are not being responded to. Digging deeper we noticed that in the normal operation metadata agent should:

- receive request;
- do RPC call to neutron-server;
- do HTTP request to Nova;
- send response.

There were no RPC "CALL" in logs, only CASTs from state reporting and the very first one CALL for state reporting.

Since it's hard to reproduce gate jobs, especially multinode ones, we've created another CR [1] that added tracing of every Python LOC to see what really happens. You can find the long log with all the tracing at [2] or in attachment (logs don't live forever). It uncovered the following chain of events:

- main thread in main process starts background thread for state reporting;
- some time later that thread starts reporting, wants to do the first CALL (it does CALL once and then it does CASTs);
- to get 'reply_q' (essentially, connection shared for replies IIUC), it acquires a lock;
- since there are no connections available at that time (it's the first time RPC is used), oslo_messaging starts connecting to RabbitMQ;
- background thread yield execution to main thread;
- main thread forks bunch of WSGI workers;
- in WSGI workers when requests come, handler tries to do CALL to neutron-server;
- to get 'reply_q' it tries to acquire a lock, but it is already has been "taken" by background thread in the main process;
- it hangs forever, which can be seen in Guru Meditation Report.

There are several problems here (including eventlet having not fork-aware locks), but the one that can be fixed to fix them all is to start such background threads after all forking happens. I've published CR [3] to verify that changing initialization order will fix this issue, and it did.

Note that from what I've been told, forks can still happen if child worker unexpectedly dies and main process reforks it. To properly fix this issue we should not do anything in the main process that can spoil global state that can influence child process. It means that we'll need to either run state reporting in a separate process or have an isolated oslo_messaging environment (context? IANA expert in oslo_messaging) for it.

[0] https://review.openstack.org/312393
[1] https://review.openstack.org/331485
[2] http://logs.openstack.org/85/331485/5/check/gate-grenade-dsvm-neutron-multinode/1c65056/logs/new/screen-q-meta.txt.gz
[3] https://review.openstack.org/331672

Revision history for this message
Yuriy Taraday (yorik-sar) wrote :
Changed in neutron:
assignee: nobody → Yuriy Taraday (yorik-sar)
status: New → In Progress
Changed in neutron:
assignee: Yuriy Taraday (yorik-sar) → Ann Kamyshnikova (akamyshnikova)
Changed in neutron:
assignee: Ann Kamyshnikova (akamyshnikova) → Yuriy Taraday (yorik-sar)
tags: added: needs-attention
Changed in neutron:
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/331672
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7eeeb99a55385b5fc3d4ce1a3b390e41be252b17
Submitter: Jenkins
Branch: master

commit 7eeeb99a55385b5fc3d4ce1a3b390e41be252b17
Author: Yuriy Taraday <email address hidden>
Date: Mon Jun 20 16:39:03 2016 +0300

    Move state reporting initialization to after worker forking

    We start state reporting thread before forking child processes. This
    leads to possibility of fork in the middle of reporting process. This
    can cause child processes to start with corrupted global state. In [0]
    it represented itself as metadata agent not being able to make RPC call
    because of lock acquired in main process before fork and released after
    fork. See bug description for details.

    [0] https://review.openstack.org/312393

    Partial-Bug: 1594439

    Change-Id: Id5079a296bc402c6f5b6cdb2df72811eab5bc6ed

Changed in neutron:
assignee: Yuriy Taraday (yorik-sar) → nobody
status: In Progress → Confirmed
tags: added: neutron-proactive-backport-potential
Changed in neutron:
assignee: nobody → Sindhu Devale (sindhu-devale-3)
Changed in neutron:
assignee: Sindhu Devale (sindhu-devale-3) → nobody
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/347394

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/347394
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=ae6a2ec2a560105c0300c07c72da3c82dce6b9f6
Submitter: Jenkins
Branch: stable/mitaka

commit ae6a2ec2a560105c0300c07c72da3c82dce6b9f6
Author: Yuriy Taraday <email address hidden>
Date: Mon Jun 20 16:39:03 2016 +0300

    Move state reporting initialization to after worker forking

    We start state reporting thread before forking child processes. This
    leads to possibility of fork in the middle of reporting process. This
    can cause child processes to start with corrupted global state. In [0]
    it represented itself as metadata agent not being able to make RPC call
    because of lock acquired in main process before fork and released after
    fork. See bug description for details.

    [0] https://review.openstack.org/312393

    Partial-Bug: 1594439

    Change-Id: Id5079a296bc402c6f5b6cdb2df72811eab5bc6ed
    (cherry picked from commit 7eeeb99a55385b5fc3d4ce1a3b390e41be252b17)

tags: added: in-stable-mitaka
tags: removed: neutron-proactive-backport-potential
tags: removed: needs-attention
Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Bug closed due to lack of activity, please feel free to reopen if needed.

Changed in neutron:
status: Confirmed → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.