Plugins have no way to do post-api/rpc worker fork initialization

Bug #1463129 reported by Terry Wilson
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Terry Wilson

Bug Description

There are several classes of plugin initialization that need to be handled *after* neutron server forks child processes for API/RPC workers. For example, any client sockets set up for SDN controllers before the fork are broken in the child processes: the file descriptors are copied over, so both parent and child processes may try to read from and write to the same fd. Also, if a thread is started pre-fork to handle polling or connection handling, child processes will not have that thread after the fork.

As an example, the networking-ovn project completely deadlocks when api/rpc workers != 0.
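
The missing-thread half of the problem can be reproduced outside neutron. The following is a minimal sketch, assuming a POSIX system where os.fork() is available; the function name count_threads_in_child is made up for illustration and has nothing to do with neutron's code. It starts a background thread, forks, and has the child report how many threads it still sees.

```python
import os
import threading


def count_threads_in_child():
    """Start a thread, fork, and return (parent_count, child_count)."""
    stop = threading.Event()
    # Stand-in for a polling/connection thread started before the fork.
    poller = threading.Thread(target=stop.wait, daemon=True)
    poller.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the thread that called fork() exists here.
        os.close(read_fd)
        os.write(write_fd, str(threading.active_count()).encode())
        os._exit(0)

    # Parent: read the child's report, then clean up.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_count = int(reader.read())
    os.waitpid(pid, 0)
    stop.set()
    poller.join()
    return parent_count, child_count


if __name__ == "__main__":
    parent_count, child_count = count_threads_in_child()
    print("threads in parent:", parent_count)
    print("threads in child:", child_count)
```

The parent still has its poller thread, while the child sees only the thread that called fork(), so anything the poller was supposed to do silently stops happening in the worker.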

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/189391

Changed in neutron:
assignee: nobody → Terry Wilson (otherwiseguy)
status: New → In Progress
Armando Migliaccio (armando-migliaccio) wrote :
Salvatore Orlando (salvatore-orlando) wrote :

We should probably just sit down and define a strategy for booting the bloody servers.
At this stage things happen pretty much randomly. For instance the quota engine is initialized because some module imports it.
If that changes in the future, quota enforcement will suddenly stop working. Similarly for the authZ policy.

And then the initialisation of the RPC handlers, and the WSGI server, divided in core server and extensions, is pure poetry.

At the end of the day we have a sort of "chaotic" bootstrap system, which somehow works but is rather unstable. While I do like chaotic systems, in this case a plain old system where stuff gets initialised one component at a time would work better.

Miguel Angel Ajo (mangelajo) wrote :

Salvatore's suggestion seems quite reasonable to me. :)

Kyle Mestery (mestery)
Changed in neutron:
importance: Undecided → Medium
Changed in neutron:
assignee: Terry Wilson (otherwiseguy) → Carl Baldwin (carl-baldwin)
Gal Sagie (gal-sagie) wrote :

Once this is fixed, the networking-ovn devstack plugin needs to be changed to support more API workers;
see review https://review.openstack.org/#/c/216837/

Changed in neutron:
assignee: Carl Baldwin (carl-baldwin) → Terry Wilson (otherwiseguy)
Changed in neutron:
milestone: none → liberty-3
Changed in neutron:
importance: Medium → High
Changed in neutron:
assignee: Terry Wilson (otherwiseguy) → Armando Migliaccio (armando-migliaccio)
Changed in neutron:
assignee: Armando Migliaccio (armando-migliaccio) → Terry Wilson (otherwiseguy)
Changed in neutron:
milestone: liberty-3 → liberty-rc1
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/189391
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9f6bd17703b7286be9e7d439d15f4dec2774e13a
Submitter: Jenkins
Branch: master

commit 9f6bd17703b7286be9e7d439d15f4dec2774e13a
Author: Terry Wilson <email address hidden>
Date: Mon Jun 15 22:52:28 2015 -0500

    Add support for PluginWorker and Process creation notification

    There are several cases where plugin initialization should be
    handled after neutron-server forks API/RPC workers. For example,
    starting a client connection to an SDN controller before forking
    copies the fd of the socket to the child process, but then you have
    multiple processes trying to read/write the same socket connection.

    It is also useful for a plugin to be able to do something in only
    one process, regardless of how many workers are forked. One example
    would be handling syncing from an external system to the neutron
    database.

    This patch does 3 things:
    1) Treats rpc_workers=0 as = 1. This simplifies the code for
       handling notification that forking has completed. In the
       existing code, calling the notification in the Worker object's
       start() method would happen twice in the case where both api
       and rpc workers were 0, despite there being only one process.
       An earlier patch already changed the default api_workers to be
       the number of processors.
    2) Adds notification of forking via the callbacks mechanism.
       Plugins can subscribe to resources.PROCESS, event.AFTER_CREATE
       and do any post-fork initialization that needs to be done for
       every spawned process.
    3) Adds core/service plugin calls to get_workers() which defaults
       to returning (). Plugins that need additional processes to spawn
       should just return an iterable of NeutronWorkers that will be
       spawned in their own process.

    DocImpact

    Closes-Bug: #1463129
    Change-Id: Ib99954678c2b4f32f486b537979d446aafbea07b
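
The subscribe-then-notify flow described in points 2 and 3 of the commit message can be sketched roughly as follows. This is a toy, self-contained stand-in and not neutron's callbacks module: the Registry class, the HypotheticalPlugin class, the string values of the constants, and the "api-worker-1" trigger are all assumptions made for illustration. Only the names PROCESS, AFTER_CREATE and get_workers() come from the commit message.

```python
from collections import defaultdict

# Stand-ins for the constants named in the commit message (values assumed).
PROCESS = "process"
AFTER_CREATE = "after_create"


class Registry:
    """Toy publish/subscribe registry; not neutron's implementation."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def subscribe(self, callback, resource, event):
        self._callbacks[(resource, event)].append(callback)

    def notify(self, resource, event, trigger):
        for callback in self._callbacks[(resource, event)]:
            callback(resource, event, trigger)


class HypotheticalPlugin:
    """Plugin that defers opening its connection until after the fork."""

    def __init__(self, registry):
        self.connection = None  # deliberately not opened before the fork
        registry.subscribe(self.post_fork_initialize, PROCESS, AFTER_CREATE)

    def post_fork_initialize(self, resource, event, trigger):
        # Runs once per spawned process, so each worker owns its own handle.
        self.connection = "connection owned by %s" % trigger

    def get_workers(self):
        # Default described in the commit message: no extra workers.
        return ()


registry = Registry()
plugin = HypotheticalPlugin(registry)
# A worker would emit this from its start() once it runs in its own process.
registry.notify(PROCESS, AFTER_CREATE, trigger="api-worker-1")
print(plugin.connection)
```

The point of the design is that nothing fork-sensitive happens in the plugin's constructor; the connection is created only when the per-process notification arrives.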

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/224334

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/224357

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/pecan)

Reviewed: https://review.openstack.org/224357
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=fdc3431ccd219accf6a795079d9b67b8656eed8e
Submitter: Jenkins
Branch: feature/pecan

commit fe236bdaadb949661a0bfb9b62ddbe432b4cf5f1
Author: Miguel Angel Ajo <email address hidden>
Date: Thu Sep 3 15:40:12 2015 +0200

    No network devices on network attached qos policies

    Network devices, like internal router legs, or dhcp ports
    should not be affected by bandwidth limiting rules.

    This patch disables application of network attached policies
    to network/neutron owned ports.

    Closes-bug: #1486039
    DocImpact

    Change-Id: I75d80227f1e6c4b3f5fa7762b8dc3b0c0f1abd46

commit db4a06f7caa20a4c7879b58b20e95b223ed8eeaf
Author: Ken'ichi Ohmichi <email address hidden>
Date: Wed Sep 16 10:04:32 2015 +0000

    Use tempest-lib's token_client

    Now tempest-lib provides token_client modules as library and the
    interface is stable. So neutron repository doesn't need to contain
    these modules.
    This patch makes neutron use tempest-lib's token_client and removes
    the own modules for the maintenance.

    Change-Id: Ieff7eb003f6e8257d83368dbc80e332aa66a156c

commit 78aed58edbe6eb8a71339c7add491fe9de9a0546
Author: Jakub Libosvar <email address hidden>
Date: Thu Aug 13 09:08:20 2015 +0000

    Fix establishing UDP connection

    Previously, in establish_connection() for UDP protocol data were sent
    but never read on peer socket. That led to successful read on peer side
    if this connection was filtered. Having constant testing string masked
    this issue as we can't distinguish to which test of connectivity data
    belong.

    This patch makes unique data string per test_connectivity() and
    also makes establish_connection() to create an ASSURED entry in
    conntrack table. Finally, in last test after firewall filter was
    removed, connection is re-established in order to avoid troubles with
    terminated processes or TCP continuing sending packets which weren't
    successfully delivered.

    Closes-Bug: 1478847
    Change-Id: I2920d587d8df8d96dc1c752c28f48ba495f3cf0f

commit e6292fcdd6262434a7b713ad8802db6bc8a6d3dc
Author: YAMAMOTO Takashi <email address hidden>
Date: Wed Sep 16 13:20:51 2015 +0900

    ovsdb: Fix a few docstring

    Change-Id: I53e1e21655b28fe5da60e58aeeb7cbbd103ae014

commit c22949a4449d96a67caa616290cf76b67b182917
Author: fumihiko kakuma <email address hidden>
Date: Wed Sep 16 11:52:59 2015 +0900

    Remove requirements.txt for the ofagent mechanism driver

    It is no longer used.

    Related-Blueprint: core-vendor-decomposition
    https://blueprints.launchpad.net/neutron/+spec/core-vendor-decomposition

    Change-Id: Ib31fb3febf8968e50d86dd66e1e6e1ea2313f8ac

commit d1d4de19d85f961d388c91e70f31b3bafec418c5
Author: Kevin Benton <email address hidden>
Date: Thu Sep 3 20:25:57 2015 -0700

    Always return iterables in L3 get_candidates

    The caller of this function expects iterables.

    Closes-Bug: #1494996
    Change-Id: I3d103e63f4e127a77268502415c0ddb0d804b54a

commit 1ad6ac448067306...

tags: added: in-feature-pecan
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (feature/pecan)

Change abandoned by Doug Wiegley (<email address hidden>) on branch: feature/pecan
Review: https://review.openstack.org/224334

Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-rc1 → 7.0.0