L3 advanced services singleton pattern messing up L3 agent functional testing

Bug #1425759 reported by Assaf Muller
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Undecided
Assaf Muller

Bug Description

The idea behind the AdvancedServices (Metadata, *aaS) services to be singletons was to (1) offer a way to get an instance of such a service in a convenient manner, and to (2) ensure that only one service registered per service type.

(1) Since the AdvancedService.instance method required an instance of a L3 agent, this point was kind of missed. If you have a L3 agent instance already in your hand, just use it to access the service you're looking for.

(2) This is now fulfilled by asserting that only one service is registered (per type) in the .add method.

The motivation to make them non-singletons is that:
a) They don't need to be
b) The code is simplified this way
c) I've been facing crazy issues during functional testing where we're constantly instantiating L3 agents, and the (singleton) services were referencing the wrong configuration object. Basically the service was re-initialized during a test, screwing up the configuration of other mid-run tests.

Bug triage (Some of this is my *guess* as to what's wrong, other parts are facts):
tox -e dsvm-functional neutron.tests.functional.agent.test_l3_agent by default spawns a test runner per CPU on the machine it is running on. So, for 4 CPUs, 4 workers that work in parallel. Each worker (a process) shares memory space so that while each test registers its own agent, they all share the same AdvancedServices (Specifically the metadata driver advanced service). The metadata advanced service has a reference back to the configuration of the L3 agent that started it (Namely for the different directory paths like PIDs). The first test initializes the (single) metadata advanced service instance and sets up its configuration to that of the test's L3 agent. If it finishes before the next test initializes its agent, fantastic, as (With a horrible hack by myself: https://review.openstack.org/#/c/143300/2/neutron/tests/common/agents/l3_agent.py) each initialization of the test L3 agent re-creates all of the registered AdvancedServices, thereby pointing them to that new agent. However, if this happens *during* the run of another test (In the same worker; I don't know how this could happen as tests seem to run serially in a single worker) Bad Things Happen (TM). Specifically, the PIDs directory path is now different than when it was when the metadata proxy was spawned and so when you assert that the metadata proxy is up, it's looking at the wrong path for PIDs that don't exist (But do exist elsewhere on the disk) and concludes that no, the metadata proxy is not up. What is absolutely true is that I observed the metadata advanced driver starting the proxy under one directory path and then the test looking if its up in another directory path. The solution to this mess is have the AdvancedServices not be singleton. Each agent instantiates its AdvancedServices and keeps references to them.

Assaf Muller (amuller)
Changed in neutron:
status: New → In Progress
assignee: nobody → Assaf Muller (amuller)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-vpnaas (master)

Reviewed: https://review.openstack.org/158487
Committed: https://git.openstack.org/cgit/openstack/neutron-vpnaas/commit/?id=67c983b9b38afceed35ed901a1d35aae7a7048f4
Submitter: Jenkins
Branch: master

commit 67c983b9b38afceed35ed901a1d35aae7a7048f4
Author: Assaf Muller <email address hidden>
Date: Mon Feb 23 18:20:06 2015 -0500

    Change L3 agent AdvancedService class to be non-singleton

    The idea behind the AdvancedServices (Metadata, *aaS) services
    to be singletons was to (1) offer a way to get an instance of such
    a service in a convinient manner, and to (2) ensure that only one
    service registered per service type.

    (1) Since the AdvancedService.instance method required an instance
    of a L3 agent, this point was kind of missed. If you have a L3 agent
    instance in your hand, just use it to access the service you're looking
    for.

    (2) This is now fulfilled by asserting that only one driver is registered
    (per type) in the .add method.

    The motivation to make them non-singletons is that:
    a) They don't need to be
    b) The code is simplified this way
    c) I've been facing crazy issues during functional testing where we're
       constantly instantiating L3 agents, and the (singleton) metadata
       service was referencing the wrong configuration object. The service
       was re-initialized during a test, screwing up the configuration of
       other mid-run tests.

    Related-Bug: #1425759
    Change-Id: I199f6a967c64e921624b397fa5b7dfc56572d63a

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-fwaas (master)

Reviewed: https://review.openstack.org/158486
Committed: https://git.openstack.org/cgit/openstack/neutron-fwaas/commit/?id=93b1f892f83d7e882c7f0fbfa7cb9e68679b76e5
Submitter: Jenkins
Branch: master

commit 93b1f892f83d7e882c7f0fbfa7cb9e68679b76e5
Author: Assaf Muller <email address hidden>
Date: Mon Feb 23 18:11:18 2015 -0500

    Change L3 agent AdvancedService class to be non-singleton

    The idea behind the AdvancedServices (Metadata, *aaS) services
    to be singletons was to (1) offer a way to get an instance of such
    a service in a convinient manner, and to (2) ensure that only one
    service registered per service type.

    (1) Since the AdvancedService.instance method required an instance
    of a L3 agent, this point was kind of missed. If you have a L3 agent
    instance in your hand, just use it to access the service you're looking
    for.

    (2) This is now fulfilled by asserting that only one driver is registered
    (per type) in the .add method.

    The motivation to make them non-singletons is that:
    a) They don't need to be
    b) The code is simplified this way
    c) I've been facing crazy issues during functional testing where we're
       constantly instantiating L3 agents, and the (singleton) metadata
       service was referencing the wrong configuration object. The service
       was re-initialized during a test, screwing up the configuration of
       other mid-run tests.

    Related-Bug: #1425759
    Change-Id: I155aece9b7028f1bdd3d9709facef392b38aec27

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/158468
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=911f8b57f80798ec8fe3c82282fb4c812cc9472c
Submitter: Jenkins
Branch: master

commit 911f8b57f80798ec8fe3c82282fb4c812cc9472c
Author: Assaf Muller <email address hidden>
Date: Mon Feb 23 17:07:30 2015 -0500

    Change L3 agent AdvancedService class to be non-singleton

    The idea behind the AdvancedServices (Metadata, *aaS) services
    to be singletons was to (1) offer a way to get an instance of such
    a service in a convinient manner, and to (2) ensure that only one
    service registered per service type.

    (1) Since the AdvancedService.instance method required an instance
    of a L3 agent, this point was kind of missed. If you have a L3 agent
    instance in your hand, just use it to access the service you're looking
    for.

    (2) This is now fulfilled by asserting that only one driver is registered
    (per type) in the .add method.

    The motivation to make them non-singletons is that:
    a) They don't need to be
    b) The code is simplified this way
    c) I've been facing crazy issues during functional testing where we're
       constantly instantiating L3 agents, and the (singleton) metadata
       service was referencing the wrong configuration object. The service
       was re-initialized during a test, screwing up the configuration of
       other mid-run tests.

    Closes-Bug: #1425759
    Depends-On: I155aece9b7028f1bdd3d9709facef392b38aec27
    Depends-On: I199f6a967c64e921624b397fa5b7dfc56572d63a
    Change-Id: I192e4d12380346c53c8ae0246371ccc41a836c4c

Changed in neutron:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in neutron:
milestone: none → kilo-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: kilo-3 → 2015.1.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.