get_local_service_ip conflicts across multiple astara nodes

Bug #1524068 reported by Adam Gandelman on 2015-12-08
Affects   Status   Importance   Assigned to      Milestone
Astara             Critical     Adam Gandelman
Liberty            Undecided    Unassigned

Bug Description

When clustering the orchestrator across multiple nodes, each node attempts to create a neutron port and plug a local interface. The current code hard-codes the address used for this port as the first address on the management network. When a second node comes online and attempts to create its port, the port creation will fail with:

 AddressInUseClient: Unable to complete operation for network bc050e3d-39f3-4012-9593-7d4e19325b72. The IP address fdca:3ba5:a17a:acda::1 is in use.

We need to make the port creation more intelligent to ensure each orchestrator node is attempting to create a port for an unused address.
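
The conflict described above can be reproduced with a small simulation. This is only a sketch: FakeNetwork, AddressInUse, and first_address are hypothetical stand-ins written for illustration, not Neutron or Astara code. The subnet prefix is taken from the error message above, assuming a /64.

```python
import ipaddress


class AddressInUse(Exception):
    """Stand-in for the error Neutron returns when a fixed IP is taken."""


class FakeNetwork:
    """Hypothetical in-memory IP allocator for a single subnet."""

    def __init__(self, cidr):
        self.subnet = ipaddress.ip_network(cidr)
        self.allocated = set()

    def create_port(self, fixed_ip=None):
        if fixed_ip is not None:
            # Caller insists on a specific address: fail if it is taken.
            ip = ipaddress.ip_address(str(fixed_ip))
            if ip in self.allocated:
                raise AddressInUse(f"The IP address {ip} is in use.")
        else:
            # No address requested: hand out the first unused host address.
            ip = next(h for h in self.subnet.hosts()
                      if h not in self.allocated)
        self.allocated.add(ip)
        return ip


def first_address(net):
    """The hard-coded choice: always the first address on the subnet."""
    return str(next(net.subnet.hosts()))


net = FakeNetwork("fdca:3ba5:a17a:acda::/64")

# Node 1 claims the first address and succeeds.
node1 = net.create_port(fixed_ip=first_address(net))

# Node 2 hard-codes the same address and is rejected.
try:
    net.create_port(fixed_ip=first_address(net))
    conflict = False
except AddressInUse:
    conflict = True

# Letting the allocator choose gives node 2 an unused address instead.
node2 = net.create_port()
```

In this sketch the second hard-coded request fails, while the request without a fixed address succeeds, which is the behaviour the bug asks for.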

Changed in astara:
importance: Undecided → Critical

Fix proposed to branch: master
Review: https://review.openstack.org/254998

Changed in astara:
status: New → In Progress

Fix proposed to branch: master
Review: https://review.openstack.org/255612

Reviewed: https://review.openstack.org/254998
Committed: https://git.openstack.org/cgit/openstack/astara/commit/?id=eee3aed5116be285b1ce2e8cdf2520504cbfcf3a
Submitter: Jenkins
Branch: master

commit eee3aed5116be285b1ce2e8cdf2520504cbfcf3a
Author: Adam Gandelman <email address hidden>
Date: Tue Dec 8 15:07:13 2015 -0800

    Dynamically allocate service port addresses

    We currently hard-code the address for management and external
    ports to the first address on the subnet, which breaks clustering
    astara-orchestrators when two nodes attempt to create neutron ports
    with the same addresses. This updates usage to instead rely on
    a Neutron-assigned address to bring up locally.

    Note this is a partial fix: we'll need corresponding changes that
    allow us to push in this address to appliances for metadata access,
    which is hard-coded there as well to the same address.

    Change-Id: I88fa97bae84ca245afa5ad0da4ac3c0bc1c441ff
    Partial-bug: #1524068

Changed in astara:
milestone: none → mitaka-2

Reviewed: https://review.openstack.org/263943
Committed: https://git.openstack.org/cgit/openstack/astara/commit/?id=1ba014ab2d0407d8f009b052ca803a9cc1fa565a
Submitter: Jenkins
Branch: stable/liberty

commit 1ba014ab2d0407d8f009b052ca803a9cc1fa565a
Author: Adam Gandelman <email address hidden>
Date: Tue Dec 8 15:07:13 2015 -0800

    Dynamically allocate service port addresses

    We currently hard-code the address for management and external
    ports to the first address on the subnet, which breaks clustering
    astara-orchestrators when two nodes attempt to create neutron ports
    with the same addresses. This updates usage to instead rely on
    a Neutron-assigned address to bring up locally.

    Note this is a partial fix: we'll need corresponding changes that
    allow us to push in this address to appliances for metadata access,
    which is hard-coded there as well to the same address.

    Change-Id: I88fa97bae84ca245afa5ad0da4ac3c0bc1c441ff
    Partial-bug: #1524068
    (cherry picked from commit eee3aed5116be285b1ce2e8cdf2520504cbfcf3a)

tags: added: in-stable-liberty

Reviewed: https://review.openstack.org/255613
Committed: https://git.openstack.org/cgit/openstack/astara-appliance/commit/?id=44610ac1cd3dca771307aba4bd3c24ed01a4acc1
Submitter: Jenkins
Branch: master

commit 44610ac1cd3dca771307aba4bd3c24ed01a4acc1
Author: Adam Gandelman <email address hidden>
Date: Wed Dec 9 16:45:36 2015 -0800

    Accept new orchestrator config bucket

    This adds the ability for the orchestrator to add a new bucket
    into the config dict keyed 'orchestrator', which can be used to
    notify the appliance of the specifics about the orchestrator currently
    managing it. Initially this will be used to inform the appliance where
    the metadata service is running, but in the future could be extended
    to do more, specifically around coordination.

    Change-Id: I4a4009f12ce025d3dc6577d27f877aeb8427b963
    Partial-bug: #1524068
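
    A minimal sketch of how an appliance might read such a bucket. The
    'orchestrator' key comes from the commit message above; the 'address'
    key, the function name, and the fallback value are assumptions made
    for illustration and are not taken from the astara-appliance code.

```python
# Illustrative legacy default: the first management address shown in the
# bug report's error message. The real hard-coded value may differ.
LEGACY_METADATA_ADDRESS = "fdca:3ba5:a17a:acda::1"


def metadata_address(config):
    """Return the address the appliance should use for metadata.

    Prefers the 'orchestrator' bucket when present and falls back to the
    legacy address for configs sent by older orchestrators.
    """
    bucket = config.get("orchestrator") or {}
    # 'address' is a hypothetical key name chosen for this sketch.
    return bucket.get("address", LEGACY_METADATA_ADDRESS)


old_style = {"hostname": "appliance-1"}
new_style = {
    "hostname": "appliance-1",
    "orchestrator": {"address": "fdca:3ba5:a17a:acda::2"},
}
```

    Falling back to the old value keeps an appliance usable when the
    orchestrator managing it has not been upgraded yet.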

Reviewed: https://review.openstack.org/255612
Committed: https://git.openstack.org/cgit/openstack/astara/commit/?id=568ea90f5a52331c64f81302303ced72143e50a1
Submitter: Jenkins
Branch: master

commit 568ea90f5a52331c64f81302303ced72143e50a1
Author: Adam Gandelman <email address hidden>
Date: Wed Dec 9 16:48:15 2015 -0800

    Push orchestrator config into the appliance

    This pushes a couple of flags into the appliance that are specific to the
    individual orchestrator instance managing that appliance. Initially, we use
    it to tell the appliance where the metadata proxy is listening. Previously,
    this was hard-coded to a known address on the network. With multiple
    orchestrators in a clustered env, this will allow each to run their own
    metadata proxy and have only their managed appliances querying that.

    Another patch will follow that will ensure this is up to date when rebalances
    occur and orchestrators take over new appliances.

    Change-Id: Ib502507b29f17146da81f61f34957cd96a1548f4
    Partial-bug: #1524068
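
    The orchestrator side of this could look roughly like the sketch
    below. build_appliance_config, the key names inside the bucket, and
    the port number are hypothetical; only the idea of a per-orchestrator
    'orchestrator' bucket comes from the commit messages in this report.

```python
import copy


def build_appliance_config(base_config, metadata_address, metadata_port):
    """Return a copy of base_config with this orchestrator's details added.

    The base config is shared, so it is deep-copied rather than mutated.
    """
    config = copy.deepcopy(base_config)
    config["orchestrator"] = {
        # Hypothetical key names for this sketch.
        "address": metadata_address,
        "metadata_port": metadata_port,
    }
    return config


base = {"hostname": "appliance-1", "networks": []}

# Two orchestrators in a cluster, each advertising its own metadata proxy.
# The port number is arbitrary.
from_node_a = build_appliance_config(base, "fdca:3ba5:a17a:acda::2", 9697)
from_node_b = build_appliance_config(base, "fdca:3ba5:a17a:acda::3", 9697)
```

    Each appliance receives only the bucket built by the orchestrator that
    manages it, so it queries that orchestrator's proxy and no other.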

Reviewed: https://review.openstack.org/259216
Committed: https://git.openstack.org/cgit/openstack/astara/commit/?id=f2360d861f3904c8a06d94175be553fe5e7bab05
Submitter: Jenkins
Branch: master

commit f2360d861f3904c8a06d94175be553fe5e7bab05
Author: Adam Gandelman <email address hidden>
Date: Thu Dec 17 15:16:35 2015 -0800

    Cleanup SM management during rebalance events.

    This cleans up the worker's handling of rebalance events a bit
    and ensures we don't drop state machines in a way that prevents
    them from later being recreated. It also avoids a bug where, upon
    failing over resources to a new orchestrator, we create a state
    machine per worker, instead of dispatching them to one single worker.

    To do this, the scheduler is passed into workers as well as the
    process name, allowing them to more intelligently figure out what
    they need to manage after a cluster event.

    Finally, this ensures a config update is issued to appliances after
    they have moved to a new orchestrator after a cluster event.

    Change-Id: I76bf702c33ac6ff831270e7185a6aa3fc4c464ca
    Partial-bug: #1524068
    Closes-bug: #1527396
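
    One way to get "one single worker" per resource is for every worker to
    apply the same deterministic mapping from resource ID to worker name.
    The sketch below assumes UUID resource IDs and a fixed, ordered list of
    worker names; pick_worker and should_manage are hypothetical helpers,
    not the functions changed by this commit.

```python
import uuid


def pick_worker(resource_id, worker_names):
    """Map a resource UUID onto exactly one worker, deterministically."""
    idx = uuid.UUID(resource_id).int % len(worker_names)
    return worker_names[idx]


def should_manage(resource_id, my_name, worker_names):
    """True only for the worker that owns this resource."""
    return pick_worker(resource_id, worker_names) == my_name


workers = ["worker-0", "worker-1", "worker-2", "worker-3"]

# Example ID only: the network UUID quoted in the bug description.
resource = "bc050e3d-39f3-4012-9593-7d4e19325b72"

# Every worker evaluates the same rule after a rebalance; only one claims it.
owners = [w for w in workers if should_manage(resource, w, workers)]
```

    Because each worker knows its own name and the full worker list, it can
    decide locally whether to create a state machine, without creating a
    duplicate on every worker.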

Changed in astara:
assignee: nobody → Adam Gandelman (gandelman-a)
status: In Progress → Fix Committed
Changed in astara:
status: Fix Committed → Fix Released

Change abandoned by Adam Gandelman (<email address hidden>) on branch: master
Review: https://review.openstack.org/260748
