subnetpool allocation not working with postgresql

Bug #1451558 reported by Cedric Brandily
Affects  Status        Importance  Assigned to      Milestone
neutron  Fix Released  High        Cedric Brandily
Kilo     New           Undecided   Unassigned

Bug Description

The following works with MySQL but not with PostgreSQL:

#$ neutron subnetpool-create pool --pool-prefix 10.0.0.0/8 --default-prefixlen 24
#$ neutron net-create net
#$ neutron subnet-create net --name subnet --subnetpool pool

The last command fails with a 501 on PostgreSQL, and neutron-server logs the stacktrace[2], because _get_allocated_cidrs[1] performs a SELECT FOR UPDATE with an outer JOIN on an empty select (allowed with MySQL, not with PostgreSQL).
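
For a sense of what the commands above ask the allocator to carve up, here is a small sketch using only the Python standard library. The variable names are illustrative, and the report does not say which CIDR neutron would actually pick; the sketch only shows the candidate space for a 10.0.0.0/8 pool with a default prefix length of 24.

```python
import ipaddress

# Values taken from the subnetpool-create command in the report.
pool_prefix = ipaddress.ip_network("10.0.0.0/8")
default_prefixlen = 24

# Number of distinct /24 networks inside the /8: 2 ** (24 - 8).
candidate_count = 2 ** (default_prefixlen - pool_prefix.prefixlen)

# First candidate in address order (an assumption for illustration,
# not necessarily the subnet neutron would allocate).
first_candidate = next(pool_prefix.subnets(new_prefix=default_prefixlen))

print(candidate_count, first_candidate)
```

On the very first allocation none of these candidates is in use yet, which is the "empty select" situation the report describes.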

[1]: https://github.com/openstack/neutron/blob/5962d825a6c98225c51bc6dd304b5c1ac89035ef/neutron/ipam/subnet_alloc.py#L40-L44
  query = session.query(models_v2.Subnet).with_lockmode('update')
  subnets = query.filter_by(subnetpool_id=self._subnetpool['id'])

[2]: neutron-server stacktrace
2015-05-04 21:47:01.939 ERROR neutron.api.v2.resource [req-a6c14f61-bdb2-4273-a231-df0a85fb33d8 demo b532b7a9302c45b18f06f68b41869ffa] create failed
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource Traceback (most recent call last):
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/opt/stack/neutron/neutron/api/v2/resource.py", line 83, in resource
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource result = method(request=request, **args)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/opt/stack/neutron/neutron/api/v2/base.py", line 461, in create
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource obj = obj_creator(request.context, **kwargs)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/opt/stack/neutron/neutron/plugins/ml2/plugin.py", line 804, in create_subnet
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource result, mech_context = self._create_subnet_db(context, subnet)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/opt/stack/neutron/neutron/plugins/ml2/plugin.py", line 795, in _create_subnet_db
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource result = super(Ml2Plugin, self).create_subnet(context, subnet)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/opt/stack/neutron/neutron/db/db_base_plugin_v2.py", line 1389, in create_subnet
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource subnetpool_id)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 131, in wrapper
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource return f(*args, **kwargs)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/opt/stack/neutron/neutron/db/db_base_plugin_v2.py", line 1283, in _create_subnet_from_pool
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource ipam_subnet = allocator.allocate_subnet(context.session, req)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/opt/stack/neutron/neutron/ipam/subnet_alloc.py", line 141, in allocate_subnet
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource return self._allocate_any_subnet(session, request)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/opt/stack/neutron/neutron/ipam/subnet_alloc.py", line 93, in _allocate_any_subnet
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource prefix_pool = self._get_available_prefix_list(session)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/opt/stack/neutron/neutron/ipam/subnet_alloc.py", line 48, in _get_available_prefix_list
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource allocations = self._get_allocated_cidrs(session)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/opt/stack/neutron/neutron/ipam/subnet_alloc.py", line 44, in _get_allocated_cidrs
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource return (x.cidr for x in subnets)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2441, in __iter__
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource return self._execute_and_instances(context)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2456, in _execute_and_instances
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource result = conn.execute(querycontext.statement, self._params)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 841, in execute
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource return meth(self, multiparams, params)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py", line 322, in _execute_on_connection
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource return connection._execute_clauseelement(self, multiparams, params)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 938, in _execute_clauseelement
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource compiled_sql, distilled_params
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1070, in _execute_context
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource context)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/compat/handle_error.py", line 261, in _handle_dbapi_exception
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource e, statement, parameters, cursor, context)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1267, in _handle_dbapi_exception
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource util.raise_from_cause(newraise, exc_info)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource reraise(type(exception), exception, tb=exc_tb)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1063, in _execute_context
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource context)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 442, in do_execute
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource cursor.execute(statement, parameters)
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource DBError: (NotSupportedError) FOR UPDATE cannot be applied to the nullable side of an outer join
2015-05-04 21:47:01.939 TRACE neutron.api.v2.resource 'SELECT subnets.tenant_id AS subnets_tenant_id, subnets.id AS subnets_id, subnets.name AS subnets_name, subnets.network_id AS subnets_network_id, subnets.subnetpool_id AS subnets_subnetpool_id, subnets.ip_version AS subnets_ip_version, subnets.cidr AS subnets_cidr, subnets.gateway_ip AS subnets_gateway_ip, subnets.enable_dhcp AS subnets_enable_dhcp, subnets.shared AS subnets_shared, subnets.ipv6_ra_mode AS subnets_ipv6_ra_mode, subnets.ipv6_address_mode AS subnets_ipv6_address_mode, ipallocationpools_1.id AS ipallocationpools_1_id, ipallocationpools_1.subnet_id AS ipallocationpools_1_subnet_id, ipallocationpools_1.first_ip AS ipallocationpools_1_first_ip, ipallocationpools_1.last_ip AS ipallocationpools_1_last_ip, dnsnameservers_1.address AS dnsnameservers_1_address, dnsnameservers_1.subnet_id AS dnsnameservers_1_subnet_id, subnetroutes_1.destination AS subnetroutes_1_destination, subnetroutes_1.nexthop AS subnetroutes_1_nexthop, subnetroutes_1.subnet_id AS subnetroutes_1_subnet_id \nFROM subnets LEFT OUTER JOIN ipallocationpools AS ipallocationpools_1 ON subnets.id = ipallocationpools_1.subnet_id LEFT OUTER JOIN dnsnameservers AS dnsnameservers_1 ON subnets.id = dnsnameservers_1.subnet_id LEFT OUTER JOIN subnetroutes AS subnetroutes_1 ON subnets.id = subnetroutes_1.subnet_id \nWHERE subnets.subnetpool_id = %(subnetpool_id_1)s FOR UPDATE' {'subnetpool_id_1': u'2ee02a0f-863a-41d6-a5cb-7c629395b9da'}

Changed in neutron:
assignee: nobody → Cedric Brandily (cbrandily)
Changed in neutron:
importance: Undecided → High
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/179955

Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/179955
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=3682e3391f188845d0c7f382f0ccd4b38db3904e
Submitter: Jenkins
Branch: master

commit 3682e3391f188845d0c7f382f0ccd4b38db3904e
Author: Cedric Brandily <email address hidden>
Date: Mon May 4 23:36:19 2015 +0200

    Ensure non-overlapping cidrs in subnetpools without galera

    _get_allocated_cidrs[1] locks only the subnets already allocated in a
    subnetpool (with mysql/postgresql at least). This ensures we don't
    allocate a cidr overlapping with existing cidrs, but nothing prevents
    a concurrent subnet allocation from creating a subnet in the same
    subnetpool.

    This change replaces the lock on the subnetpool's subnets with a lock
    on the subnetpool itself. It prevents 2 subnets from being allocated
    concurrently in the same subnetpool and ensures non-overlapping cidrs
    in the same subnetpool.

    Moreover, this change solves a problem with postgresql, which does not
    allow locking an empty select with an outer join: this happens on the
    first subnet allocation in a subnetpool when no specific cidr is
    provided. Moving the lock ensures it is taken on a non-empty select.

    However, this change does not ensure non-overlapping cidrs in
    subnetpools with galera, because galera doesn't support SELECT FOR
    UPDATE locks. A follow-up change will (try to?) remove locks from
    subnet allocation[1] in order to ensure non-overlapping cidrs in
    subnetpools with galera as well.

    [1] in neutron.ipam.subnet_alloc.SubnetAllocator

    Closes-Bug: #1451558
    Partial-Bug: #1451576
    Change-Id: I73854f9863f44621ae0d89c5dc4893ccc16d07e4
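
The locking change described in this commit message can be illustrated with an in-memory stand-in. This is only a sketch under stated assumptions, not the code from the fix: PoolSketch, allocate_any and allocate_concurrently are hypothetical names, a threading.Lock plays the role of the SELECT FOR UPDATE on the subnetpool row, and a Python list plays the role of the subnets table.

```python
import ipaddress
import threading


class PoolSketch:
    """Hypothetical in-memory stand-in for a subnetpool; not neutron code."""

    def __init__(self, prefix, default_prefixlen):
        self.prefix = ipaddress.ip_network(prefix)
        self.default_prefixlen = default_prefixlen
        self.allocated = []           # stands in for the subnets table
        self.lock = threading.Lock()  # stands in for locking the pool row

    def allocate_any(self):
        # The lock belongs to the pool itself, so there is something to
        # lock even when no subnet has been allocated yet.
        with self.lock:
            for candidate in self.prefix.subnets(new_prefix=self.default_prefixlen):
                if not any(candidate.overlaps(used) for used in self.allocated):
                    self.allocated.append(candidate)
                    return candidate
        raise RuntimeError("pool exhausted")


def allocate_concurrently(pool, workers):
    """Run `workers` allocations in parallel threads and collect the results."""
    results = []

    def worker():
        results.append(pool.allocate_any())

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


pool = PoolSketch("10.0.0.0/8", 24)
cidrs = allocate_concurrently(pool, 8)
print(sorted(cidrs))
```

Because every allocation in the same pool goes through one lock, the check for overlaps and the insertion happen as a unit, so two concurrent requests cannot both pick the same free CIDR. As the commit message notes, this reasoning relies on the database honouring the lock, which is not the case with galera.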

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/191045

Thierry Carrez (ttx)
Changed in neutron:
milestone: none → liberty-1
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/qos)

Fix proposed to branch: feature/qos
Review: https://review.openstack.org/196097

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/qos)

Reviewed: https://review.openstack.org/196097
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=1cfed745d54a6ce9cb3dd4e6f454666d9e6676c2
Submitter: Jenkins
Branch: feature/qos

commit ba7d673d1ddd5bfa5aa1be5b26a59e9a8cd78a9f
Author: Kevin Benton <email address hidden>
Date: Thu Jun 25 18:31:38 2015 -0700

    Remove duplicated call to setup_coreplugin

    The test case for vlan_transparent was calling setup_coreplugin
    before calling the super setUp method which already calls
    setup_coreplugin. This was causing duplicate core plugin fixtures
    which resulted in patching the dhcp periodic check twice.

    Change-Id: Ide4efad42748e799d8e9c815480c8ffa94b27b38
    Partial-Bug: #1468998

commit e64062efa3b793f7c4ce4ab9e62918af4f1bfcc9
Author: Kevin Benton <email address hidden>
Date: Thu Jun 25 18:29:37 2015 -0700

    Remove double mock of dhcp agent periodic check

    The test case for the periodic check was patching a target
    that the core plugin fixture already patched out. This removes
    that and exposes the mock from the fixture so the test case
    can reference it.

    Change-Id: I3adee6a875c497e070db4198567b52aa16b81ce8
    Partial-Bug: #1468998

commit 25ae0429a713143d42f626dd59ed4514ba25820c
Author: Kevin Benton <email address hidden>
Date: Thu Jun 25 18:24:10 2015 -0700

    Remove double fanout mock

    The test_mech_driver was duplicating a fanout mock already setup
    in the setUp routine.

    Change-Id: I5b88dff13113d55c72241d3d5025791a76672ac2
    Partial-Bug: #1468998

commit 993771556332d9b6bbf7eb3f0300cf9d8a2cb464
Author: Kevin Benton <email address hidden>
Date: Thu Jun 25 17:55:16 2015 -0700

    Remove double callback manager mocks

    setup_test_registry_instance() in the base test case class gives
    each test its own registry by mocking out the get_callback_manager.
    The L3 agent test cases were duplicating this.

    Partial-Bug: #1468998
    Change-Id: I7356daa846524611e9f92365939e8ad15d1e1cd8

commit 0be1efad93734f11cd63fb3b7bd2983442ce1268
Author: Kevin Benton <email address hidden>
Date: Thu Jun 25 16:57:30 2015 -0700

    Remove ensure_dirs double-patch

    test_spawn_radvd called mock.patch on ensure_dirs after the
    setup method already patched it out. This causes issues when
    mock.patch.stopall() is called because the mocks are stored
    as a set and are unwound in a non-deterministic fashion.[1]
    So some of the time they will be undone correctly, but others
    will leave a monkey-patched in mock, causing the ensure_dir
    test to fail.

    1. http://bugs.python.org/issue21239

    Closes-Bug: #1467908
    Change-Id: I321b5fed71dc73bd19b5099311c6f43640726cd4

commit 0a2238e34e72c17ca8a75e36b1f56e41a3ece74e
Author: Sukhdev Kapur <email address hidden>
Date: Thu Jun 25 15:11:28 2015 -0700

    Fix tenant-id in Arista ML2 driver to support HA router

    When HA router is created, the framework creates a network and does
    not specify the tenant-id. This causes the Arista ML2 driver to fail.
    This patch sets the tenant-id when it is not passed explicitly
    by the network_create() call from the HA r...

tags: added: in-feature-qos
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/196701

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (feature/pecan)

Change abandoned by Kyle Mestery (<email address hidden>) on branch: feature/pecan
Review: https://review.openstack.org/196701
Reason: This is lacking the functional fix [1], so I'll propose a new merge commit which includes that one.

[1] https://review.openstack.org/#/c/196711/

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/196920

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/pecan)

Reviewed: https://review.openstack.org/196920
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7f759c077f8f860c13db92d2ea6b353ef6b70900
Submitter: Jenkins
Branch: feature/pecan

commit 8123144fadd7c5d5e6e56a76ea860512619a2cf6
Author: Moshe Levi <email address hidden>
Date: Sun Jun 28 14:37:14 2015 +0300

    Fix Consolidate sriov agent and driver code

    This patch adds the missing __init to mech_sriov/mech_driver/
    and updates setup.cfg to the new agent entrypoint

    Trivial Fix

    Change-Id: I53a527081feb78472f496675bbb3c5121d38a14a

commit 8942fccf02e6e179d47582fdb2792a1ca972da21
Author: Assaf Muller <email address hidden>
Date: Mon Jun 29 11:38:51 2015 -0400

    Remove failing SafeFixture tests

    The fixtures 1.3 release attempted to fix the fixtures resource
    leak issue, but failed to do so completely. Our own SafeFixture
    is still needed: The 1.3 release broke our SafeFixture tests,
    but not the usage of SafeFixture itself. This patch removes
    those failing tests for now to unbreak the gate. Jakub reported
    a bug on fixtures 1.3:
    https://bugs.launchpad.net/python-fixtures/+bug/1469759

    We will continue to use SafeFixture until that bug is fixed
    in fixtures, at which point we will be able to require
    fixtures > 1.3.

    Change-Id: I59457c3bb198ff86d5ad55a1e623d008f0034b8f
    Closes-Bug: #1469734

commit 71dffb0a2c1720cd8233a329d32958a0160dd6f5
Author: Kevin Benton <email address hidden>
Date: Mon Jun 29 08:27:41 2015 +0000

    Revert "Removed test_lib module"

    This reverts commit 9a6536de6e1a7fe9b2552adc142e254426b82b6f.

    We pulled all of the plugins out of the tree, many of which still inherit
    from neutron test classes. This change then stated that we no longer
    support testing other plugins. I think this is a bit premature and should
    have been discussed under the subject
    "Neutron plugins can't use neutron plugin unit tests" or something
    similar.

    Change-Id: I68318589f010b731574ea3bfa8df98492bab31fc

commit b20fd81dbd497e058384a0af065dd0f1fdc4c728
Author: Jakub Libosvar <email address hidden>
Date: Fri Jun 5 14:32:51 2015 +0000

    Refactor NetcatTester class

    Following capabilities were added:
       - used transport protocol is passed as a constant instead of bool
       - src port for testing was added
       - connection can be established explicitly
       - change constructor parameters of NetcatTester

    As a part of removing bool for protocol definition
    get_free_namespace_port() was also modified to match the behavior.

    Change-Id: Id2ec322e7f731c05a3754a65411c9a5d8b258126

commit 83e37980dcd0b2bad6d64dd2cb23bcd2891cafca
Author: jingliuqing <email address hidden>
Date: Sat Jun 27 13:41:54 2015 +0800

    Use REST rather than ReST

    Change-Id: I06c9deaab58c5ec13bfeec39fb8fd4b1fe21f42d

commit 1b60df85ba3ad442c2e4e7e52538e1b9a1bf9378
Author: Kevin Benton <email address hidden>
Date: Thu Jun 25 18:34:38 2015 -0700

    Add a double-mock guard to the base test case

    Use mock to patch mock with a check to prevent multiple active
    patches to the...

tags: added: in-feature-pecan
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/kilo)

Reviewed: https://review.openstack.org/191045
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=cca48ad44d43142ab40971a96870417996db0f26
Submitter: Jenkins
Branch: stable/kilo

commit cca48ad44d43142ab40971a96870417996db0f26
Author: Cedric Brandily <email address hidden>
Date: Mon May 4 23:36:19 2015 +0200

    Ensure non-overlapping cidrs in subnetpools without galera

    _get_allocated_cidrs[1] locks only the subnets already allocated in a
    subnetpool (with mysql/postgresql at least). This ensures we don't
    allocate a cidr overlapping with existing cidrs, but nothing prevents
    a concurrent subnet allocation from creating a subnet in the same
    subnetpool.

    This change replaces the lock on the subnetpool's subnets with a lock
    on the subnetpool itself. It prevents 2 subnets from being allocated
    concurrently in the same subnetpool and ensures non-overlapping cidrs
    in the same subnetpool.

    Moreover, this change solves a problem with postgresql, which does not
    allow locking an empty select with an outer join: this happens on the
    first subnet allocation in a subnetpool when no specific cidr is
    provided. Moving the lock ensures it is taken on a non-empty select.

    However, this change does not ensure non-overlapping cidrs in
    subnetpools with galera, because galera doesn't support SELECT FOR
    UPDATE locks. A follow-up change will (try to?) remove locks from
    subnet allocation[1] in order to ensure non-overlapping cidrs in
    subnetpools with galera as well.

    [1] in neutron.ipam.subnet_alloc.SubnetAllocator

    Closes-Bug: #1451558
    Partial-Bug: #1451576
    Change-Id: I73854f9863f44621ae0d89c5dc4893ccc16d07e4
    (cherry picked from commit 3682e3391f188845d0c7f382f0ccd4b38db3904e)
    Conflicts:
     neutron/ipam/subnet_alloc.py

tags: added: in-stable-kilo
Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-1 → 7.0.0