Insertion of a duplicated ProviderResourceAssociation entry while creating a HA router

Bug #2035573 reported by Jimin Shin
22
This bug affects 4 people
Affects Status Importance Assigned to Milestone
neutron
Incomplete
Undecided
Unassigned

Bug Description

[SUMMARY]
While creating multiple HA routers in a row, some router creations are failed with the log below.

2023-07-21 02:28:19.968 12 DEBUG neutron.wsgi [-] (12) accepted ('10.2.41.158', 57438) server /var/lib/kolla/venv/lib/python3.10/site-packages/eventlet/wsgi.py:1004
2023-07-21 02:28:19.979 12 DEBUG neutron.api.v2.base [None req-726ab30e-b9c9-4e38-8c9d-216752ab2aea 439ae1ccaa9a494284cee3fdb6227208 97895007888245c3acdfc41146d2e151 - - default default] Request body: {'router': {'name': 'test6', 'admin_state_up': True, 'tenant_id': '97895007888245c3acdfc41146d2e151'}} prepare_request_body /var/lib/kolla/venv/lib/python3.10/site-packages/neutron/api/v2/base.py:731
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db [None req-726ab30e-b9c9-4e38-8c9d-216752ab2aea 439ae1ccaa9a494284cee3fdb6227208 97895007888245c3acdfc41146d2e151 - - default default] Failed to schedule HA router 5a599ade-7b6a-4b3e-b635-ca00e37f2657.: neutron_lib.objects.exceptions.NeutronDbObjectDuplicateEntry: Failed to create a duplicate ProviderResourceAssociation: for attribute(s) ['PRIMARY'] with value(s) ha-5a599ade-7b6a-4b3e-b635-ca00e37f2657
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db Traceback (most recent call last):
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     self.dialect.do_execute(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     cursor.execute(statement, parameters)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/cursors.py", line 148, in execute
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     result = self._query(query)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/cursors.py", line 310, in _query
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     conn.query(q)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/connections.py", line 548, in query
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/connections.py", line 775, in _read_query_result
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     result.read()
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/connections.py", line 1156, in read
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     first_packet = self.connection._read_packet()
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/connections.py", line 725, in _read_packet
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     packet.raise_for_error()
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/protocol.py", line 221, in raise_for_error
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     err.raise_mysql_exception(self._data)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/err.py", line 143, in raise_mysql_exception
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     raise errorclass(errno, errval)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db pymysql.err.IntegrityError: (1062, "Duplicate entry 'ha-5a599ade-7b6a-4b3e-b635-ca00e37f2657' for key 'PRIMARY'")
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db The above exception was the direct cause of the following exception:
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db Traceback (most recent call last):
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/neutron/objects/base.py", line 903, in create
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     db_obj = obj_db_api.create_object(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/neutron/objects/db/api.py", line 72, in create_object
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     with obj_cls.db_context_writer(context):
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/usr/lib/python3.10/contextlib.py", line 142, in __exit__
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     next(self.gen)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 1066, in _transaction_scope
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     with current._produce_block(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/usr/lib/python3.10/contextlib.py", line 142, in __exit__
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     next(self.gen)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 688, in _session
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     self.session.flush()
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 3386, in flush
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     self._flush(objects)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 3525, in _flush
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     with util.safe_reraise():
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     compat.raise_(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     raise exception
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 3486, in _flush
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     flush_context.execute()
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     rec.execute(self)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py", line 630, in execute
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     util.preloaded.orm_persistence.save_obj(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     _emit_insert_statements(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/orm/persistence.py", line 1097, in _emit_insert_statements
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     c = connection._execute_20(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_20
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     return meth(self, args_10style, kwargs_10style, execution_options)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 333, in _execute_on_connection
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     return connection._execute_clauseelement(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1572, in _execute_clauseelement
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     ret = self._execute_context(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     self._handle_dbapi_exception(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2122, in _handle_dbapi_exception
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     util.raise_(newraise, with_traceback=exc_info[2], from_=e)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     raise exception
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     self.dialect.do_execute(
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     cursor.execute(statement, parameters)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/cursors.py", line 148, in execute
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     result = self._query(query)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/cursors.py", line 310, in _query
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     conn.query(q)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/connections.py", line 548, in query
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/connections.py", line 775, in _read_query_result
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     result.read()
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/connections.py", line 1156, in read
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     first_packet = self.connection._read_packet()
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/connections.py", line 725, in _read_packet
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     packet.raise_for_error()
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/protocol.py", line 221, in raise_for_error
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     err.raise_mysql_exception(self._data)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db   File "/var/lib/kolla/venv/lib/python3.10/site-packages/pymysql/err.py", line 143, in raise_mysql_exception
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db     raise errorclass(errno, errval)
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db oslo_db.exception.DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, "Duplicate entry 'ha-5a599ade-7b6a-4b3e-b635-ca00e37f2657' for key 'PRIMARY'")
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db [SQL: INSERT INTO providerresourceassociations (provider_name, resource_id) VALUES (%(provider_name)s, %(resource_id)s)]
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db [parameters: {'provider_name': 'ha', 'resource_id': '5a599ade-7b6a-4b3e-b635-ca00e37f2657'}]
2023-07-21 02:28:20.178 12 ERROR neutron.db.l3_hamode_db (Background on this error at: https://sqlalche.me/e/14/gkpj)

[REPRODUCTION STEPS]
Create HA routers in a row using the script below. More than 10 times of script execution would be needed.

#!/bin/bash

openstack router create 1
openstack router create 2
openstack router create 3
openstack router create 4
openstack router create 5
openstack router delete 1 2 3 4 5

[VERSION]
OpenStack version: stable/2023.1
OS: Ubuntu 22.04
Others: openvswitch 2.17.7

[ENVIRONMENT]
Installed on K8S nodes using openstack-helm-infra chart version 0.3.14. Currently running pods are list below.

$ kubectl get pod -A -l application=neutron
NAMESPACE NAME READY STATUS RESTARTS AGE
openstack neutron-db-init-qvc5r 0/1 Completed 0 43h
openstack neutron-db-sync-xcsw8 0/1 Completed 0 43h
openstack neutron-dhcp-agent-default-6nzgw 1/1 Running 0 43h
openstack neutron-dhcp-agent-default-rbxk7 1/1 Running 0 43h
openstack neutron-dhcp-agent-default-s7jrb 1/1 Running 0 43h
openstack neutron-ks-endpoints-qqrn9 0/3 Completed 0 43h
openstack neutron-ks-service-2v9ls 0/1 Completed 0 43h
openstack neutron-ks-user-d74c5 0/1 Completed 0 43h
openstack neutron-l3-agent-default-49lht 1/1 Running 0 43h
openstack neutron-l3-agent-default-7s7bh 1/1 Running 0 43h
openstack neutron-l3-agent-default-l48mv 1/1 Running 0 43h
openstack neutron-metadata-agent-default-6r8xf 1/1 Running 0 43h
openstack neutron-metadata-agent-default-gzf29 1/1 Running 0 43h
openstack neutron-metadata-agent-default-mrlmd 1/1 Running 0 43h
openstack neutron-netns-cleanup-cron-default-45cbq 1/1 Running 0 43h
openstack neutron-netns-cleanup-cron-default-bktfq 1/1 Running 0 43h
openstack neutron-netns-cleanup-cron-default-pf9wg 1/1 Running 0 43h
openstack neutron-ovs-agent-default-8c87c 1/1 Running 0 43h
openstack neutron-ovs-agent-default-gg7pc 1/1 Running 0 43h
openstack neutron-ovs-agent-default-jll4x 1/1 Running 0 43h
openstack neutron-ovs-agent-default-mxr5g 1/1 Running 0 43h
openstack neutron-ovs-agent-default-tjp59 1/1 Running 0 43h
openstack neutron-ovs-agent-default-vqz44 1/1 Running 0 43h
openstack neutron-ovs-agent-default-wbcwx 1/1 Running 0 43h
openstack neutron-rabbit-init-gzg62 0/1 Completed 0 43h
openstack neutron-server-c77b7bdcd-4jq5m 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-4tzc9 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-5fwnw 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-6txp2 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-clm65 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-j95n2 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-khkkg 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-l8qdm 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-mv7nk 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-ncszs 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-p6m4w 1/1 Running 0 43h
openstack neutron-server-c77b7bdcd-qxqbs 1/1 Running 0 43h

Tags: db l3-ha
Jimin Shin (zimiin)
description: updated
Revision history for this message
Miro Tomaska (mtomaska) wrote :

Hi Jimin,

I tried to reproduce but I was not successful. But my feeling is that this will not be easily reproducable anywhere.
1. Can you share the full log and when this issue happened
2. Did the issue occured with a neutron version upgrade or is this is the first time you tried to create routers in the manner you described?

Thanks!

tags: added: db l3-ha
Revision history for this message
Brian Haley (brian-haley) wrote :

This looks very similar to this bug that was just fixed:

https://bugs.launchpad.net/neutron/+bug/2016198

That required DB changes that don't make it a backport candidate I believe if you look at all the related changes.

Revision history for this message
Jimin Shin (zimiin) wrote (last edit ):

Hello Brian:

That bug report seems to be happened when the new network is created for the router. However, this error occurs even though there's a network available - the new network is not created.

Thank you!

Revision history for this message
Jimin Shin (zimiin) wrote :

Hello Miro:

1. What component's log would be needed? (neutron-server, neutron-l3-agent, etc.)
2. This is my first time.

Thank you!

Revision history for this message
Brian Haley (brian-haley) wrote :

Deleting your last router would delete the subnet, then creating another will trigger a new allocation, and possibly the race condition. If you never delete one router (say, router 1), does the failure still happen? If not then it's the same bug.

If you could retry using code on the master branch that would be the best test.

Changed in neutron:
status: New → Incomplete
Revision history for this message
Jimin Shin (zimiin) wrote :

Hello Brian:

I created and deleted 2, 3, 4 router repeatedly while having router 1, but the error still occurs.

Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

As Brian commented, this issue was solved in [1]. The problem is that this fix includes a DB migration and cannot be backported. Please upgrade to stable/2023.2, if possible, and check again.

Regards.

[1]https://bugs.launchpad.net/neutron/+bug/2016198

Revision history for this message
Jimin Shin (zimiin) wrote :

Hello Rodolfo:

As the description of the bug report [1], race condition happens when the new network (or a subnet) is created.
However, this issue happens even though there is no network creation while creating a router. In the test I did earlier, I had left the router #1 remains, and created and deleted router #2, 3, 4 repeatedly, error occured as well.

Thank you

Revision history for this message
ChungWon Lee (cw0306-lee) wrote :

I'm coworker of Jimin Shin, and this problem also happens when we upgrade neutron to Bobcat version.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.