Creation of existing resource takes too much time or fails
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | High | Jakub Libosvar | |
Bug Description
We have a downstream failure of neutron_
That is because the Neutron server retries on every duplicate-entry exception raised by the DB layer:
https:/
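The effect of that retry policy can be sketched with a toy wrapper. This is a hypothetical illustration, not Neutron's actual code: `DBDuplicateEntry` and `retry_on_duplicate` below are local stand-ins for the oslo.db / Neutron machinery.

```python
# Toy sketch: a create operation is retried whenever the DB layer
# reports a duplicate entry. Stand-ins only, not real Neutron code.

class DBDuplicateEntry(Exception):
    """Stand-in for oslo.db's DBDuplicateEntry."""

def retry_on_duplicate(max_attempts=3):
    def decorator(func):
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except DBDuplicateEntry:
                    # Last attempt: give up and propagate the error.
                    if attempt == max_attempts - 1:
                        raise
                    # Otherwise retry the whole create from scratch.
        return wrapper
    return decorator

db = set()  # stands in for the RBAC policy table

@retry_on_duplicate()
def create_rbac_policy(key):
    if key in db:                 # unique constraint violated
        raise DBDuplicateEntry(key)
    db.add(key)
    return key

create_rbac_policy("net/admin/access_as_shared")  # first create succeeds
# A second, identical create keeps hitting DBDuplicateEntry and only
# fails after all retries are exhausted -- hence "takes too much time".
```

This is why the duplicate create in the reproduction below is slow: each attempt re-runs the full create path before the duplicate error is finally surfaced to the client.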
Steps to reproduce:
net_id=$(openstack network create rbac_net | awk '/ id /{ print $4 }')
openstack network rbac create --type network --action access_as_shared --target-project admin $net_id
openstack network rbac create --type network --action access_as_shared --target-project admin $net_id
(yes, it is intentionally the same command twice)
I don't see which race scenario the retry mechanism for resource creation is meant to solve. However, I can think of a race scenario it introduces:
$ openstack network rbac delete 8a00a24e-
$ openstack network delete $net_id
$ net_id=$(openstack network create rbac_net | awk '/ id /{ print $4 }')
$ rbac_id=$(openstack network rbac create --type network --action access_as_shared --target-project admin $net_id | awk '/ id /{ print $4 }')
$ openstack network rbac create --type network --action access_as_shared --target-project admin $net_id &
[1] 31383
$ sleep 10
$ openstack network rbac delete $rbac_id
$ fg
openstack network rbac create --type network --action access_as_shared --target-project admin $net_id
+-------------------+------------------------------------------------------------------------------------------------------+
| Field             | Value                                                                                                |
+-------------------+------------------------------------------------------------------------------------------------------+
| action            | access_as_shared                                                                                     |
| id                | 13c5b655-                                                                                            |
| location          | Munch({'cloud': '', 'region_name': 'regionOne', 'zone': None, 'project': Munch({'id': 'cdf84b19b71249 |
| name              | None                                                                                                 |
| object_id         | 618108b7-                                                                                            |
| object_type       | network                                                                                              |
| project_id        | cdf84b19b71249f                                                                                      |
| target_project_id | cdf84b19b71249f                                                                                      |
+-------------------+------------------------------------------------------------------------------------------------------+
The expected result is that the second creation of an already existing resource fails and no RBAC policy exists afterwards. Instead, the second creation succeeded, and the policy that should have been deleted exists again.
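The timeline above can be simulated sequentially. This is a hypothetical sketch, not Neutron code: an in-memory set stands in for the policy table, and `between_attempts` models other API requests being served while the create is waiting to retry.

```python
# Simulate: duplicate create -> retry scheduled -> user deletes the
# policy -> retried create succeeds and resurrects it.
db = {"rbac-policy"}   # created by the first "network rbac create"

def create_with_retry(key, between_attempts=None, max_attempts=3):
    for _ in range(max_attempts):
        if key not in db:
            db.add(key)            # INSERT succeeds
            return "created"
        # Duplicate entry: the server retries. Before the next attempt,
        # other API requests (here, the user's delete) are served.
        if between_attempts:
            between_attempts()
            between_attempts = None   # the delete happens only once
    return "failed"

result = create_with_retry("rbac-policy",
                           between_attempts=lambda: db.discard("rbac-policy"))
# result == "created": the duplicate create that should have failed
# succeeded, and the policy the user just deleted exists again.
```

The `sleep 10` in the reproduction plays the role of `between_attempts` here: it gives the delete time to land while the backgrounded create is still cycling through its retries.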
Changed in neutron: | |
status: | New → Confirmed |
Changed in neutron: | |
importance: | Undecided → High |
tags: | added: rocky-backport-potential |
tags: | added: stein-backport-potenatial |
tags: | removed: rocky-backport-potential |
tags: | added: neutron-proactive-backport-potential |
tags: | removed: neutron-proactive-backport-potential |
I dug into why we retry on the DBDuplicateEntry exception and found https://launchpad.net/neutron/+bug/1594796 with the fix https://review.opendev.org/#/c/332487/
Basically, the reason is that when entries are auto-generated, two processes may concurrently try to create the same resource; the process that catches the duplicated entry then needs to retry so it can pick up the row the other process created.
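That legitimate race can be sketched like this (hypothetical illustration, not Neutron code; `ensure_default_row`, `insert`, and the in-memory dict are stand-ins, and `concurrent_insert` models the other worker committing first):

```python
# Two workers lazily create the same auto-generated row. The loser's
# INSERT hits a duplicate; on retry it finds the row and reuses it.

class DBDuplicateEntry(Exception):
    """Stand-in for oslo.db's DBDuplicateEntry."""

db = {}

def insert(key, value):
    if key in db:
        raise DBDuplicateEntry(key)
    db[key] = value

def ensure_default_row(key, value, concurrent_insert=None):
    """Lazily create an auto-generated row, retrying once on duplicate."""
    for _ in range(2):
        if key in db:
            return db[key]         # the other worker won: reuse its row
        if concurrent_insert:      # simulate the other worker committing
            concurrent_insert()    # just before our own INSERT
            concurrent_insert = None
        try:
            insert(key, value)
            return value
        except DBDuplicateEntry:
            continue               # retry; the next pass reuses the row
    raise DBDuplicateEntry(key)

row = ensure_default_row("default-rule", "ours",
                         concurrent_insert=lambda: insert("default-rule", "theirs"))
# row == "theirs": instead of failing, the losing worker retried and
# picked up the row the winning worker had just created.
```

For auto-generated rows this retry is harmless, because both workers want the same content. The bug here is that the same retry is applied to user-initiated creates, where a duplicate is a genuine error and retrying can, as shown above, resurrect a deleted resource.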