Race condition in designate-sink when deleting records

Bug #1947765 reported by Albert Braden
This bug affects 1 person
Affects        Status        Importance  Assigned to  Milestone
Designate      Fix Released  High        Unassigned
kolla-ansible  Invalid       Undecided   Unassigned

Bug Description

What happened: After rebuilding clusters from Queens to Train, deleting a VM and then immediately creating a new one with the same name and IP (via Terraform for example) intermittently results in a missing DNS record.

Before applying the Terraform change, we see the DNS record in the recordset:
$ openstack recordset list dva3.<DOM>. --all |grep openstack-terra
| f9aa73c1-84ba-4854-be71-cbb616de672c | 8d1c84082a044a53abe0d519ed9e8c60 | openstack-terra-test-host.dev-ostck.dva3.<DOM>. | A | <IP> | ACTIVE | NONE |
$

and we can pull it from the DNS server on the controllers:

$ for i in {1..3}; do dig @dva3-ctrl${i}.cloud.<DOM> -t axfr dva3.<DOM>. |grep openstack-terra; done
openstack-terra-test-host.dev-ostck.dva3.<DOM>. 1 IN A <IP>
openstack-terra-test-host.dev-ostck.dva3.<DOM>. 1 IN A <IP>
openstack-terra-test-host.dev-ostck.dva3.<DOM>. 1 IN A <IP>

After applying the change, we don't see it:

$ openstack recordset list dva3.<DOM>. --all |grep openstack-terra
$ for i in {1..3}; do dig @dva3-ctrl${i}.cloud.<DOM> -t axfr dva3.<DOM>. |grep openstack-terra; done
$

We see this in the logs:

2021-10-09 01:53:44.307 27 ERROR oslo_messaging.notify.dispatcher oslo_db.exception.DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, "Duplicate entry 'c70e693b4c47402db088c43a5a177134-openstack-terra-test-host.de...' for key 'unique_recordset'")

2021-10-09 01:53:44.307 27 ERROR oslo_messaging.notify.dispatcher [SQL: INSERT INTO recordsets (id, version, created_at, zone_shard, tenant_id, zone_id, name, type, ttl, reverse_name) VALUES (%(id)s, %(version)s, %(created_at)s, %(zone_shard)s, %(tenant_id)s, %(zone_id)s, %(name)s, %(type)s, %(ttl)s, %(reverse_name)s)]

2021-10-09 01:53:44.307 27 ERROR oslo_messaging.notify.dispatcher [parameters: {'id': 'dbbb904c347241a791aa01ca33a87b23', 'version': 1, 'created_at': datetime.datetime(2021, 10, 9, 1, 53, 44, 182652), 'zone_shard': 3184, 'tenant_id': '8d1c84082a044a53abe0d519ed9e8c60', 'zone_id': 'c70e693b4c47402db088c43a5a177134', 'name': 'openstack-terra-test-host.dev-ostck.dva3.<DOM>.', 'type': 'A', 'ttl': None, 'reverse_name': '<MOD>.3avd.kctso-ved.tsoh-tset-arret-kcatsnepo'}]

It appears that Designate is trying to create the new record before the deletion of the old one finishes.
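
The ordering problem can be illustrated with a small, self-contained sketch. It is not Designate code: SQLite stands in for MySQL, the `replay` helper and the `host.example.` name are hypothetical, and the columns in the `unique_recordset` constraint are an assumption inferred from the duplicate-key message in the log above (zone ID followed by name), with the record type added. It only shows how a create that arrives before the delete has finished can fail on the unique key, after which the delete removes the old row and nothing is left.

```python
import sqlite3

# Hypothetical key for illustration: (zone_id, name, type).
KEY = ("zone-1", "host.example.", "A")


def replay(events):
    """Apply 'create'/'delete' events in order against a table that already
    holds the old recordset. Returns (errors, remaining_row_count)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recordsets ("
        " id TEXT PRIMARY KEY, zone_id TEXT, name TEXT, type TEXT,"
        " CONSTRAINT unique_recordset UNIQUE (zone_id, name, type))"
    )
    # The recordset that belongs to the VM being deleted.
    conn.execute(
        "INSERT INTO recordsets (id, zone_id, name, type) VALUES (?, ?, ?, ?)",
        ("old", *KEY),
    )
    errors = []
    for n, event in enumerate(events):
        try:
            if event == "delete":
                conn.execute(
                    "DELETE FROM recordsets"
                    " WHERE zone_id = ? AND name = ? AND type = ?",
                    KEY,
                )
            elif event == "create":
                conn.execute(
                    "INSERT INTO recordsets (id, zone_id, name, type)"
                    " VALUES (?, ?, ?, ?)",
                    (f"new-{n}", *KEY),
                )
        except sqlite3.IntegrityError as exc:
            # Analogous to the DBDuplicateEntry seen in the sink log.
            errors.append(str(exc))
    remaining = conn.execute("SELECT COUNT(*) FROM recordsets").fetchone()[0]
    conn.close()
    return errors, remaining


if __name__ == "__main__":
    # Create handled before the delete: duplicate error, then the record is gone.
    print(replay(["create", "delete"]))
    # Delete handled first: the new recordset is created cleanly.
    print(replay(["delete", "create"]))
```

In this model the "create, then delete" ordering ends with one integrity error and zero rows, which matches the symptom of the record disappearing from both the recordset list and the zone transfer.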

Expected behavior: Designate waits for the old record to finish deleting before attempting to create the new one.

How to reproduce: Use Terraform to create a VM with a port. Taint the VM and run Terraform apply so that the VM is deleted and immediately re-created with the same name and IP.

OS: CentOS 8
Kernel: 4.18.0-305.19.1.el8_4.x86_64
Docker version: 20.10.7
Kolla-Ansible version: stable/train

Changed in designate:
status: New → Confirmed
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to designate (master)

Reviewed: https://review.opendev.org/c/openstack/designate/+/814290
Committed: https://opendev.org/openstack/designate/commit/4807c23228fbb23af3087ee072bf73d7fc43aff5
Submitter: "Zuul (22348)"
Branch: master

commit 4807c23228fbb23af3087ee072bf73d7fc43aff5
Author: Erik Olof Gunnar Andersson <email address hidden>
Date: Sun Oct 17 03:05:28 2021 -0700

    Fix race condition in the sink when deleting records

    Updated the sink to behave closer to how we handle this type
    of operations in designate.api.v2.

    - Added object validation to all requests.
    - Better test coverage.
    - Use recordset update / delete instead of just record delete.

    Closes-Bug: #1947765
    Change-Id: I867600eb48a3e30a4d17471ab794ca717706823d
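
As a rough illustration of the last bullet in the commit message ("Use recordset update / delete instead of just record delete"), the sketch below models the idea with plain in-memory objects. It is an assumption about the general approach, not the code from the commit: `RecordSet`, `remove_record`, and the dictionary store are hypothetical names, and the addresses are documentation placeholders. The point is that removing a record is expressed as an operation on the whole recordset, which is updated if records remain and deleted if it becomes empty, so no empty recordset is left behind to collide with a later create.

```python
from dataclasses import dataclass, field


@dataclass
class RecordSet:
    """Hypothetical stand-in for a recordset: a name/type plus its record data."""
    name: str
    type: str
    records: list = field(default_factory=list)


def remove_record(recordsets: dict, key: tuple, data: str) -> str:
    """Remove one record by operating on its recordset.

    Returns 'updated' if other records remain, 'deleted' if the recordset
    became empty and was removed, or 'noop' if nothing matched.
    """
    recordset = recordsets.get(key)
    if recordset is None or data not in recordset.records:
        return "noop"
    recordset.records.remove(data)
    if recordset.records:
        # Other records still share this name/type: keep the recordset.
        return "updated"
    # Last record removed: drop the whole recordset so the name is free again.
    del recordsets[key]
    return "deleted"


if __name__ == "__main__":
    key = ("host.example.", "A")
    store = {key: RecordSet("host.example.", "A", ["192.0.2.10", "192.0.2.11"])}
    print(remove_record(store, key, "192.0.2.10"))
    print(remove_record(store, key, "192.0.2.11"))
    print(key in store)
```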

Changed in designate:
status: Confirmed → Fix Released
Changed in kolla-ansible:
status: New → Invalid
OpenStack Infra (hudson-openstack) wrote : Fix proposed to designate (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/designate/+/817540

OpenStack Infra (hudson-openstack) wrote : Fix proposed to designate (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/designate/+/817541

OpenStack Infra (hudson-openstack) wrote : Fix proposed to designate (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/designate/+/817542

OpenStack Infra (hudson-openstack) wrote : Fix merged to designate (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/designate/+/817542
Committed: https://opendev.org/openstack/designate/commit/8634d531f2d993901527ecc67ddc40eb9dc0a575
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 8634d531f2d993901527ecc67ddc40eb9dc0a575
Author: Erik Olof Gunnar Andersson <email address hidden>
Date: Sun Oct 17 03:05:28 2021 -0700

    Fix race condition in the sink when deleting records

    Updated the sink to behave closer to how we handle this type
    of operations in designate.api.v2.

    - Added object validation to all requests.
    - Better test coverage.
    - Use recordset update / delete instead of just record delete.

    Closes-Bug: #1947765
    Change-Id: I867600eb48a3e30a4d17471ab794ca717706823d
    (cherry picked from commit 4807c23228fbb23af3087ee072bf73d7fc43aff5)

tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix merged to designate (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/designate/+/817541
Committed: https://opendev.org/openstack/designate/commit/04a6d3156ae6be9bbb3b50d618694af0e15352e7
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 04a6d3156ae6be9bbb3b50d618694af0e15352e7
Author: Erik Olof Gunnar Andersson <email address hidden>
Date: Sun Oct 17 03:05:28 2021 -0700

    Fix race condition in the sink when deleting records

    Updated the sink to behave closer to how we handle this type
    of operations in designate.api.v2.

    - Added object validation to all requests.
    - Better test coverage.
    - Use recordset update / delete instead of just record delete.

    Closes-Bug: #1947765
    Change-Id: I867600eb48a3e30a4d17471ab794ca717706823d
    (cherry picked from commit 4807c23228fbb23af3087ee072bf73d7fc43aff5)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix merged to designate (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/designate/+/817540
Committed: https://opendev.org/openstack/designate/commit/1a2c3ca74b73f20508fcbea4e344f4e2976be931
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 1a2c3ca74b73f20508fcbea4e344f4e2976be931
Author: Erik Olof Gunnar Andersson <email address hidden>
Date: Sun Oct 17 03:05:28 2021 -0700

    Fix race condition in the sink when deleting records

    Updated the sink to behave closer to how we handle this type
    of operations in designate.api.v2.

    - Added object validation to all requests.
    - Better test coverage.
    - Use recordset update / delete instead of just record delete.

    Closes-Bug: #1947765
    Change-Id: I867600eb48a3e30a4d17471ab794ca717706823d
    (cherry picked from commit 4807c23228fbb23af3087ee072bf73d7fc43aff5)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/designate 12.0.1

This issue was fixed in the openstack/designate 12.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/designate 11.0.1

This issue was fixed in the openstack/designate 11.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/designate 14.0.0.0rc1

This issue was fixed in the openstack/designate 14.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/designate 13.0.1

This issue was fixed in the openstack/designate 13.0.1 release.
