Bug #1785459 “PostgreSQL database deadlocks under high load” : Bugs : Designate

OpenStack Infra (hudson-openstack) wrote on 2019-04-02: Related fix merged to designate (master)

#1

Reviewed: https://review.openstack.org/647711
Committed: https://git.openstack.org/cgit/openstack/designate/commit/?id=f828654a3d40476cac7eb24a09a36e9978c2d708
Submitter: Zuul
Branch: master

commit f828654a3d40476cac7eb24a09a36e9978c2d708
Author: Takahito Hirose <email address hidden>
Date: Tue Mar 26 19:52:33 2019 +0900

Fix DBDeadLock error resulting into 500

When a user sends record registration requests continuously,
Designate sometimes hits DBDeadLock, resulting in a 500 InternalServerError.

We get the error below:

    2019-02-21 21:30:39.925 49752 ERROR designate.api.middleware RemoteError:
    Remote error: DBDeadlock (pymysql.err.InternalError)
    (1213, u'Deadlock found when trying to get lock; try restarting transaction')
    [SQL: u'UPDATE records SET version=(records.version + %(version_1)s),
    updated_at=%(updated_at)s, data=%(data)s, hash=%(hash)s, status=%(status)s,
    action=%(action)s, serial=%(serial)s WHERE records.id = %(id_1)s']
    [parameters: {'status': 'PENDING', 'hash': '39795ee18c6e3c9ad1c0190c6a3d8d4f',
    'updated_at': datetime.datetime(2019, 2, 21, 12, 30, 39, 909846), u'version_1': 1,
    u'id_1': '7a655eeda4d446cdaa81caf19ab55fcc', 'action': 'UPDATE',
    'serial': 1550752338,
    'data': u'ns2.example.jp. domain.example.com. 1550752338 3552 600 86400 3600'}]

During record registration, Designate first tried to update
the record and then updated the zone status.

The process that updates the zone status and registers the record [1] and the
process that updates the record status and zone status after a sync [2] touch
the tables in reverse order. So if a user sends many record registration
requests at the same time, Designate hits DBDeadLock when these two processes
run concurrently.

We observed that changing the order of the operations solves this issue.

[1] https://github.com/openstack/designate/blob/master/designate/central/service.py#L1292-L1320
[2] https://github.com/openstack/designate/blob/master/designate/central/service.py#L2310-L2322

    1. transaction [1], step 1: update zone status (runs) ---> zone table
    2. transaction [2], step 1: update record status (runs) ---> record table
    3. transaction [1], step 2: register record (runs, then waits) ---> record table
    4. transaction [2], step 2: update zone (deadlock!) ---> zone table

Change-Id: Icd6e690ac84a2fe0db0f4a8a513de47f7916f5ea
Related-Bug: #1785459
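
The lock-order inversion described in the commit message can be illustrated with a small, self-contained Python sketch. It is not Designate's code: the table names, the order lists, and both helper functions are hypothetical, and plain mutexes stand in for database row locks, which behave more subtly. The comment does not say which of the two code paths was reordered, so the sketch simply assumes both end up locking in the same order.

```python
import threading
from itertools import combinations

# Hypothetical lock orders for the two code paths in the commit message.
CREATE_RECORD_ORDER = ["zones", "records"]   # path [1]: zone first, then record
STATUS_UPDATE_ORDER = ["records", "zones"]   # path [2]: record first, then zone
STATUS_UPDATE_FIXED = ["zones", "records"]   # assumed order after the reordering


def has_lock_order_inversion(order_a, order_b):
    """Return True if some pair of tables is locked in opposite order."""
    position_in_b = {table: i for i, table in enumerate(order_b)}
    for earlier, later in combinations(order_a, 2):
        if earlier in position_in_b and later in position_in_b:
            if position_in_b[earlier] > position_in_b[later]:
                return True
    return False


def run_concurrently(order_a, order_b, iterations=1000):
    """Run two workers that take two mutexes in the given orders.

    Only call this with inversion-free orders: inverted orders can hang,
    which is the mutex analogue of the reported deadlock.
    """
    locks = {"zones": threading.Lock(), "records": threading.Lock()}
    counter = {"updates": 0}

    def worker(order):
        for _ in range(iterations):
            # Hold the first lock while acquiring the second, as a
            # transaction holds its earlier locks until it commits.
            with locks[order[0]]:
                with locks[order[1]]:
                    counter["updates"] += 1

    threads = [
        threading.Thread(target=worker, args=(order,))
        for order in (order_a, order_b)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter["updates"]


if __name__ == "__main__":
    print(has_lock_order_inversion(CREATE_RECORD_ORDER, STATUS_UPDATE_ORDER))
    print(has_lock_order_inversion(CREATE_RECORD_ORDER, STATUS_UPDATE_FIXED))
    print(run_concurrently(CREATE_RECORD_ORDER, STATUS_UPDATE_FIXED))
```

With a single agreed order, a worker that cannot get the first lock waits without holding anything the other worker needs, so no wait cycle can form.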

OpenStack Infra (hudson-openstack) wrote on 2019-04-08: Related fix proposed to designate (stable/stein)

#2

Related fix proposed to branch: stable/stein
Review: https://review.openstack.org/650603

OpenStack Infra (hudson-openstack) wrote on 2019-04-08: Related fix proposed to designate (stable/rocky)

#3

Related fix proposed to branch: stable/rocky
Review: https://review.openstack.org/650604

OpenStack Infra (hudson-openstack) wrote on 2019-04-08: Related fix proposed to designate (stable/queens)

#4

Related fix proposed to branch: stable/queens
Review: https://review.openstack.org/650605

OpenStack Infra (hudson-openstack) wrote on 2019-04-08: Related fix proposed to designate (stable/pike)

#5

Related fix proposed to branch: stable/pike
Review: https://review.openstack.org/650606

OpenStack Infra (hudson-openstack) wrote on 2019-04-08: Related fix proposed to designate (stable/ocata)

#6

Related fix proposed to branch: stable/ocata
Review: https://review.openstack.org/650607

OpenStack Infra (hudson-openstack) wrote on 2019-05-08: Related fix merged to designate (stable/stein)

#7

Reviewed: https://review.opendev.org/650603
Committed: https://git.openstack.org/cgit/openstack/designate/commit/?id=9b4b876d6ae3298110e15b6998306a2cf5660c3f
Submitter: Zuul
Branch: stable/stein

commit 9b4b876d6ae3298110e15b6998306a2cf5660c3f
Author: Takahito Hirose <email address hidden>
Date: Tue Mar 26 19:52:33 2019 +0900

Fix DBDeadLock error resulting into 500

When a user sends record registration requests continuously,
Designate sometimes hits DBDeadLock, resulting in a 500 InternalServerError.

We get the error below:

    2019-02-21 21:30:39.925 49752 ERROR designate.api.middleware RemoteError:
    Remote error: DBDeadlock (pymysql.err.InternalError)
    (1213, u'Deadlock found when trying to get lock; try restarting transaction')
    [SQL: u'UPDATE records SET version=(records.version + %(version_1)s),
    updated_at=%(updated_at)s, data=%(data)s, hash=%(hash)s, status=%(status)s,
    action=%(action)s, serial=%(serial)s WHERE records.id = %(id_1)s']
    [parameters: {'status': 'PENDING', 'hash': '39795ee18c6e3c9ad1c0190c6a3d8d4f',
    'updated_at': datetime.datetime(2019, 2, 21, 12, 30, 39, 909846), u'version_1': 1,
    u'id_1': '7a655eeda4d446cdaa81caf19ab55fcc', 'action': 'UPDATE',
    'serial': 1550752338,
    'data': u'ns2.example.jp. domain.example.com. 1550752338 3552 600 86400 3600'}]

During record registration, Designate first tried to update
the record and then updated the zone status.

The process that updates the zone status and registers the record [1] and the
process that updates the record status and zone status after a sync [2] touch
the tables in reverse order. So if a user sends many record registration
requests at the same time, Designate hits DBDeadLock when these two processes
run concurrently.

We observed that changing the order of the operations solves this issue.

[1] https://github.com/openstack/designate/blob/master/designate/central/service.py#L1292-L1320
[2] https://github.com/openstack/designate/blob/master/designate/central/service.py#L2310-L2322

    1. transaction [1], step 1: update zone status (runs) ---> zone table
    2. transaction [2], step 1: update record status (runs) ---> record table
    3. transaction [1], step 2: register record (runs, then waits) ---> record table
    4. transaction [2], step 2: update zone (deadlock!) ---> zone table

    Change-Id: Icd6e690ac84a2fe0db0f4a8a513de47f7916f5ea
    Related-Bug: #1785459
    (cherry picked from commit f828654a3d40476cac7eb24a09a36e9978c2d708)

tags:

added: in-stable-stein

OpenStack Infra (hudson-openstack) wrote on 2019-05-08: Related fix merged to designate (stable/rocky)

#8

Reviewed: https://review.opendev.org/650604
Committed: https://git.openstack.org/cgit/openstack/designate/commit/?id=2b4fbbf4d8c72a785cb1f8efe3ebdc2ee07a6a69
Submitter: Zuul
Branch: stable/rocky

commit 2b4fbbf4d8c72a785cb1f8efe3ebdc2ee07a6a69
Author: Takahito Hirose <email address hidden>
Date: Tue Mar 26 19:52:33 2019 +0900

Fix DBDeadLock error resulting into 500

When a user sends record registration requests continuously,
Designate sometimes hits DBDeadLock, resulting in a 500 InternalServerError.

We get the error below:

    2019-02-21 21:30:39.925 49752 ERROR designate.api.middleware RemoteError:
    Remote error: DBDeadlock (pymysql.err.InternalError)
    (1213, u'Deadlock found when trying to get lock; try restarting transaction')
    [SQL: u'UPDATE records SET version=(records.version + %(version_1)s),
    updated_at=%(updated_at)s, data=%(data)s, hash=%(hash)s, status=%(status)s,
    action=%(action)s, serial=%(serial)s WHERE records.id = %(id_1)s']
    [parameters: {'status': 'PENDING', 'hash': '39795ee18c6e3c9ad1c0190c6a3d8d4f',
    'updated_at': datetime.datetime(2019, 2, 21, 12, 30, 39, 909846), u'version_1': 1,
    u'id_1': '7a655eeda4d446cdaa81caf19ab55fcc', 'action': 'UPDATE',
    'serial': 1550752338,
    'data': u'ns2.example.jp. domain.example.com. 1550752338 3552 600 86400 3600'}]

During record registration, Designate first tried to update
the record and then updated the zone status.

The process that updates the zone status and registers the record [1] and the
process that updates the record status and zone status after a sync [2] touch
the tables in reverse order. So if a user sends many record registration
requests at the same time, Designate hits DBDeadLock when these two processes
run concurrently.

We observed that changing the order of the operations solves this issue.

[1] https://github.com/openstack/designate/blob/master/designate/central/service.py#L1292-L1320
[2] https://github.com/openstack/designate/blob/master/designate/central/service.py#L2310-L2322

    1. transaction [1], step 1: update zone status (runs) ---> zone table
    2. transaction [2], step 1: update record status (runs) ---> record table
    3. transaction [1], step 2: register record (runs, then waits) ---> record table
    4. transaction [2], step 2: update zone (deadlock!) ---> zone table

    Change-Id: Icd6e690ac84a2fe0db0f4a8a513de47f7916f5ea
    Related-Bug: #1785459
    (cherry picked from commit f828654a3d40476cac7eb24a09a36e9978c2d708)

tags:

added: in-stable-rocky

OpenStack Infra (hudson-openstack) wrote on 2019-06-17: Related fix merged to designate (stable/queens)

#9

Reviewed: https://review.opendev.org/650605
Committed: https://git.openstack.org/cgit/openstack/designate/commit/?id=00d0cb71e63e870fdf09833eec02f29b95542b18
Submitter: Zuul
Branch: stable/queens

commit 00d0cb71e63e870fdf09833eec02f29b95542b18
Author: Takahito Hirose <email address hidden>
Date: Tue Mar 26 19:52:33 2019 +0900

Fix DBDeadLock error resulting into 500

When a user sends record registration requests continuously,
Designate sometimes hits DBDeadLock, resulting in a 500 InternalServerError.

We get the error below:

    2019-02-21 21:30:39.925 49752 ERROR designate.api.middleware RemoteError:
    Remote error: DBDeadlock (pymysql.err.InternalError)
    (1213, u'Deadlock found when trying to get lock; try restarting transaction')
    [SQL: u'UPDATE records SET version=(records.version + %(version_1)s),
    updated_at=%(updated_at)s, data=%(data)s, hash=%(hash)s, status=%(status)s,
    action=%(action)s, serial=%(serial)s WHERE records.id = %(id_1)s']
    [parameters: {'status': 'PENDING', 'hash': '39795ee18c6e3c9ad1c0190c6a3d8d4f',
    'updated_at': datetime.datetime(2019, 2, 21, 12, 30, 39, 909846), u'version_1': 1,
    u'id_1': '7a655eeda4d446cdaa81caf19ab55fcc', 'action': 'UPDATE',
    'serial': 1550752338,
    'data': u'ns2.example.jp. domain.example.com. 1550752338 3552 600 86400 3600'}]

During record registration, Designate first tried to update
the record and then updated the zone status.

The process that updates the zone status and registers the record [1] and the
process that updates the record status and zone status after a sync [2] touch
the tables in reverse order. So if a user sends many record registration
requests at the same time, Designate hits DBDeadLock when these two processes
run concurrently.

We observed that changing the order of the operations solves this issue.

[1] https://github.com/openstack/designate/blob/master/designate/central/service.py#L1292-L1320
[2] https://github.com/openstack/designate/blob/master/designate/central/service.py#L2310-L2322

    1. transaction [1], step 1: update zone status (runs) ---> zone table
    2. transaction [2], step 1: update record status (runs) ---> record table
    3. transaction [1], step 2: register record (runs, then waits) ---> record table
    4. transaction [2], step 2: update zone (deadlock!) ---> zone table

    Change-Id: Icd6e690ac84a2fe0db0f4a8a513de47f7916f5ea
    Related-Bug: #1785459
    (cherry picked from commit f828654a3d40476cac7eb24a09a36e9978c2d708)

tags:

added: in-stable-queens

OpenStack Infra (hudson-openstack) wrote on 2021-06-09: Change abandoned on designate (stable/ocata)

#10

Change abandoned by "Michael Johnson <email address hidden>" on branch: stable/ocata
Review: https://review.opendev.org/c/openstack/designate/+/650607
Reason: Abandoning this patch in preparation for stable/ocata end of life. This has not had an update for two years.

OpenStack Infra (hudson-openstack) wrote on 2021-12-24: Change abandoned on designate (stable/pike)

#11

Change abandoned by "Erik Olof Gunnar Andersson <email address hidden>" on branch: stable/pike
Review: https://review.opendev.org/c/openstack/designate/+/650606
