[SRU] service_statuses table running full in Designate database

Bug #1768824 reported by Dmitry Galkin on 2018-05-03
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Designate
High
Erik Olof Gunnar Andersson
Ubuntu Cloud Archive
Undecided
Unassigned
Queens
High
Unassigned
designate (Ubuntu)
Undecided
Unassigned
Bionic
High
Unassigned

Bug Description

[Impact]
This patch is required to prevent pool-manager from creating unbounded amounts of status logs in the service_statuses table triggered by having > 1 log in there.

[Test Case]
* deploy openstack queens with designate
* mysql> select count(*) from service_statuses where service_name="pool_manager"; should return 1
* try to add an extra entry:

ts=`date '+%Y-%m-%d %H:%M:%S'`
svc_host=`mysql -B -h$host -u${service} -p$passwd ${service} -e 'select hostname from service_statuses where service_name="pool_manager";'| tail -n 1`
mysql -h$host -u${service} -p$passwd ${service} -e "insert into service_statuses values ('1234', '$ts', '$ts', 'pool_manager', '$svc_host', '$ts', 'UP', '{}', '{}');"

* this should fail since the hostname/servicename columns should now be a unique contraint
* can also check this with:

mysql -h$host -u${service} -p$passwd ${service} -e "select * from INFORMATION_SCHEMA.TABLE_CONSTRAINTS where CONSTRAINT_TYPE='UNIQUE';"| grep service_statuses

[Regression Potential]
if the table already has multiple records for pool_manager in the service_statuses table, it will be necessary to (manually) delete all but one record in order for the upgrade to succeed.
----------------------------------------------------------------------------

Hi,

The service_statuses table in Designate database is running full of records in our deployment:

MariaDB [designate]> select count(*) from service_statuses;

+----------+
| count(*) |
+----------+
| 24474342 |
+----------+
1 row in set (7 min 19.09 sec)

We got millions of rows in just couple of month. The problem is that the same services running on the same hosts create new record (instead of updating existing) during status report to Designate.
This is how it looks in DB:

MariaDB [designate]> select * from service_statuses;
+----------------------------------+---------------------+---------------------+--------------+------------------------------------+---------------------+--------+-------+--------------+
| id | created_at | updated_at | service_name | hostname | heartbeated_at | status | stats | capabilities |
+----------------------------------+---------------------+---------------------+--------------+------------------------------------+---------------------+--------+-------+--------------+
| 0dde2b5f228549d5995cb0338841bd50 | 2018-05-02 12:06:03 | NULL | producer | designate-producer-855855776-cr8d9 | 2018-05-02 12:06:03 | UP | {} | {} |
| 0e311d3000d8403d97066eba619490a3 | 2018-05-02 12:05:14 | NULL | api | designate-api-2042646259-6090v | 2018-05-02 12:05:13 | UP | {} | {} |
| 168448cd97cd428ea19318243570482c | 2018-05-02 12:05:48 | NULL | producer | designate-producer-855855776-cr8d9 | 2018-05-02 12:05:48 | UP | {} | {} |
| 1685d7f80d8c4f75b052680e5e2f40ae | 2018-05-02 12:05:59 | NULL | api | designate-api-2042646259-6090v | 2018-05-02 12:05:58 | UP | {} | {} |
| 192275eb33854b4091b981b0c32d04f7 | 2018-05-02 12:05:41 | NULL | worker | designate-worker-3446544-7fzqx | 2018-05-02 12:05:35 | UP | {} | {} |
| 1e465011f21f47f096b54005675e8011 | 2018-05-02 12:05:25 | NULL | mdns | designate-mdns-4198843580-lw6s2 | 2018-05-02 12:05:25 | UP | {} | {} |
| 22e0ab87b3cd4228bc191e49923d13ba | 2018-05-02 12:05:58 | NULL | producer | designate-producer-855855776-cr8d9 | 2018-05-02 12:05:58 | UP | {} | {} |
| 284d6b6bf53f47358fa14750d0c2a181 | 2018-05-02 12:05:18 | NULL | producer | designate-producer-855855776-cr8d9 | 2018-05-02 12:05:18 | UP | {} | {} |
| 3617ac740b9945c09f65f4b8cf0a72b5 | 2018-05-02 12:05:25 | NULL | api | designate-api-2042646259-drzws | 2018-05-02 12:05:25 | UP | {} | {} |
| 3880157829f946398159f104b13a066b | 2018-05-02 12:06:00 | NULL | mdns | designate-mdns-4198843580-lw6s2 | 2018-05-02 12:06:00 | UP | {} | {} |

The respective part of the code is:
https://github.com/openstack/designate/blob/master/designate/central/service.py#L2879-L2898

It is not exactly clear to me why the id is needed to identify the service. The 'hostname' and 'service_name' seem to be enough - it is rather unlikely that there are two api services are running on the same host..

So, I've removed the id part:

        if service_status.obj_attr_is_set('id'):
            criterion["id"] = service_status.id

and cleaned up the service_statuses table, now it looks fine:

MariaDB [designate]> select * from service_statuses;
+----------------------------------+---------------------+---------------------+--------------+-------------------------------------+---------------------+--------+-------+--------------+
| id | created_at | updated_at | service_name | hostname | heartbeated_at | status | stats | capabilities |
+----------------------------------+---------------------+---------------------+--------------+-------------------------------------+---------------------+--------+-------+--------------+
| 1e89d10b0ddb459f8a9522ebd62fa629 | 2018-05-02 13:03:21 | 2018-05-02 13:03:37 | worker | designate-worker-2851226431-9274c | 2018-05-02 13:03:36 | UP | {} | {} |
| 8d29d50270ee48999b78e90c1b88acf8 | 2018-05-02 13:03:23 | 2018-05-02 13:03:38 | api | designate-api-1699569558-9rvg6 | 2018-05-02 13:03:38 | UP | {} | {} |
| 9602ccf55d204c7b86277e21af3d739f | 2018-05-02 13:03:18 | 2018-05-02 13:03:38 | producer | designate-producer-3892623035-6vwt0 | 2018-05-02 13:03:38 | UP | {} | {} |
| cec705e082c542cdbfa83b9ed0c2b438 | 2018-05-02 13:03:19 | 2018-05-02 13:03:39 | api | designate-api-1699569558-r018m | 2018-05-02 13:03:39 | UP | {} | {} |
| ddd48f13b67448d2949ced39d76e2958 | 2018-05-02 13:03:20 | 2018-05-02 13:03:40 | central | designate-central-2632925199-wwj7j | 2018-05-02 13:03:40 | UP | {} | {} |
| ef0ae921a78d4ac090dc74a1dc686921 | 2018-05-02 13:03:21 | 2018-05-02 13:03:41 | mdns | designate-mdns-4162010419-pvclv | 2018-05-02 13:03:41 | UP | {} | {} |
+----------------------------------+---------------------+---------------------+--------------+-------------------------------------+---------------------+--------+-------+--------------+
6 rows in set (0.00 sec)

Our Designate version is Pike, but the code seems to be the same in master branch.

Dmitry Galkin (galkindmitrii) wrote :

Here are the examples of service_statuses table in a more readable way:

http://paste.openstack.org/show/720282/

The issue here is that the hostname and service_name under service_statuses are not unique. If for any reason a duplicate entry for a service gets created, the call to update will always fail, as it's excepting only a single database entry.

https://github.com/openstack/designate/blob/fb5c1a97797c66292151549b82519a682b6300b6/designate/storage/impl_sqlalchemy/migrate_repo/versions/097_add_services.py
https://github.com/openstack/designate/blob/fb5c1a97797c66292151549b82519a682b6300b6/designate/storage/impl_sqlalchemy/__init__.py#L1771

We need to update the schema to only allow a unique entry per hostname/service_name.

Changed in designate:
status: New → Confirmed
importance: Undecided → High
Changed in designate:
assignee: nobody → Erik Olof Gunnar Andersson (eandersson)
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/568372
Committed: https://git.openstack.org/cgit/openstack/designate/commit/?id=1924abff40341337bd00659ff01a14ea91701475
Submitter: Zuul
Branch: master

commit 1924abff40341337bd00659ff01a14ea91701475
Author: Erik Olof Gunnar Andersson <email address hidden>
Date: Mon May 14 13:50:03 2018 -0700

    Add unique constraint to service_statuses

    In the current implementation, if for any reason a duplicate
    service entry gets created, the call to update that service
    will fail endlessly, and instead cause the service to create
    new entries everytime update_service_status gets called. Causing
    it to fill the database with duplicate entries.

    This patch adds a unique constraint to the service_statuses
    table based on the service_name and hostname, to ensure that
    this cannot happen.

    In addition we add a new test to the storage driver and further
    expanded the central service test coverage.

    Change-Id: I307a8f7dd8b8a83effa447a846db3288efa32dba
    Closes-Bug: #1768824

Changed in designate:
status: In Progress → Fix Released

This issue was fixed in the openstack/designate 7.0.0.0b3 development milestone.

Reviewed: https://review.openstack.org/604430
Committed: https://git.openstack.org/cgit/openstack/designate/commit/?id=1842802de4a05a057eb7f3c94bd01a4b1f0b9143
Submitter: Zuul
Branch: master

commit 1842802de4a05a057eb7f3c94bd01a4b1f0b9143
Author: Ben Nemec <email address hidden>
Date: Wed Sep 19 20:34:26 2018 +0000

    Add designate-status command for upgrade checks

    This checks for duplicate entries in the service_statuses table
    which will be an error after change
    I307a8f7dd8b8a83effa447a846db3288efa32dba.

    Related-Bug: 1768824

    Story: 2003657
    Task: 26127

    Change-Id: Ie0350b034f0eb03749138aadd0951d30073214c0
    Co-authored-by: Doug Hellmann <email address hidden>

description: updated
summary: - service_statuses table running full in Designate database
+ [SRU] service_statuses table running full in Designate database
Changed in cloud-archive:
status: New → Fix Released
Edward Hope-Morley (hopem) wrote :
Edward Hope-Morley (hopem) wrote :

Above desbdiff is for the "Add unique constraint to service_statuses". We also want to SRU the designate-manage patch (same package) but that needs to be backported to rocky first (it landed in Stein upstream).

Changed in designate (Ubuntu):
status: New → Fix Released
Corey Bryant (corey.bryant) wrote :

Thanks Ed. Just waiting on the rocky fix to land upstream so that we can upload the rocky changes for designate-status before we upload queens.

tags: added: sts sts-sru-needed
Changed in designate (Ubuntu Bionic):
status: New → Triaged
importance: Undecided → High
Corey Bryant (corey.bryant) wrote :

Thanks Edward. This has been uploaded to the bionic unapproved queue: https://launchpad.net/ubuntu/bionic/+queue?queue_state=1&queue_text=designate

Hello Dmitry, or anyone else affected,

Accepted designate into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/designate/1:6.0.1-0ubuntu1.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in designate (Ubuntu Bionic):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-bionic
Edward Hope-Morley (hopem) wrote :

Bionic proposed package failed to build because I forgot to remove a dependency from the unit tests in that patch prior to attaching the debdiff. Apologies for that. I have tested a new patch with this dependency removed and will attach it now. Please go ahead and replace the current bionic-proposed build with this one. Thanks.

Edward Hope-Morley (hopem) wrote :
tags: added: verification-failed verification-failed-bionic
removed: verification-needed verification-needed-bionic
Łukasz Zemczak (sil2100) wrote :

Hello Dmitry, or anyone else affected,

Accepted designate into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/designate/1:6.0.1-0ubuntu1.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

tags: added: verification-needed verification-needed-bionic
removed: verification-failed verification-failed-bionic
Corey Bryant (corey.bryant) wrote :

Hello Dmitry, or anyone else affected,

Accepted designate into queens-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:queens-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-queens-needed to verification-queens-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-queens-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-queens-needed
Edward Hope-Morley (hopem) wrote :

Bionic Queen verified using [Test Case]

Test Output:

root@juju-01d5ff-lp1768824-sru-5:~# dpkg -l| grep designate
ii designate-agent 1:6.0.1-0ubuntu1.2 all OpenStack DNS as a Service - agent
ii designate-api 1:6.0.1-0ubuntu1.2 all OpenStack DNS as a Service - API server
ii designate-central 1:6.0.1-0ubuntu1.2 all OpenStack DNS as a Service - central daemon
ii designate-common 1:6.0.1-0ubuntu1.2 all OpenStack DNS as a Service - common files
ii designate-mdns 1:6.0.1-0ubuntu1.2 all OpenStack DNS as a Service - mdns
ii designate-pool-manager 1:6.0.1-0ubuntu1.2 all OpenStack DNS as a Service - pool manager
ii designate-sink 1:6.0.1-0ubuntu1.2 all OpenStack DNS as a Service - sink
ii designate-zone-manager 1:6.0.1-0ubuntu1.2 all OpenStack DNS as a Service - zone manager
ii python-designate 1:6.0.1-0ubuntu1.2 all OpenStack DNS as a Service - Python libs
ii python-designateclient 2.9.0-0ubuntu1 all client library for the OpenStack Designate API - Python 2.7
root@juju-01d5ff-lp1768824-sru-5:~# mysql -h$host -u${service} -p$passwd ${service} -e'select count(*) from service_statuses where service_name="pool_manager"'
mysql: [Warning] Using a password on the command line interface can be insecure.
+----------+
| count(*) |
+----------+
| 1 |
+----------+
root@juju-01d5ff-lp1768824-sru-5:~# ts=`date '+%Y-%m-%d %H:%M:%S'`
root@juju-01d5ff-lp1768824-sru-5:~# svc_host=`mysql -B -h$host -u${service} -p$passwd ${service} -e 'select hostname from service_statuses where service_name="pool_manager";'| tail -n 1`
mysql: [Warning] Using a password on the command line interface can be insecure.
root@juju-01d5ff-lp1768824-sru-5:~# mysql -h$host -u${service} -p$passwd ${service} -e "insert into service_statuses values ('1234', '$ts', '$ts', 'pool_manager', '$svc_host', '$ts', 'UP', '{}', '{}');"
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1062 (23000) at line 1: Duplicate entry 'pool_manager-juju-01d5ff-lp1768824-sru-5' for key 'unique_service_status'
root@juju-01d5ff-lp1768824-sru-5:~# mysql -h$host -u${service} -p$passwd ${service} -e "select * from INFORMATION_SCHEMA.TABLE_CONSTRAINTS where CONSTRAINT_TYPE='UNIQUE';"| grep service_statuses
mysql: [Warning] Using a password on the command line interface can be insecure.
def designate unique_service_status designate service_statuses UNIQUE

tags: added: verification-done-bionic
removed: verification-needed-bionic

The verification of the Stable Release Update for designate has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package designate - 1:6.0.1-0ubuntu1.2

---------------
designate (1:6.0.1-0ubuntu1.2) bionic; urgency=medium

  [ Edward Hope-Morley ]
  * d/p/add-unique-constraint-to-service_statuses.patch:
    Remove use of tests.fixtures.StandardLogging since it is
    not supported in stable/queens.

designate (1:6.0.1-0ubuntu1.1) bionic; urgency=medium

  * Add patch to create service_statuses uniq constraint (LP: #1768824)
    d/p/add-unique-constraint-to-service_statuses.patch

 -- Corey Bryant <email address hidden> Fri, 12 Jul 2019 08:53:10 -0400

Changed in designate (Ubuntu Bionic):
status: Fix Committed → Fix Released
Edward Hope-Morley (hopem) wrote :

Xenial Queens verifie using [Test Case]

Test Output:

root@juju-c01a91-lp1768824-sru-1:~# mysql -h$host -u${service} -p$passwd ${service} -e'select count(*) from service_statuses where service_name="pool_manager";'
+----------+
| count(*) |
+----------+
| 1 |
+----------+
root@juju-c01a91-lp1768824-sru-1:~# ts=`date '+%Y-%m-%d %H:%M:%S'`
root@juju-c01a91-lp1768824-sru-1:~# svc_host=`mysql -B -h$host -u${service} -p$passwd ${service} -e 'select hostname from service_statuses where service_name="pool_manager";'| tail -n 1`
root@juju-c01a91-lp1768824-sru-1:~# mysql -h$host -u${service} -p$passwd ${service} -e "insert into service_statuses values ('1234', '$ts', '$ts', 'pool_manager', '$svc_host', '$ts', 'UP', '{}', '{}');"
ERROR 1062 (23000) at line 1: Duplicate entry 'pool_manager-juju-c01a91-lp1768824-sru-1' for key 'unique_service_status'
root@juju-c01a91-lp1768824-sru-1:~# mysql -h$host -u${service} -p$passwd ${service} -e "select * from INFORMATION_SCHEMA.TABLE_CONSTRAINTS where CONSTRAINT_TYPE='UNIQUE';"| grep service_statuses
def designate unique_service_status designate service_statuses UNIQUE

description: updated
tags: added: sts-sru-done verification-done verification-queens-done
removed: sts-sru-needed verification-needed verification-queens-needed
Corey Bryant (corey.bryant) wrote :

The verification of the Stable Release Update for designate has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Corey Bryant (corey.bryant) wrote :

This bug was fixed in the package designate - 1:6.0.1-0ubuntu1.2~cloud0
---------------

 designate (1:6.0.1-0ubuntu1.2~cloud0) xenial-queens; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 designate (1:6.0.1-0ubuntu1.2) bionic; urgency=medium
 .
   [ Edward Hope-Morley ]
   * d/p/add-unique-constraint-to-service_statuses.patch:
     Remove use of tests.fixtures.StandardLogging since it is
     not supported in stable/queens.
 .
 designate (1:6.0.1-0ubuntu1.1) bionic; urgency=medium
 .
   * Add patch to create service_statuses uniq constraint (LP: #1768824)
     d/p/add-unique-constraint-to-service_statuses.patch

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers