Comment 0 for bug 1768824

Revision history for this message
Dmitry Galkin (galkindmitrii) wrote : service_statuses table running full in Designate database

Hi,

The service_statuses table in Designate database is running full of records in our deployment:

MariaDB [designate]> select count(*) from service_statuses;

+----------+
| count(*) |
+----------+
| 24474342 |
+----------+
1 row in set (7 min 19.09 sec)

We got millions of rows in just couple of month. The problem is that the same services running on the same hosts create new record (instead of updating existing) during status report to Designate.
This is how it looks in DB:

MariaDB [designate]> select * from service_statuses;
+----------------------------------+---------------------+---------------------+--------------+------------------------------------+---------------------+--------+-------+--------------+
| id | created_at | updated_at | service_name | hostname | heartbeated_at | status | stats | capabilities |
+----------------------------------+---------------------+---------------------+--------------+------------------------------------+---------------------+--------+-------+--------------+
| 0dde2b5f228549d5995cb0338841bd50 | 2018-05-02 12:06:03 | NULL | producer | designate-producer-855855776-cr8d9 | 2018-05-02 12:06:03 | UP | {} | {} |
| 0e311d3000d8403d97066eba619490a3 | 2018-05-02 12:05:14 | NULL | api | designate-api-2042646259-6090v | 2018-05-02 12:05:13 | UP | {} | {} |
| 168448cd97cd428ea19318243570482c | 2018-05-02 12:05:48 | NULL | producer | designate-producer-855855776-cr8d9 | 2018-05-02 12:05:48 | UP | {} | {} |
| 1685d7f80d8c4f75b052680e5e2f40ae | 2018-05-02 12:05:59 | NULL | api | designate-api-2042646259-6090v | 2018-05-02 12:05:58 | UP | {} | {} |
| 192275eb33854b4091b981b0c32d04f7 | 2018-05-02 12:05:41 | NULL | worker | designate-worker-3446544-7fzqx | 2018-05-02 12:05:35 | UP | {} | {} |
| 1e465011f21f47f096b54005675e8011 | 2018-05-02 12:05:25 | NULL | mdns | designate-mdns-4198843580-lw6s2 | 2018-05-02 12:05:25 | UP | {} | {} |
| 22e0ab87b3cd4228bc191e49923d13ba | 2018-05-02 12:05:58 | NULL | producer | designate-producer-855855776-cr8d9 | 2018-05-02 12:05:58 | UP | {} | {} |
| 284d6b6bf53f47358fa14750d0c2a181 | 2018-05-02 12:05:18 | NULL | producer | designate-producer-855855776-cr8d9 | 2018-05-02 12:05:18 | UP | {} | {} |
| 3617ac740b9945c09f65f4b8cf0a72b5 | 2018-05-02 12:05:25 | NULL | api | designate-api-2042646259-drzws | 2018-05-02 12:05:25 | UP | {} | {} |
| 3880157829f946398159f104b13a066b | 2018-05-02 12:06:00 | NULL | mdns | designate-mdns-4198843580-lw6s2 | 2018-05-02 12:06:00 | UP | {} | {} |

The respective part of the code is:
https://github.com/openstack/designate/blob/master/designate/central/service.py#L2879-L2898

It is not exactly clear to me why the id is needed to identify the service. The 'hostname' and 'service_name' seem to be enough - it is rather unlikely that there are two api services are running on the same host..

So, I've removed the id part:

        if service_status.obj_attr_is_set('id'):
            criterion["id"] = service_status.id

and cleaned up the service_statuses table, now it looks fine:

MariaDB [designate]> select * from service_statuses;
+----------------------------------+---------------------+---------------------+--------------+-------------------------------------+---------------------+--------+-------+--------------+
| id | created_at | updated_at | service_name | hostname | heartbeated_at | status | stats | capabilities |
+----------------------------------+---------------------+---------------------+--------------+-------------------------------------+---------------------+--------+-------+--------------+
| 1e89d10b0ddb459f8a9522ebd62fa629 | 2018-05-02 13:03:21 | 2018-05-02 13:03:37 | worker | designate-worker-2851226431-9274c | 2018-05-02 13:03:36 | UP | {} | {} |
| 8d29d50270ee48999b78e90c1b88acf8 | 2018-05-02 13:03:23 | 2018-05-02 13:03:38 | api | designate-api-1699569558-9rvg6 | 2018-05-02 13:03:38 | UP | {} | {} |
| 9602ccf55d204c7b86277e21af3d739f | 2018-05-02 13:03:18 | 2018-05-02 13:03:38 | producer | designate-producer-3892623035-6vwt0 | 2018-05-02 13:03:38 | UP | {} | {} |
| cec705e082c542cdbfa83b9ed0c2b438 | 2018-05-02 13:03:19 | 2018-05-02 13:03:39 | api | designate-api-1699569558-r018m | 2018-05-02 13:03:39 | UP | {} | {} |
| ddd48f13b67448d2949ced39d76e2958 | 2018-05-02 13:03:20 | 2018-05-02 13:03:40 | central | designate-central-2632925199-wwj7j | 2018-05-02 13:03:40 | UP | {} | {} |
| ef0ae921a78d4ac090dc74a1dc686921 | 2018-05-02 13:03:21 | 2018-05-02 13:03:41 | mdns | designate-mdns-4162010419-pvclv | 2018-05-02 13:03:41 | UP | {} | {} |
+----------------------------------+---------------------+---------------------+--------------+-------------------------------------+---------------------+--------+-------+--------------+
6 rows in set (0.00 sec)

Our Designate version is Pike, but the code seems to be the same in master branch.