Improper close connection to database leading to mysql/mariadb block connection.

Bug #1831018 reported by puthi
22
This bug affects 5 people
Affects Status Importance Assigned to Milestone
OpenStack Identity (keystone)
Invalid
Undecided
Unassigned
oslo.db
New
Undecided
Unassigned

Bug Description

Version
=======
Neutron-server: openstack-neutron-13.0.2-1.el7.noarch
Nova: openstack-nova-*-18.2.0-1.el7.noarch
keystone: openstack-keystone-14.0.1-2.el7.noarch
Mariadb: mariadb-server-10.1.20-2.el7.x86_64

Openstack setup:
================
HAproxy => 3 Controllers (nova,neutron,keystone) => Mariadb

Openstack config
================
connection_recycle_time = 3600 (default)
the rest of database connection are default.

Mariadb config
==============
interactive_timeout: 28800
wait_timeout: 28800

Error
=====
As my Openstack cluster grow, more and more people start using the cluster, i start seeing this error everyday now

2019-05-30 10:42:15.695 44938 ERROR oslo_db.sqlalchemy.exc_filters [req-b6fd59b9-8378-49df-bbf6-de9f9b741490 - - - - -] DBAPIError exception wrapped from (pymysql.err.InternalError) (1129, u"Host 'xx.xx.xx.xx' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'") (Background on this error at: http://sqlalche.me/e/2j85): InternalError: (1129, u"Host 'xx.xx.xx.xx' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'")

This is not necessary happens to only neutron but all of the components of openstack. And when i turn log_warnings=4 in Mariadb, I can see in the log of Mariadb as below:

2019-05-27 10:22:04 140078484511488 [Warning] Aborted connection 70834104 to db: 'nova' user: 'nova' host: 'controller3' (CLOSE_CONNECTION)
2019-05-27 10:22:05 140084673243904 [Warning] Aborted connection 70834111 to db: 'nova' user: 'nova' host: 'controller3' (CLOSE_CONNECTION)
2019-05-27 10:22:07 140078500485888 [Warning] Aborted connection 70834211 to db: 'nova' user: 'nova' host: 'controller3' (CLOSE_CONNECTION)
2019-05-27 10:22:07 140078490655488 [Warning] Aborted connection 70834157 to db: 'nova' user: 'nova' host: 'controller3' (CLOSE_CONNECTION)
2019-05-27 10:22:09 140078698322688 [Warning] Aborted connection 70834327 to db: 'nova' user: 'nova' host: 'controller3' (CLOSE_CONNECTION)
2019-05-27 10:22:12 140078715833088 [Warning] Aborted connection 70894166 to db: 'unconnected' user: 'neutron' host: 'controller2' (CLOSE_CONNECTION)
2019-05-27 10:22:13 140078737951488 [Warning] Aborted connection 70834380 to db: 'nova' user: 'nova' host: 'controller3' (CLOSE_CONNECTION)
2019-05-27 10:22:17 140078641797888 [Warning] Aborted connection 70834382 to db: 'nova' user: 'nova' host: 'controller3' (CLOSE_CONNECTION)
2019-05-27 10:22:21 140078581893888 [Warning] Aborted connection 70834436 to db: 'nova' user: 'nova' host: 'controller3' (CLOSE_CONNECTION)
2019-05-27 10:22:22 140078724434688 [Warning] Aborted connection 70834469 to db: 'nova' user: 'nova' host: 'controller3' (CLOSE_CONNECTION)
2019-05-27 10:22:28 140078715833088 [Warning] Aborted connection 70894174 to db: 'unconnected' user: 'unauthenticated' host: 'controller2' (CLOSE_CONNECTION)
2019-05-27 10:22:29 140078715833088 [Warning] Aborted connection 70894177 to db: 'neutron' user: 'neutron' host: 'controller2' (CLOSE_CONNECTION)
...
2019-05-30 7:35:28 140078596025088 [Warning] Aborted connection 72547571 to db: 'nova' user: 'nova' host: 'controller1' (Got an error reading communication packets)
2019-05-30 7:46:54 140078541036288 [Warning] Aborted connection 72552087 to db: 'nova_cell0' user: 'nova' host: 'controller1' (Got an error reading communication packets)
2019-05-30 7:46:57 140078799182592 [Warning] Aborted connection 72552086 to db: 'nova' user: 'nova' host: 'controller1' (Got an error reading communication packets)
2019-05-30 7:47:02 140078738565888 [Warning] Aborted connection 72534613 to db: 'nova_cell0' user: 'nova' host: 'controller1' (Got an error reading communication packets)
2019-05-30 8:31:11 140078638418688 [Warning] Aborted connection 72419897 to db: 'nova' user: 'nova' host: 'controller3' (Got timeout reading communication packets)
2019-05-30 8:36:22 140078791195392 [Warning] Aborted connection 72421900 to db: 'nova' user: 'nova' host: 'controller2' (Got timeout reading communication packets)
2019-05-30 8:46:23 140078624594688 [Warning] Aborted connection 72577413 to db: 'nova_cell0' user: 'nova' host: 'controller1' (Got an error reading communication packets)
2019-05-30 8:46:26 140078716447488 [Warning] Aborted connection 72577414 to db: 'nova' user: 'nova' host: 'controller1' (Got an error reading communication packets)
2019-05-30 10:45:23 140078661151488 [Warning] Aborted connection 72675103 to db: 'neutron' user: 'neutron' host: 'controller3' (Got an error reading communication packets)
2019-05-30 10:45:23 140078672517888 [Warning] Aborted connection 72675137 to db: 'neutron' user: 'neutron' host: 'controller3' (Got an error reading communication packets)
2019-05-30 10:45:23 140078768155392 [Warning] Aborted connection 72674638 to db: 'neutron' user: 'neutron' host: 'controller3' (Got an error reading communication packets)
2019-05-30 10:45:23 140078647327488 [Warning] Aborted connection 72674581 to db: 'neutron' user: 'neutron' host: 'controller3' (Got an error reading communication packets)

I also notice that every times i restart any services (nova/neutron/keystone) i can see that warning "Got an error reading communication packets" appear in Mariadb log. And i think when the connection is properly closed, the Mariadb log will give "CLOSE_CONNECTION" instead of that error.
So, i wonder if this is a bug in Openstack Rocky version as i don't see this behavior in other version.

I'm not sure this is the right place but i also report the bug here (https://bugs.launchpad.net/neutron/+bug/1831009)

Revision history for this message
Manuel Torrinha (t0rrant) wrote :

I can also see this behaviour.

It seems to happen not only on service restarts, but when performing `db sync` operations for various services:

  - keystone
  - nova
  - neutron
  - cinder
  - glance

When performing these `db sync` operations the message is as follows:

```
191202 16:24:14 [Warning] Aborted connection 7509 to db: 'keystone' user: 'keystone' host: 'localhost' (Got an error reading communication packets)
```

It's similar for all other services.

Revision history for this message
Colleen Murphy (krinkle) wrote :

Marking as invalid for keystone, this is a problem in oslo.db or sqlalchemy.

Changed in keystone:
status: New → Invalid
Revision history for this message
Mike Bayer (zzzeek) wrote :

hey there -

oslo.db has no special thing it does with connections here and there's been no change to how it connects or anything like that. these look like network errors. Are there any stack traces to be shared from application logs? To understand anything occurring we would need to see these errors from the perspective of the pymysql client library.

Revision history for this message
Mike Bayer (zzzeek) wrote :

also in this case, as this is for generic openstack, what's the deployment pattern here:

1. tripleo in use?

2. haproxy in use?

3. galera in use?

4. other deployment thing in use ?

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.