Exceeding max connections causes update-status hook error, which can block mojo deploys

Bug #1811151 reported by Casey Marshall
This bug affects 1 person
Affects: PostgreSQL Charm
Status: Triaged
Importance: Medium
Assigned to: Unassigned

Bug Description

Ran into a weird situation earlier today where a problem in our deployment caused the maximum number of allowed database connections to be exceeded. The cause was that the rate limit on our app's charm was not configured. That rate limit would normally protect the database from getting thrashed by a spike in traffic. (Rate limiting is acceptable here because this traffic is non-interactive messaging.)

We tried to roll out a mojo update to address the issue, but couldn't, because the postgresql charm was in an error state. The error state appears to have been caused by the same connection exhaustion:

2019-01-09 15:02:56 DEBUG update-status Traceback (most recent call last):
2019-01-09 15:02:56 DEBUG update-status File "/var/lib/juju/agents/unit-postgresql-0/charm/hooks/update-status", line 19, in <module>
2019-01-09 15:02:56 DEBUG update-status main()
2019-01-09 15:02:56 DEBUG update-status File "/usr/local/lib/python3.5/dist-packages/charms/reactive/__init__.py", line 78, in main
2019-01-09 15:02:56 DEBUG update-status bus.dispatch()
2019-01-09 15:02:56 DEBUG update-status File "/usr/local/lib/python3.5/dist-packages/charms/reactive/bus.py", line 423, in dispatch
2019-01-09 15:02:56 DEBUG update-status _invoke(other_handlers)
2019-01-09 15:02:56 DEBUG update-status File "/usr/local/lib/python3.5/dist-packages/charms/reactive/bus.py", line 406, in _invoke
2019-01-09 15:02:56 DEBUG update-status handler.invoke()
2019-01-09 15:02:56 DEBUG update-status File "/usr/local/lib/python3.5/dist-packages/charms/reactive/bus.py", line 280, in invoke
2019-01-09 15:02:56 DEBUG update-status self._action(*args)
2019-01-09 15:02:56 DEBUG update-status File "/var/lib/juju/agents/unit-postgresql-0/charm/reactive/postgresql/client.py", line 60, in master_provides
2019-01-09 15:02:56 DEBUG update-status ensure_db_relation_resources(rel)
2019-01-09 15:02:56 DEBUG update-status File "/usr/local/lib/python3.5/dist-packages/charms/reactive/decorators.py", line 203, in _wrapped
2019-01-09 15:02:56 DEBUG update-status return func(*args, **kwargs)
2019-01-09 15:02:56 DEBUG update-status File "/var/lib/juju/agents/unit-postgresql-0/charm/reactive/postgresql/client.py", line 253, in ensure_db_relation_resources
2019-01-09 15:02:56 DEBUG update-status con = postgresql.connect(database=master['database'])
2019-01-09 15:02:56 DEBUG update-status File "/var/lib/juju/agents/unit-postgresql-0/charm/reactive/postgresql/postgresql.py", line 116, in connect
2019-01-09 15:02:56 DEBUG update-status host=host, port=port_)
2019-01-09 15:02:56 DEBUG update-status File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 164, in connect
2019-01-09 15:02:56 DEBUG update-status conn = _connect(dsn, connection_factory=connection_factory, async=async)
2019-01-09 15:02:56 DEBUG update-status psycopg2.OperationalError: FATAL: sorry, too many clients already

We were able to work around this by temporarily taking the app server down.
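To illustrate what "not going into an error state" might look like, here is a minimal sketch, not the charm's actual code. It wraps a connection attempt and returns None when the server reports that it has no free slots, so a caller such as an update-status handler could report a blocked or waiting status instead of raising. The helper name `connect_or_none` and the idea of matching on the message text from the traceback above are assumptions made for illustration. Matching on message text is fragile, and any other error is re-raised unchanged.

```python
def connect_or_none(connect, error_type, marker="too many clients"):
    """Call connect() and return its result.

    Returns None when connect() raises error_type with a message that
    contains marker (the "too many clients already" case). Any other
    exception propagates to the caller.

    connect, error_type and marker are hypothetical parameters chosen
    for this sketch; they are not part of the charm or of psycopg2.
    """
    try:
        return connect()
    except error_type as exc:
        if marker in str(exc):
            # Out of connection slots: signal "try again later".
            return None
        raise
```

With psycopg2 this might be called as `connect_or_none(lambda: psycopg2.connect(dbname="postgres"), psycopg2.OperationalError)`, with the caller skipping its database work for that hook run when the result is None.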

Casey Marshall (cmars)
summary: - Exceeding max connections causes update-status hook to fail, which can
+ Exceeding max connections causes update-status hook error, which can
block mojo deploys
Stuart Bishop (stub) wrote :

If regular DB connections are being used, the default setting of superuser_reserved_connections=3 should avoid this.

If superuser connections are being used, we can't avoid running out of slots.

Need to investigate if running out of slots can be handled better.

Per-user connection limits might help in cases where the clients are using the provided credentials.
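As a rough sketch of the per-user limit idea, the helper below computes how many slots remain for ordinary roles once superuser_reserved_connections is subtracted from max_connections. It then emits one `ALTER ROLE ... CONNECTION LIMIT` statement per client role. The function names, the even split between roles, and the example role name are assumptions for illustration only; a real deployment would likely want different limits per client.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def role_limit_statements(roles, max_connections,
                          superuser_reserved_connections=3):
    """Return ALTER ROLE statements capping each role's connections.

    Slots usable by ordinary roles are max_connections minus
    superuser_reserved_connections. This sketch splits them evenly
    (an assumption, not a recommendation from the bug report).
    """
    usable = max_connections - superuser_reserved_connections
    if not roles or usable < len(roles):
        raise ValueError("not enough connection slots for the given roles")
    per_role = usable // len(roles)
    return [
        "ALTER ROLE {} CONNECTION LIMIT {}".format(quote_ident(role), per_role)
        for role in roles
    ]
```

For example, with max_connections=100 and a single client role named "app", this yields one statement limiting that role to 97 connections, leaving the three reserved slots free for superuser use.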

Stuart Bishop (stub) wrote :

The general recommendation for apps that behave this way (opening a connection per request) is to place pgbouncer between the client and PostgreSQL. This greatly improves performance under any sort of load.

Changed in postgresql-charm:
status: New → Triaged
importance: Undecided → Critical
importance: Critical → Medium