db-relation-changed hook fails, connection rejected for database postgres, SSL on

Bug #1850171 reported by Alexander Balderson
This bug affects 1 person
Affects: Landscape Charm
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

It looks like the charm moved from revision 32 to 35, and since then landscape-server has been failing to deploy on the standard solutions-qa stable test deployments with the following error:

2019-10-27 07:13:53 DEBUG db-relation-changed Traceback (most recent call last):
2019-10-27 07:13:53 DEBUG db-relation-changed File "/usr/bin/landscape-schema", line 12, in <module>
2019-10-27 07:13:53 DEBUG db-relation-changed canonical.landscape.scripts.schema.run()
2019-10-27 07:13:53 DEBUG db-relation-changed File "/opt/canonical/landscape/canonical/landscape/scripts/schema.py", line 281, in run
2019-10-27 07:13:53 DEBUG db-relation-changed _set_two_phase_commit_if_available(zstorm)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/opt/canonical/landscape/canonical/landscape/scripts/schema.py", line 532, in _set_two_phase_commit_if_available
2019-10-27 07:13:53 DEBUG db-relation-changed store = zstorm.get(name)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/usr/lib/python2.7/dist-packages/storm/zope/zstorm.py", line 177, in get
2019-10-27 07:13:53 DEBUG db-relation-changed return self.create(name, default_uri)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/usr/lib/python2.7/dist-packages/storm/zope/zstorm.py", line 153, in create
2019-10-27 07:13:53 DEBUG db-relation-changed store = Store(database)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/usr/lib/python2.7/dist-packages/storm/store.py", line 74, in __init__
2019-10-27 07:13:53 DEBUG db-relation-changed self._connection = database.connect(self._event)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/usr/lib/python2.7/dist-packages/storm/database.py", line 500, in connect
2019-10-27 07:13:53 DEBUG db-relation-changed return self.connection_factory(self, event)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/usr/lib/python2.7/dist-packages/storm/database.py", line 188, in __init__
2019-10-27 07:13:53 DEBUG db-relation-changed self._raw_connection = self._database.raw_connect()
2019-10-27 07:13:53 DEBUG db-relation-changed File "/usr/lib/python2.7/dist-packages/storm/databases/postgres.py", line 410, in raw_connect
2019-10-27 07:13:53 DEBUG db-relation-changed raw_connection = psycopg2.connect(self._dsn)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/usr/lib/python2.7/dist-packages/psycopg2/__init__.py", line 130, in connect
2019-10-27 07:13:53 DEBUG db-relation-changed conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
2019-10-27 07:13:53 DEBUG db-relation-changed psycopg2.OperationalError: FATAL: pg_hba.conf rejects connection for host "10.244.40.202", user "jujuadmin_landscape-server", database "postgres", SSL on
2019-10-27 07:13:53 DEBUG db-relation-changed FATAL: pg_hba.conf rejects connection for host "10.244.40.202", user "jujuadmin_landscape-server", database "postgres", SSL off
2019-10-27 07:13:53 DEBUG db-relation-changed
2019-10-27 07:13:53 DEBUG db-relation-changed Traceback (most recent call last):
2019-10-27 07:13:53 DEBUG db-relation-changed File "/var/lib/juju/agents/unit-landscape-server-2/charm/hooks/db-relation-changed", line 9, in <module>
2019-10-27 07:13:53 DEBUG db-relation-changed sys.exit(hook())
2019-10-27 07:13:53 DEBUG db-relation-changed File "/var/lib/juju/agents/unit-landscape-server-2/charm/hooks/lib/hook.py", line 24, in __call__
2019-10-27 07:13:53 DEBUG db-relation-changed self._run()
2019-10-27 07:13:53 DEBUG db-relation-changed File "/var/lib/juju/agents/unit-landscape-server-2/charm/hooks/lib/services.py", line 92, in _run
2019-10-27 07:13:53 DEBUG db-relation-changed manager.manage()
2019-10-27 07:13:53 DEBUG db-relation-changed File "/var/lib/juju/agents/unit-landscape-server-2/charm/hooks/charmhelpers/core/services/base.py", line 135, in manage
2019-10-27 07:13:53 DEBUG db-relation-changed self.reconfigure_services()
2019-10-27 07:13:53 DEBUG db-relation-changed File "/var/lib/juju/agents/unit-landscape-server-2/charm/hooks/charmhelpers/core/services/base.py", line 189, in reconfigure_services
2019-10-27 07:13:53 DEBUG db-relation-changed self.fire_event('data_ready', service_name)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/var/lib/juju/agents/unit-landscape-server-2/charm/hooks/charmhelpers/core/services/base.py", line 234, in fire_event
2019-10-27 07:13:53 DEBUG db-relation-changed callback(self, service_name, event_name)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/var/lib/juju/agents/unit-landscape-server-2/charm/hooks/lib/callbacks/scripts.py", line 41, in __call__
2019-10-27 07:13:53 DEBUG db-relation-changed self._run(SCHEMA_SCRIPT, options)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/var/lib/juju/agents/unit-landscape-server-2/charm/hooks/lib/callbacks/scripts.py", line 28, in _run
2019-10-27 07:13:53 DEBUG db-relation-changed self._subprocess.check_call(command)
2019-10-27 07:13:53 DEBUG db-relation-changed File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
2019-10-27 07:13:53 DEBUG db-relation-changed raise CalledProcessError(retcode, cmd)
2019-10-27 07:13:53 DEBUG db-relation-changed subprocess.CalledProcessError: Command '['/usr/bin/landscape-schema', '--bootstrap']' returned non-zero exit status 1

I'll attach the crashdump and bundle.yaml for this deployment.
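
When going through a crashdump, it can help to pull every pg_hba.conf rejection out of the unit log and see which host, user, database and SSL mode were refused. The following is a minimal sketch only; the regular expression is modelled on the two FATAL lines in the log above, and the names `PG_HBA_REJECT` and `find_pg_hba_rejections` are hypothetical, not part of the charm or of any Landscape tooling.

```python
import re

# Pattern modelled on the FATAL lines in the log above (an assumption:
# other PostgreSQL versions or locales may word the message differently).
PG_HBA_REJECT = re.compile(
    r'pg_hba\.conf rejects connection for host "(?P<host>[^"]+)", '
    r'user "(?P<user>[^"]+)", database "(?P<database>[^"]+)", '
    r'SSL (?P<ssl>on|off)'
)


def find_pg_hba_rejections(log_text):
    """Return one dict per rejection found in log_text."""
    return [match.groupdict() for match in PG_HBA_REJECT.finditer(log_text)]
```

Feeding it the text of a unit log returns a list of dictionaries with the keys `host`, `user`, `database` and `ssl`, which makes it easy to compare the rejected host against the addresses PostgreSQL accepted shortly afterwards.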

Alexander Balderson (asbalderson) wrote :

attaching bundle

Simon Poirier (simpoir) wrote :

Thank you for reporting this.
I am having trouble reproducing this issue locally. I've tried the landscape-scalable bundle on LXD at various scales but could not trigger the failure. It looks like a race condition. It also looks like a transient error, as the postgres logs contain successful connections from the same source host seconds later.
Could you confirm the latter assumption (that the deployment succeeds despite the error)?

There aren't that many changes in the charm, so I suspect this change could be the one to blame:

https://code.launchpad.net/~verterok/landscape-charm/support-postgresql-charm-v2-protocol/+merge/370150

Jason Hobbs (jason-hobbs) wrote :

Simon, we run all of our tests with retries in juju disabled. This is so we can catch race conditions like this, and because we need it to work reliably with juju-wait, which will fail if it sees even a transient error. So, any failure is a problem for us even if it would recover otherwise, and we don't know if this one does, because we never test with retries on.
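
With Juju-level hook retries turned off, the hook itself would have to tolerate the short window in which PostgreSQL has not yet granted access to the unit. One possible approach, shown here purely as a sketch and not as the charm's actual code or the eventual fix, is to retry the failing call a few times with backoff. `call_with_retries` and `TransientConnectionError` are hypothetical names; the latter stands in for whichever exception the real call raises.

```python
import time


class TransientConnectionError(Exception):
    """Stand-in for the real error (an assumption for this sketch)."""


def call_with_retries(func, attempts=5, delay=1.0, backoff=2.0,
                      retry_on=(TransientConnectionError,), sleep=time.sleep):
    """Call func(), retrying on retry_on errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: re-raise so the hook still fails visibly.
                raise
            sleep(delay)
            delay *= backoff
```

In a hook, the wrapped call might be `subprocess.check_call(['/usr/bin/landscape-schema', '--bootstrap'])` with `retry_on=(subprocess.CalledProcessError,)`. Retrying only hides the race, though; making the hook wait until the database relation reports that access has been granted would address the cause.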

Michael Skalka (mskalka) wrote :

Hit this again, crashdump attached.

Joshua Genet (genet022) wrote :

And here's another similar run that hit it.

Márton Kiss (marton-kiss) wrote :

I've experienced the same with rev 37, and the charm stayed in a permanent error status. Running juju resolve landscape-server solved the issue.

Stacktrace: https://pastebin.ubuntu.com/p/bD3GJCYMHh/

This error causes the larger bundle's automated deployment (juju-wait) to stop, and it requires manual intervention.
