Postgres guest fails resize-instance

Bug #1445651 reported by Doug Shelley
This bug affects 1 person
Affects: OpenStack DBaaS (Trove)
Status: Fix Released
Importance: High
Assigned to: Alex Tomic

Bug Description

Doing some testing of PostgreSQL on Ubuntu 14.04. When I tried trove resize-instance, the Nova instance was stuck in "VERIFY_RESIZE" and the trove instance in "RESIZE".

Noticed this in trove-guestagent.log:
2015-04-17 19:26:11.807 1183 DEBUG trove.guestagent.datastore.postgresql.service.config [-] e78f2f59-e8ae-4fb2-afb1-66c3ca416efb: Polling for postgresql version. _get_psql_version /usr/lib/python2.7/dist-packages/trove/guestagent/datastore/postgresql/service/config.py:47
2015-04-17 19:26:11.808 1183 DEBUG trove.guestagent.datastore.postgresql.pgutil [-] Running as postgres: ('psql', '--version') execute /usr/lib/python2.7/dist-packages/trove/guestagent/datastore/postgresql/pgutil.py:29
2015-04-17 19:26:11.808 1183 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo -u postgres psql --version execute /usr/lib/python2.7/dist-packages/oslo_concurrency/processutils.py:199
2015-04-17 19:26:11.843 1183 DEBUG oslo_concurrency.processutils [-] CMD "sudo -u postgres psql --version" returned: 1 in 0.034s execute /usr/lib/python2.7/dist-packages/oslo_concurrency/processutils.py:225
2015-04-17 19:26:11.844 1183 DEBUG oslo_concurrency.processutils [-] u'sudo -u postgres psql --version' failed. Not Retrying. execute /usr/lib/python2.7/dist-packages/oslo_concurrency/processutils.py:258
2015-04-17 19:26:11.845 1183 ERROR oslo_messaging.rpc.dispatcher [-] Exception during message handling: Unexpected error while running command.
Command: sudo -u postgres psql --version
Exit code: 1
Stdout: u''
Stderr: u'Error: Cannot stat /var/run/postgresql\n'
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher executor_callback))
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher executor_callback)
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 105, in wrapper
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher return f(*args, **kwargs)
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/trove/guestagent/datastore/postgresql/service/config.py", line 121, in start_db_with_conf_changes
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher self.reset_configuration(context, config_contents)
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 105, in wrapper
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher return f(*args, **kwargs)
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/trove/guestagent/datastore/postgresql/service/config.py", line 61, in reset_configuration
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher version=self._get_psql_version(),
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/trove/guestagent/datastore/postgresql/service/config.py", line 50, in _get_psql_version
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher out, err = pgutil.execute('psql', '--version', timeout=30)
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/trove/guestagent/datastore/postgresql/pgutil.py", line 31, in execute
2015-04-17 19:26:11.845 1183 TRACE oslo_messaging.rpc.dispatcher "sudo", "-u", "postgres", *command, **kwargs

I believe the issue is that, at the start of resize-instance processing, Postgres is stopped and the service is disabled on boot. When nova then resizes and boots the instance, the /var/run/postgresql directory is missing. It appears that "psql --version" needs this directory to exist in order to run, and the directory doesn't get created until postgres is started.
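
To illustrate the failure mode described above, here is a minimal sketch of a pre-flight check that could run before "psql --version". It is not Trove's code: the helper names run_dir_present and psql_version are hypothetical, and the default path is taken from the Stderr line in the log. It only assumes that the missing runtime directory is what makes the wrapped psql exit with status 1.

```python
import os
import subprocess

# Path taken from the "Cannot stat /var/run/postgresql" Stderr line above.
DEFAULT_RUN_DIR = "/var/run/postgresql"


def run_dir_present(run_dir=DEFAULT_RUN_DIR):
    """Return True if the PostgreSQL runtime directory exists."""
    return os.path.isdir(run_dir)


def psql_version(run_dir=DEFAULT_RUN_DIR, timeout=30):
    """Run 'psql --version' as the postgres user (hypothetical helper).

    Fails early with an explicit message when the runtime directory is
    missing, instead of surfacing the wrapper's generic exit code 1.
    """
    if not run_dir_present(run_dir):
        raise RuntimeError(
            "%s is missing; the psql wrapper is expected to fail with "
            "'Cannot stat' until postgres has been started" % run_dir)
    out = subprocess.check_output(
        ["sudo", "-u", "postgres", "psql", "--version"], timeout=timeout)
    return out.decode().strip()
```

On a freshly resized guest, run_dir_present() returning False would confirm the diagnosis without having to read the guest agent traceback.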

Doug Shelley (0-doug)
Changed in trove:
assignee: nobody → Alex Tomic (atomic777)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to trove (master)

Fix proposed to branch: master
Review: https://review.openstack.org/176035

Changed in trove:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to trove (master)

Reviewed: https://review.openstack.org/176035
Committed: https://git.openstack.org/cgit/openstack/trove/commit/?id=c7b2fd683639fbc1682decab924b557ffbfefdc7
Submitter: Jenkins
Branch: master

commit c7b2fd683639fbc1682decab924b557ffbfefdc7
Author: Alex Tomic <email address hidden>
Date: Tue Apr 21 15:18:29 2015 -0400

    Add unix_socket_directories setting for pgsql

    When an existing pgsql instance is resized, we restart in a state
    where /var/run/postgresql does not exist and the postgresql
    service is initially disabled. If /var/run/postgresql doesn't exist
    and unix_socket_directories is not specified (as was the case for
    us), the ubuntu/debian scripts that wrap many postgresql binaries
    including psql will fail

    A bug 1446811 has been filed upstream to the maintainers of
    package postgresql-common.

    Change-Id: I69a8f13f3577b3c37a75d7323eaabdf396b5b5e7
    Closes-Bug: #1445651
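
The commit message above says the fix specifies unix_socket_directories explicitly. The sketch below shows one way such a setting could be guaranteed in a rendered postgresql.conf. It is an assumption-laden illustration, not the merged change: the helper ensure_socket_dir_setting is hypothetical, and the directory value is assumed from the path named in the commit message rather than read from the actual patch.

```python
import re

SETTING = "unix_socket_directories"


def ensure_socket_dir_setting(conf_text, directory="/var/run/postgresql"):
    """Return conf_text with an active unix_socket_directories line.

    Commented-out lines (starting with '#') do not count as active, so the
    setting is appended when only a commented default is present.
    """
    active = re.compile(r"^\s*%s\s*=" % SETTING, re.MULTILINE)
    if active.search(conf_text):
        return conf_text  # already set explicitly; leave untouched
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + "%s = '%s'\n" % (SETTING, directory)
```

The function is idempotent, so applying it on every configuration reset would not duplicate the line.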

Changed in trove:
status: In Progress → Fix Committed
Changed in trove:
milestone: none → liberty-1
importance: Undecided → High
Thierry Carrez (ttx)
Changed in trove:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in trove:
milestone: liberty-1 → 4.0.0