Newton upgrade fails due to bind mount of /var/log

Bug #1625722 reported by James Denton
This bug affects 2 people
Affects: OpenStack-Ansible
Status: Fix Released
Importance: High
Assigned to: Andy McCrae

Bug Description

I recently performed an upgrade from a mid-August Newton deploy to a recent master deploy. Various tasks failed because the respective services could not start inside their containers. Issues were experienced with the Neutron, Galera, and Nova containers, among others.

The issue seems to be that the /var/log directory is now a bind mount to the host. As a result, log directories no longer exist when the container is restarted during the upgrade, and many services refuse to start. For some services, creating the respective log directory is enough. A more reliable method is to monkey with the mounts and copy the data from /var/log in the container to the host, so that it is presented through the bind mount afterwards. An example of how this was done for Galera can be seen here: http://paste.openstack.org/show/582280/
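The copy-to-host idea can be sketched roughly as below. This is only an illustration, not the contents of the paste linked above: the function name seed_log_dirs and both path arguments are hypothetical, and it assumes the container's original /var/log tree is reachable from the host (for example through its rootfs) while the bind mount is not in the way.

```python
import os
import shutil


def seed_log_dirs(container_log_root, host_log_root):
    """Recreate the container's log directory tree under the host directory
    that backs the bind mount, copying files that are not already there.

    Both arguments are hypothetical paths supplied by the caller.
    Returns the list of directories that had to be created.
    """
    created = []
    for current, _dirnames, filenames in os.walk(container_log_root):
        rel = os.path.relpath(current, container_log_root)
        target = os.path.normpath(os.path.join(host_log_root, rel))
        if not os.path.isdir(target):
            os.makedirs(target)
            # Carry over mode and timestamps from the original directory.
            shutil.copystat(current, target)
            # Try to carry over ownership too, since services usually write
            # logs as their own user; this needs root to change owners.
            st = os.stat(current)
            try:
                os.chown(target, st.st_uid, st.st_gid)
            except OSError:
                pass
            created.append(target)
        for name in filenames:
            dst = os.path.join(target, name)
            # Never overwrite a log file that already exists on the host.
            if not os.path.exists(dst):
                shutil.copy2(os.path.join(current, name), dst)
    return created
```

Calling seed_log_dirs with the container's old log directory and the host directory behind the bind mount would be done before restarting the container. Copied files keep their mode but not necessarily their owner, so a chown of the files may still be needed for some services.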

Another user upgrading from Mitaka/stable reports the same issue during upgrade.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (master)

Fix proposed to branch: master
Review: https://review.openstack.org/373561

Changed in openstack-ansible:
assignee: nobody → Kevin Carter (kevin-carter)
status: New → In Progress
Changed in openstack-ansible:
importance: Undecided → High
Revision history for this message
Castulo J. Martinez (castulo-martinez) wrote :

I just tried upgrading a stable/mitaka OpenStack deployment on an OnMetal server to master, and it failed with the errors below, most of them related to Galera:

TASK [galera_server : Confirm service connectivity] ****************************
FAILED - RETRYING: TASK: galera_server : Confirm service connectivity (1 retries left).
changed: [infra01_galera_container-f664885c]

cmd: /usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping

start: 2016-09-22 00:59:21.388814

end: 2016-09-22 00:59:21.392056

delta: 0:00:00.003242

stderr: /usr/bin/mysqladmin: connect to server at 'localhost' failed
error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111 "Connection refused")'
Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!

TASK [galera_server : Check that WSREP is ready] *******************************
skipping: [infra01_galera_container-f664885c]

...
...

TASK [galera_server : Gather mysql facts] **************************************
fatal: [infra01_galera_container-f664885c]: FAILED! => {"changed": false, "failed": true, "msg": "Mysql fact collection failed: \"ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1' (111 \"Connection refused\")\"."}
...ignoring

msg: Mysql fact collection failed: "ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1' (111 "Connection refused")".

TASK [galera_server : Check for cluster state failure] *************************
skipping: [infra01_galera_container-f664885c] => (item=infra01_galera_container-f664885c)

...
...

TASK [galera_server : Install galera packages] *********************************
FAILED - RETRYING: TASK: galera_server : Install galera packages (5 retries left).
FAILED - RETRYING: TASK: galera_server : Install galera packages (4 retries left).
FAILED - RETRYING: TASK: galera_server : Install galera packages (3 retries left).
FAILED - RETRYING: TASK: galera_server : Install galera packages (2 retries left).
FAILED - RETRYING: TASK: galera_server : Install galera packages (1 retries left).
failed: [infra01_galera_container-f664885c] (item=[u'mariadb-client', u'mariadb-galera-server-10.0', u'galera-3', u'rsync', u'socat']) => {"cache_update_time": 0, "cache_updated": false, "failed": true, "item": ["mariadb-client", "mariadb-galera-server-10.0", "galera-3", "rsync", "socat"], "msg": "'/usr/bin/apt-get -y -o \"Dpkg::Options::=--force-confdef\" -o \"Dpkg::Options::=--force-confold\" install 'mariadb-galera-server-10.0'' failed: E: Package 'mariadb-gal...


Changed in openstack-ansible:
assignee: Kevin Carter (kevin-carter) → Andy McCrae (andrew-mccrae)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (master)

Reviewed: https://review.openstack.org/373561
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=d98eb49edc4934ac3a18d49748d2b569e9db7520
Submitter: Jenkins
Branch: master

commit d98eb49edc4934ac3a18d49748d2b569e9db7520
Author: Kevin Carter <email address hidden>
Date: Tue Sep 20 19:11:15 2016 -0500

    Change default log bind mount to be optional

    The default log bind mount will cause service disruption as containers
    will need to be rebooted on top of the fact that some services will not
    start until log directories are created for the service to write to.
    While this is not an issue on greenfield it is a problem on all
    upgrades.

    Closes-Bug: #1625722
    Change-Id: Ic2345012be9d6b46a33da6bd03ccc397d5655a50
    Signed-off-by: Kevin Carter <email address hidden>

Changed in openstack-ansible:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible 14.0.0.0rc2

This issue was fixed in the openstack/openstack-ansible 14.0.0.0rc2 release candidate.
