upgrade.sh scripts should wait until all upgraded containers are started

Bug #1453400 reported by Dennis Dmitriev
This bug affects 1 person
Affects             Status        Importance  Assigned to       Milestone
Fuel for OpenStack  Fix Released  High        Matthew Mosesohn

Bug Description

Reproduced on CI test 'deploy_ha_after_upgrade_diagnostic-logs': http://jenkins-product.srt.mirantis.net:8080/job/6.1.system_test.ubuntu.ceph_multinode_compact_neutron.upgrade/36/

Scenario:
    1) Install the Fuel v6.0 admin node;
    2) Unpack fuel-6.1-upgrade-*.lrz to /var;
    3) Start the upgrade: '/var/upgrade.sh --no-rollback --password admin'

The upgrade script can fail after starting the upgraded containers, with the following error (http://paste.openstack.org/show/218066/):

2015-05-09 04:04:34 ERROR 28299 (upgrade) OpenStackUpgrader: failed to upgrade: "502 Server Error: Bad Gateway"

The failure is intermittent: manual attempts to reproduce the bug succeed only from time to time.

Dennis Dmitriev (ddmitriev) wrote :
Changed in fuel:
importance: Undecided → High
Dmitry Pyzhov (dpyzhov) wrote :

The root cause here is the blinking of services in the containers' start.sh. We should either extend our timeout or remove the blinking.
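
For illustration, the "extend our timeout" option could look roughly like the sketch below. It is not the code used by the Fuel upgrade script. It assumes a hypothetical `check` callable (for example, one that sends an HTTP request to the service and returns False on a 502 response). Because a blinking service can answer once and then go down again, the sketch requires several consecutive successful checks before treating the service as started.

```python
import time


def wait_until_stable(check, required_successes=3, timeout=60.0,
                      interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it succeeds several times in a row or time runs out.

    check is a hypothetical callable supplied by the caller; it should
    return True when the service answers correctly. Returns True once
    required_successes consecutive checks pass, False on timeout.
    """
    deadline = clock() + timeout
    streak = 0
    while True:
        try:
            ok = bool(check())
        except Exception:
            # Any error (refused connection, bad gateway) means "not ready".
            ok = False
        # A single failure resets the streak, so a blinking service
        # is not reported as ready after one lucky response.
        streak = streak + 1 if ok else 0
        if streak >= required_successes:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting; a caller would normally leave them at their defaults.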

Changed in fuel:
status: New → Confirmed
Matthew Mosesohn (raytrac3r) wrote :

It seems we don't need to start nailgun during the puppet run in order to do syncdb and upload_fixtures. I'll disable starting supervisord until the end of start.sh to prevent this weird upgrade issue.
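
The ordering described above can be sketched as follows. This is only an illustration of the idea, not the actual start.sh or puppet logic: the step labels are hypothetical, and `run_step` stands in for whatever executes each step.

```python
def run_start_sequence(run_step):
    """Run one-off setup steps first and start the service manager last.

    run_step is a hypothetical callable that executes a named step and
    returns only when that step has finished.
    """
    # Database setup runs while the API application is still stopped,
    # so no client can reach a half-initialized service.
    for step in ("syncdb", "upload_fixtures"):
        run_step(step)
    # Only after setup is complete is the process supervisor started,
    # which in turn brings up the application.
    run_step("supervisord")
```

With this ordering the application starts once, after the database is ready, instead of starting early and being stopped again during setup.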

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Matthew Mosesohn (raytrac3r)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/182246

Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/182246
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=682eed618fbbb7b38d753a55672f50d599a2c441
Submitter: Jenkins
Branch: master

commit 682eed618fbbb7b38d753a55672f50d599a2c441
Author: Matthew Mosesohn <email address hidden>
Date: Tue May 12 14:33:58 2015 +0300

    Inhibit start of supervisord for nailgun

    Starting supervisord starts nailgun app too early
    before it is ready to serve requests. It will be
    started later in the docker container in the
    foreground. This prevents a race condition where
    DB upgrade takes place and nailgun is stopped.

    Change-Id: Icd27758f00e8f472041eb83019cb89dcb030afd1
    Closes-Bug: #1453400

Changed in fuel:
status: In Progress → Fix Committed
Maksym Strukov (unbelll) wrote :

Verified as fixed in 6.1-432

Changed in fuel:
status: Fix Committed → Fix Released