[upgrade] octane upgrade-node (controllers) fails on ceph_ready_check task

Bug #1622642 reported by Maksim Shkrebtan
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Fuel for OpenStack | Invalid | Critical | Sergey Abramov |

Bug Description

Detailed bug description:

I am trying to upgrade a 7.0 environment to 9.0.

My env consists of:
- 3 computes + ceph-osd;
- 3 controllers;
- 1 mongodb node.

Steps to reproduce:

1. Create a seed environment with `octane upgrade-env`.
2. Upgrade the primary controller.
3. Upgrade the control plane.
4. Upgrade the rest of the controllers.

Expected results:
The upgrade finishes successfully.

Actual result:
The upgrade fails during deployment on the `ceph_ready_check` task:

2016-09-12 12:26:56 +0000 /Stage[main]/Osnailyfacter::Ceph::Enable_rados/Osnailyfacter::Wait_for_backend[object-storage]/Haproxy_backend_status[object-storage]/ensure (err): change from down to up failed: Timeout waiting for backend: 'object-storage' status to become: 'up' after 600 seconds!
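
The error above is the result of a wait loop that polls a backend's status until it becomes 'up' or a timeout expires. The following is a minimal Python sketch of that pattern for illustration only; it is not the Puppet `Haproxy_backend_status` provider. The function name `wait_for_backend` and the `get_status` callable are hypothetical, and the 600-second default is taken from the log line.

```python
import time


def wait_for_backend(get_status, expected="up", timeout=600, interval=10):
    """Poll get_status() until it returns `expected` or `timeout` seconds pass.

    get_status is a hypothetical callable that returns the current backend
    status as a string (for example 'up' or 'down').
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == expected:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Timeout waiting for backend status to become: "
                f"'{expected}' after {timeout} seconds (last status: '{status}')"
            )
        time.sleep(interval)
```

A loop like this only reports that the backend never came up; the underlying cause (here, eventually traced to clock problems on the Ceph nodes) has to be found elsewhere.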

# shotgun2 short-report
cat /etc/fuel_build_id:
 495
cat /etc/fuel_build_number:
 495
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-9.0.0-1.mos6349.noarch
 fuel-misc-9.0.0-1.mos8460.noarch
 python-packetary-9.0.0-1.mos140.noarch
 fuel-bootstrap-cli-9.0.0-1.mos285.noarch
 fuel-migrate-9.0.0-1.mos8460.noarch
 fuel-nailgun-extension-cluster-upgrade-9.1-1.mos76.git.cc57647.noarch
 rubygem-astute-9.0.0-1.mos765.git.3a15d8a.noarch
 shotgun-9.0.0-1.mos90.noarch
 fuel-notify-9.0.0-1.mos8460.noarch
 nailgun-mcagents-9.0.0-1.mos750.noarch
 python-fuelclient-9.0.0-1.mos325.noarch
 fuel-9.0.0-1.mos6349.noarch
 fuel-utils-9.0.0-1.mos8460.noarch
 fuel-setup-9.0.0-1.mos6349.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8743.noarch
 fuel-library9.0-9.0.0-1.mos8460.noarch
 network-checker-9.0.0-1.mos74.x86_64
 fuel-agent-9.0.0-1.mos285.noarch
 fuel-ui-9.0.0-1.mos2717.noarch
 fuel-ostf-9.0.0-1.mos936.noarch
 fuel-octane-9.0.0-1.mos1313.git.378a8d7.noarch
 fuel-nailgun-9.0.0-1.mos8817.git.42f21eb.noarch
 fuelmenu-9.0.0-1.mos274.noarch
 fuel-mirror-9.0.0-1.mos140.noarch
 fuel-openstack-metadata-9.0.0-1.mos8743.noarch

Maksim Shkrebtan (mshkrebtan-9) wrote :
sryabin (sryabin)
Changed in fuel:
milestone: none → 9.1
importance: Undecided → Critical
status: New → Confirmed
description: updated
sryabin (sryabin)
Changed in fuel:
assignee: nobody → Sergey Abramov (sabramov)
assignee: Sergey Abramov (sabramov) → nobody
assignee: nobody → Sergey Abramov (sabramov)
tags: added: blocker-for-qa
Ilya Kharin (akscram)
Changed in fuel:
status: Confirmed → In Progress
Ilya Kharin (akscram) wrote :

It seems that this problem has the same cause as this one: https://bugs.launchpad.net/fuel/+bug/1622543

Vladimir Khlyunev (vkhlyunev) wrote :

I did a clean upgrade run without snapshotting my env, and the issue did not reproduce. We will spend a bit of time investigating, but it is probably a ceph + snapshot + time-sync issue.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-qa (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/371690

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-qa (stable/mitaka)

Reviewed: https://review.openstack.org/370294
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=18bbd886e6751ff6bd4097f0c901a8303084f7cc
Submitter: Jenkins
Branch: stable/mitaka

commit 18bbd886e6751ff6bd4097f0c901a8303084f7cc
Author: Vladimir Khlyunev <email address hidden>
Date: Wed Sep 14 19:40:59 2016 +0300

    Remove mandatory snapshoting in cluster upgrade

    During testing we got several issues in ceph + clock screw.
    We need snapshots only for development testing, testing on CI
    can be done without snapshots. Removing is_make=True argument
    and add additional sync_time call if we are in debugging mode.

    Related-bug:1622642
    Change-Id: I28c5c2a84a0c732d01f78d09deeb86d81a295fd5

tags: added: in-stable-mitaka
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-qa (master)

Reviewed: https://review.openstack.org/371690
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=62940c9768eab784c82cc2632d5a82839d37eef9
Submitter: Jenkins
Branch: master

commit 62940c9768eab784c82cc2632d5a82839d37eef9
Author: Vladimir Khlyunev <email address hidden>
Date: Wed Sep 14 19:40:59 2016 +0300

    Remove mandatory snapshoting in cluster upgrade

    During testing we got several issues in ceph + clock screw.
    We need snapshots only for development testing, testing on CI
    can be done without snapshots. Removing is_make=True argument
    and add additional sync_time call if we are in debugging mode.

    Related-bug:1622642
    Change-Id: I28c5c2a84a0c732d01f78d09deeb86d81a295fd5

Vladimir Khlyunev (vkhlyunev) wrote :

After https://review.openstack.org/370294 the problem is gone, so this really is ceph + incorrect time.
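
For anyone hitting a similar symptom, one quick way to check for the time problem is to look for clock-skew warnings in the monitor health output. Below is a rough Python sketch that scans the text printed by `ceph health detail` for such warnings. The helper name `mons_with_clock_skew` is hypothetical, and the assumed message wording ("clock skew detected on mon.<name>") may differ between Ceph releases.

```python
import re


def mons_with_clock_skew(health_text):
    """Return monitor names mentioned on lines that report clock skew.

    health_text is the captured output of `ceph health` or
    `ceph health detail`; the message format is an assumption.
    """
    mons = []
    for line in health_text.splitlines():
        if "clock skew" not in line:
            continue
        # Collect every 'mon.<name>' on the line, keeping first-seen order.
        for name in re.findall(r"\bmon\.([\w-]+)", line):
            if name not in mons:
                mons.append(name)
    return mons
```

An empty list means no skew warning was found in the given text; it does not prove that the clocks are in sync.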

Changed in fuel:
status: In Progress → Invalid