Fuel for OpenStack

[upgrade] Upgrade script restores old DB dump when upgrading a second time

Bug #1349833 reported by Artem Panchenko on 2014-07-29

This bug affects 2 people

	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	Invalid	High	Andrey Sledzinskiy	Fuel for OpenStack 5.1
5.0.x	Fix Committed	High	Evgeniy L	Fuel for OpenStack 5.0.1
6.0.x	Invalid	High	Andrey Sledzinskiy	Fuel for OpenStack 6.0

Bug Description

Steps to reproduce:

1. Run upgrade from 5.0 to 5.1 and interrupt it during launch of new 5.1 containers (in my case it failed because of https://bugs.launchpad.net/fuel/+bug/1349287). Automatic rollback successfully recovered old 5.0 container and everything works fine.
2. Run upgrade from 5.0 to 5.0.1 and it is successful.
3. Deploy a new environment using the 5.0.1 release.
4. Run upgrade from 5.0.1 to 5.1 and it is successful. Check the environment deployed in step #3.

Expected result:

- environment exists and cluster works fine

Actual result:

- created environment doesn't exist (in nailgun DB)

When you start a Fuel upgrade to some X.Y version, the upgrade script copies /etc/fuel/version.yaml to the /var/lib/fuel_upgrade/X.Y/ directory and uses it for further upgrades. The interrupted run in step #1 had apparently already saved the 5.0 version file there, so during the upgrade from 5.0.1 to 5.1 (step #4) the script tried to dump the PostgreSQL database from the old 5.0 container. That container was down (the 5.0.1 containers were running), so the script restored the already existing dump of the DB. Here is the relevant part of the upgrade log:

2014-07-29 10:53:01 DEBUG 43074 (docker_engine) Backup database
2014-07-29 10:53:01 DEBUG 43074 (docker_engine) Failed to make database dump, will be used dump from previous run: Cannot find running container with name "fuel-core-5.0-postgres"
2014-07-29 10:53:01 DEBUG 43074 (utils) Check if file "/var/lib/fuel_upgrade/5.1/pg_dump_all.sql.1" matches to pattern "['-- PostgreSQL database cluster dump', '-- PostgreSQL database dump', '-- PostgreSQL database dump complete', '-- PostgreSQL database cluster dump complete']"
2014-07-29 10:53:01 DEBUG 43074 (utils) Creating hardlink "/var/lib/fuel_upgrade/5.1/pg_dump_all.sql.1" -> "/var/lib/fuel_upgrade/5.1/pg_dump_all.sql" [overwrite=1]
2014-07-29 10:53:01 DEBUG 43074 (utils) Remove file "/var/lib/fuel_upgrade/5.1/pg_dump_all.sql"
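
To make the fallback in this log easier to follow, here is a minimal sketch of the behaviour it suggests: try to make a fresh dump, and if that fails, reuse the previous dump provided it looks complete. The marker strings come from the log above; the function names, the "every marker must appear as a full line" rule, and the use of RuntimeError are assumptions for illustration, not the actual fuel_upgrade code.

```python
import os

# Marker lines taken from the upgrade log above.
DUMP_MARKERS = (
    "-- PostgreSQL database cluster dump",
    "-- PostgreSQL database dump",
    "-- PostgreSQL database dump complete",
    "-- PostgreSQL database cluster dump complete",
)


def looks_like_complete_dump(path, markers=DUMP_MARKERS):
    """Return True if the file exists and contains every marker line.

    Assumption: a dump counts as complete only when all markers are present.
    """
    if not os.path.isfile(path):
        return False
    with open(path) as dump_file:
        lines = set(line.strip() for line in dump_file)
    return all(marker in lines for marker in markers)


def choose_dump(make_dump, dump_path, previous_dump_path):
    """Try a fresh dump; fall back to the previous one if it looks valid.

    `make_dump` is a hypothetical callable that writes a dump to `dump_path`
    and raises RuntimeError when the source container is not running.
    """
    try:
        make_dump(dump_path)
        return dump_path, "fresh"
    except RuntimeError:
        if looks_like_complete_dump(previous_dump_path):
            # This is the branch hit in the log: a stale but well-formed
            # dump from an earlier run is silently reused.
            return previous_dump_path, "previous"
        raise
```

The point of the sketch is that a well-formed dump from an earlier run passes the check even though its data is out of date, which matches the data loss described here.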

I guess we should remove /var/lib/fuel_upgrade/X.Y/version.yaml file after upgrade to X.Y version if it is successful.
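
A rough sketch of why the leftover file matters, assuming the script prefers a saved version file over the current version when one exists. The helper names are hypothetical and the file content is simplified to a bare version string rather than the real version.yaml structure.

```python
import os


def saved_version_path(working_root, target_version):
    """Path of the saved version file for an upgrade to `target_version`."""
    return os.path.join(working_root, target_version, "version.yaml")


def resolve_from_version(working_root, target_version, current_version):
    """Return the version the upgrade believes it is upgrading from.

    Assumption: if a saved file exists from an earlier run, it wins over
    the current version; otherwise the current version is saved and used.
    """
    path = saved_version_path(working_root, target_version)
    if os.path.isfile(path):
        with open(path) as version_file:
            return version_file.read().strip()
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as version_file:
        version_file.write(current_version + "\n")
    return current_version
```

With this logic, an interrupted 5.0 to 5.1 run leaves "5.0" in the 5.1 working directory, and a later run from 5.0.1 to 5.1 still resolves to "5.0" until that file is removed.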

Artem Panchenko (apanchenko-8) on 2014-07-29

tags:

added: upgrade

Ihor Kalnytskyi (ikalnytskyi) on 2014-07-29

Changed in fuel:
importance:	Undecided → High

Evgeniy L (rustyrobot) wrote on 2014-07-29:

Removed from 5.0 because it's not critical.

no longer affects:

fuel/5.0.x

Ihor Kalnytskyi (ikalnytskyi) on 2014-07-29

Changed in fuel:
status:	New → Confirmed

OpenStack Infra (hudson-openstack) wrote on 2014-07-31: Related fix proposed to fuel-web (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/110900

Evgeniy L (rustyrobot) on 2014-07-31

Changed in fuel:
assignee:	Fuel Python Team (fuel-python) → Evgeniy L (rustyrobot)
status:	Confirmed → In Progress

Evgeniy L (rustyrobot) wrote on 2014-07-31:

I've created a patch in master, but it will only help in similar cases for >5.1 upgrade tarballs. To solve the problem with the 5.0.1 tarball, we need to backport it.

OpenStack Infra (hudson-openstack) wrote on 2014-07-31: Related fix proposed to fuel-web (stable/5.0)

Related fix proposed to branch: stable/5.0
Review: https://review.openstack.org/110916

OpenStack Infra (hudson-openstack) wrote on 2014-08-01: Related fix merged to fuel-web (stable/5.0)

Reviewed: https://review.openstack.org/110916
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=dbd29fd8ad27be4f7c88ec7a4bab25a685c1e700
Submitter: Jenkins
Branch: stable/5.0

commit dbd29fd8ad27be4f7c88ec7a4bab25a685c1e700
Author: Evgeniy L <email address hidden>
Date: Thu Jul 31 14:55:06 2014 +0400

Upgrades, remove saved version file on success

    * created on_success method which upgrade
      script runs if upgrade succeed, don't
      fail upgrade in case of errors
    * remove saved version files for all
      upgrades from working directories

It solves several problems:

    1. user runs upgrade 5.0 -> 5.1 which fails
    upgrade system saves version which we upgrade
    from in file working_dir/5.1/version.yaml.
    Then user runs upgrade 5.0 -> 5.0.1 which
    successfully upgraded. Then user runs again
    upgrade 5.0.1 -> 5.1, but there is saved file
    working_dir/5.1/version.yaml which contains
    5.0 version, and upgrade system thinks that
    it's upgrading from 5.0 version, as result
    it tries to make database dump from wrong
    version of container.

    2. without this hack user can run upgrade
    second time and lose his data, this hack
    prevents this case because before upgrade
    checker will use current version instead
    of saved version to determine version which
    we run upgrade from.

Change-Id: I5e6ae6ba2ae2e60b9812e131d2a7c533f4a38ab6
Related-bug: #1349833
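
For readers who want a concrete picture of the change described in this commit message, here is a minimal sketch of an on-success step that removes saved version files from all working directories and does not fail the upgrade if cleanup raises. The function names and directory layout are assumptions made for illustration and are not taken from the fuel-web patch itself.

```python
import logging
import os

logger = logging.getLogger("upgrade_sketch")


def remove_saved_version_files(working_root):
    """Remove <working_root>/<version>/version.yaml for every version dir.

    Returns the list of removed paths.
    """
    removed = []
    for entry in sorted(os.listdir(working_root)):
        path = os.path.join(working_root, entry, "version.yaml")
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed


def run_on_success(hooks):
    """Run each hook after a successful upgrade.

    Errors are logged and counted but never re-raised, so a cleanup
    problem cannot turn a successful upgrade into a failed one.
    """
    failures = 0
    for hook in hooks:
        try:
            hook()
        except Exception:
            logger.exception("on_success hook failed, ignoring")
            failures += 1
    return failures
```

A caller would invoke something like run_on_success([lambda: remove_saved_version_files(working_root)]) only after the upgrade itself has finished without errors.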

OpenStack Infra (hudson-openstack) wrote on 2014-08-01: Related fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/110900
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=001ffbd1562b96cb6bf82ddbd461449901489200
Submitter: Jenkins
Branch: master

commit 001ffbd1562b96cb6bf82ddbd461449901489200
Author: Evgeniy L <email address hidden>
Date: Thu Jul 31 14:55:06 2014 +0400

Upgrades, remove saved version file on success

It solves several problems:

Change-Id: I5e6ae6ba2ae2e60b9812e131d2a7c533f4a38ab6
Related-bug: #1349833

Evgeniy L (rustyrobot) on 2014-08-01

Changed in fuel:
status:	In Progress → Fix Committed

Andrey Sledzinskiy (asledzinskiy) wrote on 2014-09-23:

Upgrade failed after the following steps:

1. Run upgrade from 5.0 to 5.1 and make it fail by adding an exception to the /engines/openstack.py file (fuel-5.1-upgrade-11-2014-09-17_21-40-34.tar.lrz).
2. After a successful rollback, run upgrade from 5.0 to 5.0.1.

Expected - upgrade is successful
Actual - upgrade failed with:
2014-09-23 08:56:51 DEBUG 10307 (utils) Execute command "docker cp fuel-core-5.0-astute:/var/lib/astute /var/lib/fuel_upgrade/5.0.1"
2014-09-23 08:56:51 DEBUG 10307 (utils) Stdout and stderr of command "docker cp fuel-core-5.0-astute:/var/lib/astute /var/lib/fuel_upgrade/5.0.1":
2014-09-23 08:56:51 DEBUG 10307 (utils) 2014/09/23 08:56:51 Error: Could not find the file /var/lib/astute in container fuel-core-5.0-astute
2014-09-23 08:56:51 INFO 10307 (supervisor_client) Stop all services
2014-09-23 08:56:51 ERROR 10307 (upgrade) DockerUpgrader: failed to upgrade: "<Fault 6: 'SHUTDOWN_STATE'>"
Traceback (most recent call last):
  File "/var/upgrade/site-packages/fuel_upgrade/upgrade.py", line 56, in run
    upgrader.upgrade()
  File "/var/upgrade/site-packages/fuel_upgrade/engines/docker_engine.py", line 76, in upgrade
    self.supervisor.stop_all_services()
  File "/var/upgrade/site-packages/fuel_upgrade/supervisor_client.py", line 125, in stop_all_services
    self.supervisor.stopAllProcesses()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
    return u.close()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
    raise Fault(**self._stack[0])
Fault: <Fault 6: 'SHUTDOWN_STATE'>
2014-09-23 08:56:51 DEBUG 10307 (upgrade) Run rollback