Docker dies during the upgrade

Bug #1359725 reported by Egor Kotko
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Committed
Importance: Medium
Assigned to: Ihor Kalnytskyi

Bug Description

VERSION:
  mirantis: "yes"
  production: "docker"
  release: "5.0.1"
  api: "1.0"
  build_number: "170"
  build_id: "2014-08-14_19-52-36"
  astute_sha: "6db5f5031b74e67b92fcac1f7998eaa296d68025"
  fuellib_sha: "a31dbac8fff9cf6bc4cd0d23459670e34b27a9ab"
  ostf_sha: "09b6bccf7d476771ac859bb3c76c9ebec9da9e1f"
  nailgun_sha: "af3d1922bfc21345f81be3454115ab6139675c35"
  fuelmain_sha: "fd58828f404e4298ed338e8f44c6a326cebd31de"

Steps to reproduce:
1. Create an environment via the Fuel CLI:
Ubuntu, HA, Neutron VLAN, 3 Controllers, 2 Computes, 2 Ceph nodes, Ceph for volumes and images
2. Upgrade the master node with HTTP_LINK=http://mc0n1-msk.msk.mirantis.net/fuelweb-iso/fuel-master-upgrade-464-2014-08-21_02-01-17.tar
# ./upgrade.sh

Expected result:
The cluster version is 5.1 after the upgrade.

Actual result:
http://paste.openstack.org/show/98162/

Revision history for this message
Egor Kotko (ykotko) wrote :
Kamil Sambor (ksambor)
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Kamil Sambor (ksambor)
assignee: Kamil Sambor (ksambor) → Fuel Python Team (fuel-python)
Revision history for this message
Evgeniy L (rustyrobot) wrote :

Yeah, I saw this on a customer's environment. It looks like Docker simply dies during the upgrade. We need a workaround in the upgrade script and instructions for users on how to fix it manually.

Changed in fuel:
status: New → Confirmed
Revision history for this message
Ihor Kalnytskyi (ikalnytskyi) wrote :

It seems like the workaround is:

    rm /var/run/docker.pid
    /etc/init.d/docker start
    dockerctl start all

This needs to be retested.

Evgeniy L (rustyrobot)
summary: - Fuel upgrade from 5.0.1->5.1 on env deployed via cli finished with error
+ Docker dies during the upgrade
Revision history for this message
Evgeniy L (rustyrobot) wrote :

Yeah, Igor is right, but I'm afraid it's not enough.
We need to run `/etc/init.d/docker stop`, then unmount all Docker-related devicemapper mount points [1], and only after that run `/etc/init.d/docker start`.

The solution should look like this:
1. Create a separate Docker client.
2. In this client, add a special wrapper around each HTTP call that, when a UnixHTTPConnectionPool exception occurs:
2.1 tries to start Docker
2.2 runs the action again

[1] https://github.com/docker/docker/issues/6675
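
The wrapper idea in steps 2, 2.1 and 2.2 could be sketched roughly as below. This is only an illustration, not the code that went into Fuel: `DaemonUnavailable`, `restart_on_failure` and the `restart_daemon` callback are hypothetical names, and the real client would catch the `ConnectionError` from the traceback rather than a stand-in exception.

```python
import functools


class DaemonUnavailable(Exception):
    """Hypothetical stand-in for the connection error seen when Docker is down."""


def restart_on_failure(restart_daemon, retries=1, errors=(DaemonUnavailable,)):
    """Wrap a call so that a connection error restarts the daemon and retries.

    restart_daemon is an assumed callback that would stop Docker, unmount
    its devicemapper mount points, and start Docker again.
    """
    def decorator(call):
        @functools.wraps(call)
        def wrapper(*args, **kwargs):
            for attempt in range(retries + 1):
                try:
                    return call(*args, **kwargs)
                except errors:
                    if attempt == retries:
                        raise  # no retries left, surface the original error
                    restart_daemon()  # step 2.1: try to start Docker
                    # the loop then repeats the call (step 2.2)
        return wrapper
    return decorator
```

Limiting the number of retries keeps the upgrade from looping forever if Docker cannot be brought back; the error is re-raised so the existing failure handling still runs.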

Changed in fuel:
status: Confirmed → Triaged
Evgeniy L (rustyrobot)
tags: added: release-notes
Revision history for this message
Evgeniy L (rustyrobot) wrote :

# Instructions for repairing the environment after Docker dies

If the upgrade failed with the following error:

ConnectionError: UnixHTTPConnectionPool(host='localhost', port=None): Max retries exceeded with url: /run/docker.sock/v1.10/containers/bd7cf47c7ca428849f52af8696929732a9fccbe848a39f78e24ef7b8bd4915f2/stop?t=10 (Caused by <class 'httplib.BadStatusLine'>: )

Try running `docker ps`:

[root@fuel ~]# docker ps
2014/08/19 14:28:15 Cannot connect to the Docker daemon. Is 'docker -d' running on this host?

If the output shows that Docker is not running, unmount the Docker volumes and then start Docker again:

[root@fuel ~]# umount -l $(cat /proc/mounts | grep '/dev/mapper/docker-' | awk '{ print $2}')
[root@fuel ~]# rm /var/run/docker.pid
[root@fuel ~]# service docker start

Then run the upgrade again from the directory containing the unpacked upgrade tarball:

[root@fuel ~]# ./upgrade.sh
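
For anyone scripting the unmount step rather than typing it, the mount-point selection done by the `grep`/`awk` pipeline above can be sketched in Python. This is a rough equivalent, not part of the upgrade script: `docker_mount_points` is a hypothetical helper, and unlike the `grep` it matches the prefix only against the device field (the first column), not anywhere in the line.

```python
def docker_mount_points(mounts_text, prefix="/dev/mapper/docker-"):
    """Return the mount points of devices whose name starts with prefix.

    mounts_text is the content of /proc/mounts; each line has the form
    "<device> <mount point> <fs type> <options> <dump> <pass>".
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # The second field is the mount point, as in awk '{ print $2 }'.
        if len(fields) >= 2 and fields[0].startswith(prefix):
            points.append(fields[1])
    return points
```

On the master node the input would come from reading `/proc/mounts`, and each returned path would then be passed to `umount -l` as in the command above.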

Revision history for this message
Evgeniy L (rustyrobot) wrote :

Reduced the priority because it's really hard to reproduce.

Changed in fuel:
importance: High → Medium
status: Triaged → Won't Fix
Revision history for this message
Evgeniy L (rustyrobot) wrote :

Set to Won't Fix for 5.1 because it has medium priority (it is hard to reproduce); targeted to 6.0.

Revision history for this message
Evgeniy L (rustyrobot) wrote :

For 5.1, we added documentation that gives users the steps to restore their environment after Docker dies:

https://github.com/stackforge/fuel-web/commit/baaeee59026cdad605532dd4e6db04cb6bad306a

Dmitry Pyzhov (dpyzhov)
no longer affects: fuel/6.0.x
Changed in fuel:
status: Won't Fix → Triaged
importance: Medium → High
milestone: 5.1 → 6.0
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 6.0 → 6.1
importance: High → Medium
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Igor Kalnitsky (ikalnitsky)
Revision history for this message
Ihor Kalnytskyi (ikalnytskyi) wrote :
Revision history for this message
Evgeniy L (rustyrobot) wrote :

I think we can move it to Won't Fix, because the blueprint has "implemented" status.

Changed in fuel:
status: Triaged → Won't Fix
Revision history for this message
Evgeniy L (rustyrobot) wrote :

Sorry, "Fix Committed" is a better status.

Changed in fuel:
status: Won't Fix → Fix Committed