Docker service can't be started after upgrading the Fuel master node

Bug #1704367 reported by Ilya Bumarskov
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Invalid
Importance: Critical
Assigned to: MOS Maintenance
Milestone: 8.0-updates

Bug Description

Steps to reproduce:

1. Install the latest Fuel 8.0 with the maintenance update (MU).
2. Deploy any cluster.
3. Try to upgrade the Fuel master node to the proposed MU by following the instructions at: https://docs.mirantis.com/openstack/fuel/fuel-8.0/maintenance-updates.html#how-to-apply-mirantis-openstack-8-0-maintenance-update

Observed behaviour:
After the master node is restarted, the docker service fails to start:

[root@nailgun ~]# sudo systemctl status docker.service -l
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2017-07-11 11:06:26 UTC; 8min ago
     Docs: http://docs.docker.com
  Process: 2216 ExecStart=/usr/bin/docker-current daemon --exec-opt native.cgroupdriver=systemd $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY (code=exited, status=1/FAILURE)
 Main PID: 2216 (code=exited, status=1/FAILURE)

Jul 11 11:06:26 nailgun.test.domain.local systemd[2216]: Executing: /usr/bin/docker-current daemon --exec-opt native.cgroupdriver=systemd --selinux-enabled --log-driver=journald --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/docker-docker--pool --iptables=false
Jul 11 11:06:26 nailgun.test.domain.local docker-current[2216]: time="2017-07-11T11:06:26.765868727Z" level=fatal msg="Error starting daemon: error initializing graphdriver: devmapper: Base Device UUID and Filesystem verification failed.devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed"
Jul 11 11:06:26 nailgun.test.domain.local systemd[1]: Child 2216 belongs to docker.service
Jul 11 11:06:26 nailgun.test.domain.local systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Jul 11 11:06:26 nailgun.test.domain.local systemd[1]: docker.service changed start -> failed
Jul 11 11:06:26 nailgun.test.domain.local systemd[1]: Job docker.service/start finished, result=failed
Jul 11 11:06:26 nailgun.test.domain.local systemd[1]: Failed to start Docker Application Container Engine.
Jul 11 11:06:26 nailgun.test.domain.local systemd[1]: Unit docker.service entered failed state.
Jul 11 11:06:26 nailgun.test.domain.local systemd[1]: docker.service failed.
Jul 11 11:06:26 nailgun.test.domain.local systemd[1]: docker.service: cgroup is empty
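
For triage, the one line that matters in the output above is the `level=fatal` entry from docker-current. A minimal sketch, assuming the journal lines are available as plain strings, of pulling that message out and flagging it as a devicemapper graphdriver failure. The helper names `fatal_messages` and `is_devicemapper_failure` are hypothetical and the substring check is only a heuristic:

```python
import re

# Matches the level=fatal msg="..." fields that docker writes to the journal.
FATAL_RE = re.compile(r'level=fatal msg="(?P<msg>(?:[^"\\]|\\.)*)"')


def fatal_messages(journal_lines):
    """Return the msg text of every level=fatal entry in the given lines."""
    found = []
    for line in journal_lines:
        match = FATAL_RE.search(line)
        if match:
            found.append(match.group("msg"))
    return found


def is_devicemapper_failure(message):
    """Heuristic (an assumption): graphdriver init failed and devmapper is named."""
    return "graphdriver" in message and "devmapper" in message


# Two lines copied from the log in this report.
sample = [
    'Jul 11 11:06:26 nailgun.test.domain.local docker-current[2216]: '
    'time="2017-07-11T11:06:26.765868727Z" level=fatal msg="Error starting '
    'daemon: error initializing graphdriver: devmapper: Base Device UUID and '
    'Filesystem verification failed.devicemapper: Error running deviceCreate '
    '(ActivateDevice) dm_task_run failed"',
    'Jul 11 11:06:26 nailgun.test.domain.local systemd[1]: docker.service failed.',
]

messages = fatal_messages(sample)
for msg in messages:
    print(is_devicemapper_failure(msg), msg)
```

The systemd lines carry no `level=` field, so only the daemon's own error is returned.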

Changed in fuel:
importance: Undecided → Critical
assignee: nobody → MOS Maintenance (mos-maintenance)
milestone: none → 8.0-updates
Changed in fuel:
assignee: MOS Maintenance (mos-maintenance) → Alexey Stupnikov (astupnikov)
Changed in fuel:
status: New → Confirmed
Alexey Stupnikov (astupnikov) wrote :

I was able to reproduce this issue by simply restarting the docker
containers and the docker service, and rebooting the master node twice.

From my point of view, this issue is caused by a combination of
a suboptimal size of docker's PV and a docker bug. I have tested the
workaround from bug #1604941 and it worked.

https://bugs.launchpad.net/fuel/+bug/1604941/comments/8

Assigning back to the mos-maintenance team, as there is no reliable
solution to backport for this bug.
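
Since the comment above attributes the failure partly to the size of docker's PV, one way to check how full the thin pool is would be to read the output of `dmsetup status` for the pool named in the log (`docker-docker--pool`). A minimal sketch of parsing such a line, assuming the thin-pool status layout described in the kernel's device-mapper documentation (transaction id, then used/total metadata blocks, then used/total data blocks). The function name `thin_pool_usage` and the sample line are hypothetical and not taken from this report, and a high value would only hint at the cause rather than confirm it:

```python
def thin_pool_usage(status_line):
    """Parse one `dmsetup status` line for a thin-pool target.

    Returns (metadata_fraction, data_fraction), each between 0 and 1.
    Raises ValueError if the line does not describe a thin-pool target.
    """
    fields = status_line.split()
    idx = fields.index("thin-pool")
    # fields[idx + 1] is the transaction id; the two ratios follow it.
    used_meta, total_meta = (int(x) for x in fields[idx + 2].split("/"))
    used_data, total_data = (int(x) for x in fields[idx + 3].split("/"))
    return used_meta / total_meta, used_data / total_data


# Hypothetical sample line, NOT captured from the affected node.
sample = (
    "docker-docker--pool: 0 20971520 thin-pool 1 250/1000 9000/10000 "
    "- rw discard_passdown queue_if_no_space -"
)

meta_frac, data_frac = thin_pool_usage(sample)
print(f"metadata used: {meta_frac:.0%}, data used: {data_frac:.0%}")
```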

Changed in fuel:
assignee: Alexey Stupnikov (astupnikov) → MOS Maintenance (mos-maintenance)
Denis Meltsaykin (dmeltsaykin) wrote :

Setting this to Invalid since it was not reproduced on the latest SWARM run.

Changed in fuel:
status: Confirmed → Invalid