hundreds of snapshots created by docker zfs lead to zsys timeout

Bug #1879473 reported by Jean-Baptiste Lallement on 2020-05-19
This bug affects 2 people
Affects              Status         Importance   Assigned to    Milestone
docker.io (Ubuntu)   Fix Released   High         Didier Roche
Focal                Fix Released   High         Didier Roche

Bug Description

[Rationale]
The ZFS module of docker creates many snapshots under <pool>/ROOT/machineid/var/lib/<ID>. zsys processes these snapshots, which leads to a timeout.

This patch migrates the snapshots created by docker under the persistent dataset <pool>/var/lib/docker/ so they are scanned and then ignored by zsys.
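To get a rough idea of how many entries sit in each location, the two layouts named above can be counted from a dataset listing. This is a minimal sketch, not part of the patch: the function names are hypothetical, the input is assumed to be one name per line as printed by `zfs list -H -o name -t all`, and the pattern for <ID> (64 hexadecimal characters) is an assumption that the report does not state.

```shell
#!/bin/sh
# Sketch only. Both functions read dataset and snapshot names on stdin,
# one per line, for example from: zfs list -H -o name -t all
# Assumption (not from the report): <ID> is a 64-character hex string.

# Count entries under <pool>/ROOT/<machineid>/var/lib/<ID>
count_system_entries() {
    grep -cE '^[^/]+/ROOT/[^/]+/var/lib/[0-9a-f]{64}' || true
}

# Count entries under the persistent dataset <pool>/var/lib/docker/
count_persistent_entries() {
    grep -cE '^[^/]+/var/lib/docker/' || true
}
```

Possible usage: `zfs list -H -o name -t all | count_system_entries`. The `|| true` only keeps the exit status at zero when grep finds no match; the count is still printed.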

[Test Case]
1. On a machine with docker installed, create a docker container:
$ docker run ubuntu
2. Verify that docker created snapshots under <pool>/ROOT/machineid/var/lib/<ID> with zfs list
3. Install the new version
4. Verify that the snapshots have been migrated to <pool>/var/lib/docker/
5. Verify that docker is still working:
$ docker run ubuntu
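Steps 2 and 4 can be checked from the same listing. The sketch below is only an illustration under stated assumptions: `migration_state` is a hypothetical helper, the input is assumed to be the output of `zfs list -H -o name -t all`, and the 64-character hexadecimal pattern for <ID> is an assumption that does not come from this report.

```shell
#!/bin/sh
# Sketch only. Reads dataset and snapshot names on stdin (one per line)
# and prints one of: none, unmigrated, partial, migrated.
# Assumption (not from the report): <ID> is a 64-character hex string.
migration_state() {
    names=$(cat)
    # Entries still under <pool>/ROOT/<machineid>/var/lib/<ID>
    old=$(printf '%s\n' "$names" |
        grep -cE '^[^/]+/ROOT/[^/]+/var/lib/[0-9a-f]{64}' || true)
    # Entries under the persistent dataset <pool>/var/lib/docker/
    new=$(printf '%s\n' "$names" |
        grep -cE '^[^/]+/var/lib/docker/' || true)

    if [ "$old" -gt 0 ] && [ "$new" -gt 0 ]; then
        echo partial
    elif [ "$old" -gt 0 ]; then
        echo unmigrated
    elif [ "$new" -gt 0 ]; then
        echo migrated
    else
        echo none
    fi
}
```

Possible usage: `zfs list -H -o name -t all | migration_state`. Under these assumptions, step 2 would be expected to print `unmigrated` and step 4 `migrated`.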

[Regression Potential]
The change only affects the postinst.
The risk is that datasets are not migrated or are only partially migrated. New snapshots are then created in a different location, and containers created or run from an existing image won't start.

It only impacts Ubuntu Desktop installations with ZFS on root support.

Changed in docker.io (Ubuntu):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Didier Roche (didrocks)
Didier Roche (didrocks) on 2020-05-19
description: updated
Didier Roche (didrocks) on 2020-05-19
Changed in docker.io (Ubuntu Focal):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Didier Roche (didrocks)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package docker.io - 19.03.8-0ubuntu2

---------------
docker.io (19.03.8-0ubuntu2) groovy; urgency=medium

  [ Jean-Baptiste Lallement ]
  [ Didier Roche ]
  * Fix use with ZFS on root:
    - docker creates one dataset for any layer of containers that were
      created. Create now a <pool>/var/lib/docker for creating them in
      the persistent namespace and migrate existing one here.
    - purge the automated historic that was created.
    The migration only impacts the ubuntu desktop installation with
    experimental ZFS on root, and we have thus to stop and start the daemon
    to migrate data. (LP: #1879473)

 -- Didier Roche <email address hidden> Tue, 19 May 2020 11:01:22 +0200

Changed in docker.io (Ubuntu):
status: Triaged → Fix Released
Tianon Gravi (tianon) wrote :

Has this been reported upstream? The fix (in postinst) is large enough that it seems like something upstream should at least be aware of, and is probably something they should implement a fix for as well. Do you think it's something that could/should be fixed in "dockerd" itself?

Didier Roche (didrocks) wrote :

If you have the upstream bug report for the ZFS driver (not docker itself), that would be welcome. I didn’t find anything in the upstream wiki for the driver which is shipped by default in debian/ubuntu.

Didier Roche (didrocks) wrote :

So yeah, the driver should ideally do the same (create directly under rpool/), but it really depends on how people laid out their system.

I think in general upstream should clear up their datasets (for stopped containers), which is more generic.

Tianon Gravi (tianon) wrote :

The "zfs" graph driver is part of Docker itself (https://github.com/moby/moby/tree/89382f2f20745b9e63bed6c066f104980dff4396/daemon/graphdriver/zfs), so https://github.com/moby/moby/issues would be the appropriate place to file an issue (or enhancement request) against it.

Andreas Hasenack (ahasenack) wrote :

The new postinst failed for me in groovy, here is the bug I filed: https://bugs.launchpad.net/ubuntu/+source/docker.io/+bug/1882942

Brian Murray (brian-murray) wrote :

It seems to me that the wiki page referenced in the SRU should mention the version of the package which is being SRU'ed, not the version of the package which is in Groovy.

https://github.com/ubuntu/zsys/wiki/Performance-issue-with-docker-on-ubuntu-20.04-LTS

Changed in docker.io (Ubuntu Focal):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-focal

Hello Jean-Baptiste, or anyone else affected,

Accepted docker.io into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/docker.io/19.03.8-0ubuntu1.20.04 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

SRU verification for Focal:
I have reproduced the problem with docker.io 19.03.8-0ubuntu1 in focal and have verified that the version of docker.io 19.03.8-0ubuntu1.20.04 in -proposed fixes the issue.

Marking as verification-done

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal

The verification of the Stable Release Update for docker.io has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package docker.io - 19.03.8-0ubuntu1.20.04

---------------
docker.io (19.03.8-0ubuntu1.20.04) focal; urgency=medium

  [ Jean-Baptiste Lallement ]
  [ Didier Roche ]
  * Fix use with ZFS on root:
    - docker creates one dataset for any layer of containers that were
      created. Create now a <pool>/var/lib/docker for creating them in
      the persistent namespace.
    - don’t migrate existing ones but display a message to the end user on
      ZSys wiki with instructions.
    The migration only impacts the ubuntu desktop installation with
    experimental ZFS on root. (LP: #1879473)

 -- Didier Roche <email address hidden> Thu, 18 Jun 2020 10:26:54 +0200

Changed in docker.io (Ubuntu Focal):
status: Fix Committed → Fix Released