hundreds of snapshots created by docker zfs lead to zsys timeout

Bug #1879473 reported by Jean-Baptiste Lallement
This bug affects 2 people
Affects              Status         Importance   Assigned to              Milestone
docker.io (Ubuntu)   Fix Released   High         Didier Roche-Tolomelli
Focal                Fix Released   High         Didier Roche-Tolomelli

Bug Description

[Rationale]
The ZFS driver of docker creates many snapshots under <pool>/ROOT/machineid/var/lib/<ID>. zsys processes these snapshots, which causes it to time out.

This patch migrates the snapshots created by docker to the persistent dataset <pool>/var/lib/docker/, where they are scanned and then ignored by zsys.

[Test Case]
1. On a machine with docker installed, create a docker container:
$ docker run ubuntu
2. Verify that docker created snapshots under <pool>/ROOT/machineid/var/lib/<ID> with zfs list
3. Install the new version
4. Verify that the snapshots have been migrated to <pool>/var/lib/docker/
5. Verify that docker is still working:
$ docker run ubuntu
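
Steps 2 and 4 can be checked by counting the entries that zfs list reports under a given location. The following is only a minimal sketch: the helper name count_datasets and the pool name rpool are assumptions for illustration, not part of the package or of zsys.

```shell
# count_datasets PREFIX (hypothetical helper): read dataset or snapshot names
# on stdin, one per line, and print how many are PREFIX itself, one of its
# snapshots, or anything located below it.
count_datasets() {
    prefix=$1
    count=0
    while IFS= read -r name; do
        case "$name" in
            "$prefix"|"$prefix"/*|"$prefix"@*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}

# Example use on a real system (pool name "rpool" is an assumption):
#   zfs list -H -o name -t all | count_datasets "rpool/var/lib/docker"
```

Before the upgrade, the count under <pool>/var/lib/docker would be expected to be zero; after the migration it should be non-zero, and the same check against the old location should stop growing when new containers are run.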

[Regression Potential]
The change only affects the postinst.
The risk is that datasets are not migrated, or are only partially migrated. New snapshots would then be created in a different location, and containers created or run from an existing image would not start.

It only impacts Ubuntu Desktop installations with ZFS on root support.

Changed in docker.io (Ubuntu):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Didier Roche (didrocks)
description: updated
Changed in docker.io (Ubuntu Focal):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Didier Roche (didrocks)
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package docker.io - 19.03.8-0ubuntu2

---------------
docker.io (19.03.8-0ubuntu2) groovy; urgency=medium

  [ Jean-Baptiste Lallement ]
  [ Didier Roche ]
  * Fix use with ZFS on root:
    - docker creates one dataset for any layer of containers that were
      created. Create now a <pool>/var/lib/docker for creating them in
      the persistent namespace and migrate existing one here.
    - purge the automated historic that was created.
    The migration only impacts the ubuntu desktop installation with
    experimental ZFS on root, and we have thus to stop and start the daemon
    to migrate data. (LP: #1879473)

 -- Didier Roche <email address hidden> Tue, 19 May 2020 11:01:22 +0200

Changed in docker.io (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
Tianon Gravi (tianon) wrote :

Has this been reported upstream? The fix (in postinst) is large enough that it seems like something upstream should at least be aware of, and is probably something they should implement a fix for as well. Do you think it's something that could/should be fixed in "dockerd" itself?

Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote :

If you have the upstream bug report for the ZFS driver (not docker itself), that would be welcome. I didn’t find anything in the upstream wiki for the driver which is shipped by default in debian/ubuntu.

Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote :

So yeah, the driver should ideally do the same (create directly under rpool/), but it really depends on how people laid out their system.

I think in general upstream should clean up their datasets (for stopped containers), which is more generic.

Revision history for this message
Tianon Gravi (tianon) wrote :

The "zfs" graph driver is part of Docker itself (https://github.com/moby/moby/tree/89382f2f20745b9e63bed6c066f104980dff4396/daemon/graphdriver/zfs), so https://github.com/moby/moby/issues would be the appropriate place to file an issue (or enhancement request) against it.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

The new postinst failed for me in groovy, here is the bug I filed: https://bugs.launchpad.net/ubuntu/+source/docker.io/+bug/1882942

Revision history for this message
Brian Murray (brian-murray) wrote :

It seems to me that the wiki page referenced in the SRU should mention the version of the package that is being SRU'ed, not the version of the package that is in Groovy.

https://github.com/ubuntu/zsys/wiki/Performance-issue-with-docker-on-ubuntu-20.04-LTS

Changed in docker.io (Ubuntu Focal):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-focal
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Jean-Baptiste, or anyone else affected,

Accepted docker.io into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/docker.io/19.03.8-0ubuntu1.20.04 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :

SRU verification for Focal:
I have reproduced the problem with docker.io 19.03.8-0ubuntu1 in focal and have verified that the version of docker.io 19.03.8-0ubuntu1.20.04 in -proposed fixes the issue.

Marking as verification-done

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for docker.io has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package docker.io - 19.03.8-0ubuntu1.20.04

---------------
docker.io (19.03.8-0ubuntu1.20.04) focal; urgency=medium

  [ Jean-Baptiste Lallement ]
  [ Didier Roche ]
  * Fix use with ZFS on root:
    - docker creates one dataset for any layer of containers that were
      created. Create now a <pool>/var/lib/docker for creating them in
      the persistent namespace.
    - don’t migrate existing ones but display a message to the end user on
      ZSys wiki with instructions.
    The migration only impacts the ubuntu desktop installation with
    experimental ZFS on root. (LP: #1879473)

 -- Didier Roche <email address hidden> Thu, 18 Jun 2020 10:26:54 +0200

Changed in docker.io (Ubuntu Focal):
status: Fix Committed → Fix Released