Performance degradation of "zfs clone" when under load

Bug #1567557 reported by Stéphane Graber on 2016-04-07
This bug affects 1 person
Affects: zfs-linux (Ubuntu)
Importance: Medium
Assigned to: Colin Ian King

Bug Description

I've been running some scale tests for LXD and noticed that "zfs clone" gets progressively slower as the zfs filesystem gets busier.

It feels like "zfs clone" takes some kind of pool-wide lock and so has to wait for all in-flight operations to complete before it can clone a new filesystem.
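One way to probe this suspicion is to time individual "zfs clone" calls while the pool is busy and see whether per-clone latency grows. The following is only a minimal sketch: the dataset names in the example invocation are hypothetical, the helper names are made up for illustration, and it must run as a user allowed to create datasets. Rising latencies would be consistent with the lock theory but would not prove it.

```python
import subprocess
import sys
import time


def clone_argv(snapshot, target):
    """Build the argument list for `zfs clone <snapshot> <target>`."""
    return ["zfs", "clone", snapshot, target]


def time_command(argv):
    """Run argv to completion and return the wall-clock seconds it took."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def time_clones(snapshot, target_prefix, count):
    """Clone `snapshot` `count` times and return one duration per clone."""
    return [
        time_command(clone_argv(snapshot, f"{target_prefix}-{i}"))
        for i in range(count)
    ]


# Hypothetical invocation (names are assumptions, not from the report):
#   python3 time_clones.py tank/image@base tank/test-clone 16
if __name__ == "__main__" and len(sys.argv) == 4:
    durations = time_clones(sys.argv[1], sys.argv[2], int(sys.argv[3]))
    for i, seconds in enumerate(durations):
        print(f"clone {i}: {seconds:.3f}s")
```

Running it once on an idle pool and once while containers are being spawned would give two sets of latencies to compare.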

A basic LXD scale test with btrfs vs zfs shows what I mean, see below for the reports.

The test runs on a completely dedicated physical server with the pool on a dedicated SSD. The exact same machine and SSD were used for the btrfs test.

The zfs filesystem is configured with these settings:
 - relatime=on
 - sync=disabled
 - xattr=sa

So it shouldn't be related to pending sync() calls...

The workload in this case is ultimately 1024 containers, each running busybox as its init system with udhcpc grabbing an IP.
The problem gets significantly worse when spawning busier containers, say a full Ubuntu system.

=== zfs ===
root@edfu:~# /home/ubuntu/lxd-benchmark spawn --count=1024 --image=images:alpine/edge/amd64 --privileged=true
Test environment:
  Server backend: lxd
  Server version: 2.0.0.rc8
  Kernel: Linux
  Kernel architecture: x86_64
  Kernel version: 4.4.0-16-generic
  Storage backend: zfs
  Storage version: 5
  Container backend: lxc
  Container version: 2.0.0.rc15

Test variables:
  Container count: 1024
  Container mode: privileged
  Image: images:alpine/edge/amd64
  Batches: 128
  Batch size: 8
  Remainder: 0

[Apr 3 06:42:51.170] Importing image into local store: 64192037277800298d8c19473c055868e0288b039349b1c6579971fe99fdbac7
[Apr 3 06:42:52.657] Starting the test
[Apr 3 06:42:53.994] Started 8 containers in 1.336s
[Apr 3 06:42:55.521] Started 16 containers in 2.864s
[Apr 3 06:42:58.632] Started 32 containers in 5.975s
[Apr 3 06:43:05.399] Started 64 containers in 12.742s
[Apr 3 06:43:20.343] Started 128 containers in 27.686s
[Apr 3 06:43:57.269] Started 256 containers in 64.612s
[Apr 3 06:46:09.112] Started 512 containers in 196.455s
[Apr 3 06:58:19.309] Started 1024 containers in 926.652s
[Apr 3 06:58:19.309] Test completed in 926.652s

=== btrfs ===
Test environment:
  Server backend: lxd
  Server version: 2.0.0.rc8
  Kernel: Linux
  Kernel architecture: x86_64
  Kernel version: 4.4.0-16-generic
  Storage backend: btrfs
  Storage version: 4.4
  Container backend: lxc
  Container version: 2.0.0.rc15

Test variables:
  Container count: 1024
  Container mode: privileged
  Image: images:alpine/edge/amd64
  Batches: 128
  Batch size: 8
  Remainder: 0

[Apr 3 07:42:12.053] Importing image into local store: 64192037277800298d8c19473c055868e0288b039349b1c6579971fe99fdbac7
[Apr 3 07:42:13.351] Starting the test
[Apr 3 07:42:14.793] Started 8 containers in 1.442s
[Apr 3 07:42:16.495] Started 16 containers in 3.144s
[Apr 3 07:42:19.881] Started 32 containers in 6.530s
[Apr 3 07:42:26.798] Started 64 containers in 13.447s
[Apr 3 07:42:42.048] Started 128 containers in 28.697s
[Apr 3 07:43:13.210] Started 256 containers in 59.859s
[Apr 3 07:44:26.238] Started 512 containers in 132.887s
[Apr 3 07:47:30.708] Started 1024 containers in 317.357s
[Apr 3 07:47:30.708] Test completed in 317.357s
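To make the divergence between the two runs easier to see, the cumulative timings above can be turned into an average cost per container for each interval between checkpoints. This is a rough sketch that only re-reads the "Started N containers" lines from the two reports; the function names are made up for illustration.

```python
import re

ZFS_LOG = """
[Apr 3 06:42:53.994] Started 8 containers in 1.336s
[Apr 3 06:42:55.521] Started 16 containers in 2.864s
[Apr 3 06:42:58.632] Started 32 containers in 5.975s
[Apr 3 06:43:05.399] Started 64 containers in 12.742s
[Apr 3 06:43:20.343] Started 128 containers in 27.686s
[Apr 3 06:43:57.269] Started 256 containers in 64.612s
[Apr 3 06:46:09.112] Started 512 containers in 196.455s
[Apr 3 06:58:19.309] Started 1024 containers in 926.652s
"""

BTRFS_LOG = """
[Apr 3 07:42:14.793] Started 8 containers in 1.442s
[Apr 3 07:42:16.495] Started 16 containers in 3.144s
[Apr 3 07:42:19.881] Started 32 containers in 6.530s
[Apr 3 07:42:26.798] Started 64 containers in 13.447s
[Apr 3 07:42:42.048] Started 128 containers in 28.697s
[Apr 3 07:43:13.210] Started 256 containers in 59.859s
[Apr 3 07:44:26.238] Started 512 containers in 132.887s
[Apr 3 07:47:30.708] Started 1024 containers in 317.357s
"""

STARTED = re.compile(r"Started (\d+) containers in ([\d.]+)s")


def parse(log):
    """Return (container_count, cumulative_seconds) pairs from a report."""
    return [(int(n), float(t)) for n, t in STARTED.findall(log)]


def marginal_cost(points):
    """Average seconds per container within each checkpoint interval."""
    costs = []
    prev_n, prev_t = 0, 0.0
    for n, t in points:
        costs.append((n, (t - prev_t) / (n - prev_n)))
        prev_n, prev_t = n, t
    return costs


if __name__ == "__main__":
    zfs = marginal_cost(parse(ZFS_LOG))
    btrfs = marginal_cost(parse(BTRFS_LOG))
    print("up to N   zfs s/container   btrfs s/container")
    for (n, z), (_, b) in zip(zfs, btrfs):
        print(f"{n:7d}   {z:15.3f}   {b:17.3f}")
```

Both backends track each other closely up to 128 containers. The gap opens after that, and in the final interval (containers 513 to 1024) zfs averages roughly four times the per-container cost of btrfs.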

Richard Laager (rlaager) wrote :

Since you've already filed one bug report upstream, would you be interested in filing this one upstream? I can certainly copy-and-paste it upstream, but it seems like it'd be better to have it come from you. I don't know anything about LXD. (I'm just trying to help out with ZoL bug reports.)

Changed in zfs-linux (Ubuntu):
importance: Undecided → Medium
assignee: nobody → Colin Ian King (colin-king)
status: New → In Progress