s390x load + decompress time of kernel image with lz4 is 2x slower than lzo

Bug #1841193 reported by Dimitri John Ledkov
Affects                  Status        Importance  Assigned to  Milestone
Ubuntu on IBM z Systems  Fix Released  Medium      bugproxy
linux (Ubuntu)           Fix Released  Undecided   Unassigned

Bug Description

The Canonical Kernel Performance Team has been benchmarking the load and decompression times of kernel images in order to select the fastest compression.

On x86_64 lz4 came out as the best one.

On our s390x, the load + decompress time of the kernel image with lz4 appeared to be 2x slower than with lzo. This was tested on a z13 mainframe, with the kernel compiled targeting -march=zEC12. That is a bit surprising. Are there any performance improvements that can be made to the kernel lz4?

You can see assessment details over at the public bug https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1840934

I wonder if IBM teams can try different kernel compression algorithms to double-check that the load + decompress time of an lzo kernel image is indeed the fastest, and faster than lz4 or gzip.

Also, would that change with z14 and the hw accelerated decompression there? Is that implemented for the kernel decompressors?

We will be switching kernel image compression to lzo with the v5.3 kernel on s390x.

affects: ubuntu → linux (Ubuntu)
bugproxy (bugproxy)
tags: added: architecture-s39064 bugnameltc-181056 severity-high targetmilestone-inin1804
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
assignee: nobody → bugproxy (bugproxy)
importance: Undecided → Medium
status: New → Triaged
tags: added: reverse-proxy-bugzilla
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

FWIW, I verified this on z14, and there clearly lz4 is (as expected) the fastest decompression algorithm.

With vanilla 5.3-rc6 and defconfig I get the following kernel uncompression times:
lzo: 27us
lz4: 24us

An initrd (uncompressed size ~55MB) gets these uncompression times:
lzo: 62us
lz4: 49us

So I'd clearly vote to switch to lz4 on s390 as well.

And no: there is no support for using the zEDC card when uncompressing kernel image and/or initrd.

bugproxy (bugproxy) wrote :

I see similar behaviour on z13: lz4 is the fastest decompression algorithm.
No idea how you came to the conclusion that lzo would be better.
Maybe there was steal time involved?
At least from my point of view lz4 should be used, for both kernel and initrd.

Colin Ian King (colin-king) wrote :

Note that the measurements were done with the monotonic timer, using the stckf opcode to fetch a 64-bit time value before and after the decompression, on an s390x instance under KVM, so I guess this may have been giving me inaccurate timings. The code was instrumented in the early boot phase of the kernel, so I was using the in-kernel decompression code.

How was the benchmarking of the decompression achieved for your results?

bugproxy (bugproxy) wrote :

I also instrumented the kernel code to measure only the time to decompress the kernel. Whether it's stckf or stcke doesn't matter in this case.
Note that shifting a TOD clock value 12 bits to the right gives you microseconds. (All numbers I posted were actually milliseconds, not microseconds, by the way.)
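
As a minimal sketch of that conversion (the helper names here are hypothetical and not part of the kernel): bit 51 of the 64-bit TOD clock value ticks once per microsecond, so the low 12 bits hold sub-microsecond fractions and a right shift by 12 yields microseconds.

```c
#include <stdint.h>

/* Bit 51 of the 64-bit TOD value increments every microsecond,
 * so shifting right by 12 drops the sub-microsecond fraction. */
#define TOD_SHIFT_US 12

/* Hypothetical helper: elapsed microseconds between two raw TOD readings. */
static inline uint64_t tod_delta_us(uint64_t start, uint64_t end)
{
	return (end - start) >> TOD_SHIFT_US;
}

/* Hypothetical helper: elapsed whole milliseconds (truncating). */
static inline uint64_t tod_delta_ms(uint64_t start, uint64_t end)
{
	return tod_delta_us(start, end) / 1000;
}
```

With these helpers, a raw delta of 24000 * 4096 TOD units corresponds to 24000 microseconds, i.e. 24 milliseconds.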

I measured both runs (z13 + z14) when running within z/VM and IPL'ed from the punch card reader.

Times used for decompressing the initrd were just extracted from dmesg; no kernel instrumentation required here, since there are two messages provided before and after initrd decompression.
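
For illustration only, the same subtraction can be done on two dmesg lines. This is a sketch under assumptions: the lines carry the usual "[seconds.microseconds]" prefix with exactly six fractional digits, and the function names are hypothetical. The comment above does not say which two messages were used, so none are hard-coded here.

```c
#include <stdio.h>

/* Parse the "[ssss.uuuuuu]" prefix of a dmesg line into microseconds.
 * Assumes exactly six fractional digits. Returns 0 on success, -1 on failure. */
static int dmesg_stamp_us(const char *line, unsigned long long *us)
{
	unsigned long long sec, frac;

	if (sscanf(line, "[ %llu.%6llu]", &sec, &frac) != 2)
		return -1;
	*us = sec * 1000000ULL + frac;
	return 0;
}

/* Elapsed whole milliseconds between two dmesg lines, or -1 if either
 * line cannot be parsed or the timestamps are out of order. */
static long long dmesg_elapsed_ms(const char *before, const char *after)
{
	unsigned long long t0, t1;

	if (dmesg_stamp_us(before, &t0) || dmesg_stamp_us(after, &t1) || t1 < t0)
		return -1;
	return (long long)((t1 - t0) / 1000);
}
```

Pass the line logged before initrd decompression as `before` and the line logged after it as `after`.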

Find below an extract of the patch to measure decompression time.

diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c
index 7b0d054..cee3d97 100644
--- a/arch/s390/boot/startup.c
+++ b/arch/s390/boot/startup.c
@@ -146,7 +146,10 @@ void startup_kernel(void)
 	}
 
 	if (!IS_ENABLED(CONFIG_KERNEL_UNCOMPRESSED)) {
+		start = get_tod_clock();
 		img = decompress_kernel();
+		end = get_tod_clock();
+		time = (end - start) >> 12;
 		memmove((void *)vmlinux.default_lma, img, vmlinux.image_size);
 	} else if (__kaslr_offset)
 		memcpy((void *)vmlinux.default_lma, img, vmlinux.image_size);

Colin Ian King (colin-king) wrote :

OK, thanks for that. I didn't use get_tod_clock(); I used some inline assembler instead. But I trust your data, and we'll switch to lz4 for s390 too.

Colin Ian King (colin-king) wrote :

Fix committed:
commit 25b52c773e12a38a94ad0c8a46f99f722d9ed49e
Author: Thadeu Lima de Souza Cascardo <email address hidden>
Date: Mon Sep 9 15:53:54 2019 -0300

    UBUNTU: [Config]: Switch kernel compression from LZO to LZ4 on s390x

    BugLink: https://bugs.launchpad.net/bugs/1840934
    While at it, update the annotations file to match reality.

    Suggested-by: Colin Ian King <email address hidden>
    Signed-off-by: Thadeu Lima de Souza Cascardo <email address hidden>

Changed in linux (Ubuntu):
status: New → Fix Committed
Changed in ubuntu-z-systems:
status: Triaged → Fix Committed
Frank Heimes (fheimes) wrote :

I just checked Eoan's master tree: this config patch is in and tagged with Ubuntu-5.3.0-11.12 (and newer). Since 5.3.0.13.14 is already in Eoan's release pocket:
"linux-generic | 5.3.0.13.14 | eoan | s390x"
I'm closing this ticket as Fix Released.

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
information type: Private → Public
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-10-08 03:28 EDT-------
IBM bugzilla status -> closed, Fix Released with Eoan
