Ubuntu
linux-signed-hwe package

Writeback not flushing to disk in 4.15.0-137-generic and above

Bug #1922466 reported by Christoph Dwertmann on 2021-04-04

This bug report is a duplicate of: Bug #1926808: Bionic update: upstream stable patchset 2021-04-30.

This bug affects 4 people

Affects                    Status     Importance  Assigned to  Milestone
linux-signed-hwe (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Hi!

We've come across some interesting behaviour in kernel 4.15.0-137.141~16.04.1 and above.

After booting a fresh Ubuntu 16.04 instance on AWS, we replace the AWS kernel with "linux-image-4.15.0-140-generic" (4.15.0-140.144~16.04.1) and reboot. Then we generate some I/O by running fio for a while:

fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=64k --size=256m --numjobs=16 --iodepth=16 --runtime=3600 --time_based --end_fsync=1

It doesn't matter whether fio is run against the boot disk or an attached secondary disk. After stopping fio we notice that some pages are stuck in "writeback" and are apparently not flushing to disk:

# lsb_release -rd
Description: Ubuntu 16.04.7 LTS
Release: 16.04
# cat /proc/vmstat | grep "nr_writeback "
nr_writeback 80
# cat /proc/meminfo | grep Writeback:
Writeback: 320 kB

This doesn't clear, not even days later. Running more fio only increases the number of writeback pages.
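One way to confirm that the counter is stuck rather than just slow to drain is to sample it over time while no I/O is running. The following is only a minimal sketch and was not part of the original test: the function names `writeback_kb` and `sample_writeback`, the optional file argument, and the default sample count and interval are all illustrative assumptions.

```shell
# Print the Writeback value (in kB) from a meminfo-style file.
# The optional file argument is an illustrative addition so the function
# can be pointed at a saved copy; it defaults to /proc/meminfo.
writeback_kb() {
    awk '$1 == "Writeback:" { print $2 }' "${1:-/proc/meminfo}"
}

# Take a number of samples at a fixed interval (in seconds).
# The defaults of 5 samples, 10 seconds apart, are arbitrary.
sample_writeback() {
    samples="${1:-5}"
    interval="${2:-10}"
    i=0
    while [ "$i" -lt "$samples" ]; do
        echo "$(date +%T) Writeback: $(writeback_kb) kB"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Calling `sample_writeback 6 600`, for example, would take six samples ten minutes apart. A value that never drops back towards 0 on an idle system would match the behaviour described here.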

Downgrading the kernel to 4.15.0-136.140~16.04.1 resolves the issue: no writeback pages get stuck. Going over the kernel changelog, I can see that the following patchset was applied between -136 and -137, but I'm not sure whether it is related to the issue: https://www.spinics.net/lists/stable/msg435893.html

Kernels 4.15.0-137-generic and above took down our Ceph cluster. It seems that once the amount of "writeback" reaches the "dirty_bytes" ceiling, all subsequent writes to the disk become extremely slow. The following is from an idle production system (not on AWS) running 16.04 with kernel 4.15.0-139-generic:

# lsb_release -rd
Description: Ubuntu 16.04.4 LTS
Release: 16.04
# cat /proc/sys/vm/dirty_bytes
629145600
# cat /proc/sys/vm/dirty_background_bytes
314572800
# cat /proc/meminfo | grep Writeback:
Writeback: 572108 kB
# dd if=/dev/zero of=/test bs=1M count=10; rm /test
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 126.529 s, 82.9 kB/s
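The figures above can be put in proportion with a little arithmetic. This sketch uses a hypothetical helper, `writeback_pct`, and takes the values by hand rather than reading them from /proc. It converts the Writeback figure from kB to bytes and expresses it as a whole-number percentage of dirty_bytes.

```shell
# Writeback (kB, as shown in /proc/meminfo) as a percentage of
# vm.dirty_bytes (bytes). Integer arithmetic, so the result is truncated.
writeback_pct() {
    wb_kb="$1"
    dirty_bytes="$2"
    echo $(( wb_kb * 1024 * 100 / dirty_bytes ))
}

# Values from the idle production system above.
writeback_pct 572108 629145600   # prints 93
```

In other words, about 93% of the dirty_bytes limit appears to be occupied by pages that never finish writing back, which would be consistent with new writers being throttled to the 82.9 kB/s that dd reports.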

Could there be a bug in kernel 4.15.0-137-generic and above?

Thank you!
Kind regards,

Christoph Dwertmann

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.15.0-140-generic 4.15.0-140.144~16.04.1
ProcVersionSignature: Ubuntu 4.15.0-140.144~16.04.1-generic 4.15.18
Uname: Linux 4.15.0-140-generic x86_64
ApportVersion: 2.20.1-0ubuntu2.30
Architecture: amd64
Date: Sun Apr 4 03:39:25 2021
Ec2AMI: ami-041e1cc8f4c429789
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: ap-southeast-2c
Ec2InstanceType: c5ad.xlarge
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-signed-hwe
UpgradeStatus: No upgrade log present (probably fresh install)

Tags:

Christoph Dwertmann (cdwertmann) wrote on 2021-04-04:

Dependencies.txt (2.9 KiB, text/plain; charset="utf-8")
ProcCpuinfoMinimal.txt (1.1 KiB, text/plain; charset="utf-8")

Christoph Dwertmann (cdwertmann) wrote on 2021-04-15:

I'd like to add that this bug also affects 18.04 LTS (Bionic) as it uses the same kernel.

Launchpad Janitor (janitor) wrote on 2021-04-16:

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-signed-hwe (Ubuntu):
status:	New → Confirmed
