cephfs mount hangs machine when written to

Bug #1851470 reported by Simon Oosthoek
This bug affects 6 people
Affects: linux (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

As reported on the ceph-users mailing list, the latest HWE kernel for Ubuntu 18.04 (and the current kernel for Disco 19.04) has a problem with cephfs mounts when you write to them (kernel version 5.0.0-32).

We use cephfs as the target for our Amanda backups (virtual tapes), and the backups started failing once the virtual machine was running this version of the kernel.

Because this is the latest kernel version and we like to keep up with updates, uninstalling it is awkward: we would also have to uninstall the package that depends on the latest HWE kernel.

https://<email address hidden>/msg00940.html
Downgrading to (or booting) 5.0.0-31 helps:
https://<email address hidden>/msg00941.html

The problem seems to be this backported patch:
"ceph: use ceph_evict_inode to cleanup inode's resource"

The severity is high: when cephfs is mounted while running this kernel, the file system is unusable.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1851470

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Simon Oosthoek (simon-margo) wrote :

kern.log when the bug triggers

Simon Oosthoek (simon-margo) wrote :

The machine is hanging when the bug has been triggered, so I cannot use apport.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Jasper Spaans (jap171) wrote :

We're seeing a similar crash on a 16.04-based machine running 4.15.0-66-generic; downgrading to 4.15.0-51-generic appears to prevent it.

Appended kernel log with very similar entries.

Kees Hoekzema (kees-r) wrote :

As far as I can tell, the problem is that this revert, https://lkml.org/lkml/2019/10/3/862, never got applied to those kernels, which makes them unstable.

5.0.0-32+ and 4.15.0-66+ are affected.

Simon Oosthoek (simon-margo) wrote :

This problem appears to be gone in 5.0.0-36.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
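
For anyone who wants to check a machine against the versions mentioned in this thread, here is a minimal Python sketch. It only encodes what the comments above report (5.0.0-32 and later bad, 5.0.0-36 apparently good, 4.15.0-66 and later bad with no fixed version given here); the names `REPORTED_RANGES`, `parse_release` and `possibly_affected` are made up for illustration, and the assumption that every 5.0.0 build between -32 and -35 is affected is mine, not something confirmed in this report.

```python
import platform
import re

# Ranges taken from the comments in this report. They are reported
# observations, not an authoritative list of affected kernels.
# Each entry: (series, first ABI reported bad, first ABI reported good or None).
REPORTED_RANGES = [
    ("5.0.0", 32, 36),     # bad from -32, "appears to be gone" in -36
    ("4.15.0", 66, None),  # bad from -66, no fixed version given in this report
]


def parse_release(release):
    """Split a release string like '5.0.0-32-generic' into ('5.0.0', 32).

    Returns None if the string does not look like an Ubuntu kernel release.
    """
    match = re.match(r"^(\d+\.\d+\.\d+)-(\d+)", release)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def possibly_affected(release):
    """Return True if the release falls inside one of the reported ranges."""
    parsed = parse_release(release)
    if parsed is None:
        return False
    series, abi = parsed
    for bad_series, first_bad, first_good in REPORTED_RANGES:
        if series != bad_series or abi < first_bad:
            continue
        # Assumption: everything from first_bad up to (not including)
        # first_good is affected; with no known fix, assume still affected.
        if first_good is None or abi < first_good:
            return True
    return False


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    verdict = "possibly affected" if possibly_affected(running) else "not in the reported ranges"
    print(running, verdict)
```

A "not in the reported ranges" result only means the running kernel is outside the versions named in this thread; it says nothing about whether the revert is actually present in that kernel.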