Reliable crash in lowlatency kernel with LXD

Bug #1872964 reported by Mark Shuttleworth
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux-signed (Ubuntu)
Confirmed
Undecided
Seth Forshee

Bug Description

Hello, I am able to crash 5.4.0-21-lowlatency running LXD. This machine in an Intel NUC, part of a three-member LXD cluster. The other machines do not show this crash and they are not running the lowlatency kernel, but that might be incidental. I will swap out the generic kernel for a while to see if the behaviour continues there.

Here is the crash:

[ 3222.385724] ------------[ cut here ]------------
[ 3222.385732] WARNING: CPU: 1 PID: 59852 at kernel/rcu/tree_plugin.h:293 rcu_note_context_switch+0x46/0x460
[ 3222.385733] Modules linked in: binfmt_misc veth nft_masq nft_chain_nat vxlan ip6_udp_tunnel udp_tunnel dummy bridge stp llc ebtable_filter ebtables ip6table_raw ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_raw iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter nf_tables nfnetlink unix_diag ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs pps_ldisc zfs(PO) zunicode(PO) zavl(PO) icp(PO) nls_iso8859_1 zcommon(PO) znvpair(PO) spl(O) dm_multipath zlua(PO) scsi_dh_rdac scsi_dh_emc scsi_dh_alua snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio intel_rapl_msr snd_hda_intel mei_hdcp snd_intel_nhlt snd_hda_codec snd_hda_core snd_hwdep intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_pcm kvm snd_timer intel_cstate mei_me intel_rapl_perf mei snd soundcore mac_hid acpi_pad sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
[ 3222.385779] async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i915 aesni_intel i2c_algo_bit crypto_simd cryptd glue_helper drm_kms_helper syscopyarea sysfillrect sysimgblt e1000e i2c_i801 fb_sys_fops ahci drm libahci lpc_ich video
[ 3222.385795] CPU: 1 PID: 59852 Comm: systemd-resolve Tainted: P O 5.4.0-21-lowlatency #25-Ubuntu
[ 3222.385796] Hardware name: /NUC5i5MYBE, BIOS MYBDWi5v.86A.0034.2017.0116.2148 01/16/2017
[ 3222.385801] RIP: 0010:rcu_note_context_switch+0x46/0x460
[ 3222.385803] Code: 54 53 48 c7 c3 80 ba 02 00 65 48 03 1d 8b 57 ce 6d 0f 1f 44 00 00 41 8b 85 80 07 00 00 45 84 f6 0f 85 55 02 00 00 85 c0 7e 0c <0f> 0b 41 80 bd 84 07 00 00 00 74 33 4c 89 ef e8 36 fb ff ff 65 66
[ 3222.385804] RSP: 0018:ffffa7bf48e3bcf8 EFLAGS: 00010002
[ 3222.385806] RAX: 0000000000000001 RBX: ffff9817c5caba80 RCX: 0000000000000001
[ 3222.385807] RDX: 0000000000000000 RSI: ffffffff92ce6379 RDI: 0000000000000000
[ 3222.385808] RBP: ffffa7bf48e3bd20 R08: 0000000000000000 R09: 0000000000000000
[ 3222.385808] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9817c5caad00
[ 3222.385809] R13: ffff9816c39ecd00 R14: 0000000000000000 R15: 000000000002ad00
[ 3222.385811] FS: 00007f9c0f530940(0000) GS:ffff9817c5c80000(0000) knlGS:0000000000000000
[ 3222.385812] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3222.385812] CR2: 000055988bb4b080 CR3: 00000003b5e9a005 CR4: 00000000003606e0
[ 3222.385813] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3222.385814] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3222.385815] Call Trace:
[ 3222.385823] __schedule+0x9d/0x750
[ 3222.385826] ? ___sys_sendmsg+0x95/0xd0
[ 3222.385829] schedule+0x49/0xd0
[ 3222.385831] schedule_hrtimeout_range_clock+0xf9/0x110
[ 3222.385834] ? __seccomp_filter+0x85/0x6b0
[ 3222.385837] schedule_hrtimeout_range+0x13/0x20
[ 3222.385839] ep_poll+0x3c8/0x410
[ 3222.385843] ? wake_up_q+0x70/0x70
[ 3222.385845] do_epoll_wait+0xb8/0xd0
[ 3222.385847] __x64_sys_epoll_wait+0x1e/0x30
[ 3222.385850] do_syscall_64+0x57/0x190
[ 3222.385852] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3222.385854] RIP: 0033:0x7f9c0f04eb77
[ 3222.385856] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8d 05 91 ed 2c 00 41 89 ca 8b 00 85 c0 75 18 b8 e8 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 61 f3 c3 0f 1f 80 00 00 00 00 41 55 41 54 41
[ 3222.385857] RSP: 002b:00007ffc6fd86d88 EFLAGS: 00000246 ORIG_RAX: 00000000000000e8
[ 3222.385859] RAX: ffffffffffffffda RBX: 000055988d0ad3c0 RCX: 00007f9c0f04eb77
[ 3222.385859] RDX: 000000000000000e RSI: 00007ffc6fd86d90 RDI: 0000000000000004
[ 3222.385861] RBP: 00007ffc6fd86f40 R08: 00007ffc6fd86d90 R09: 000055988d0b645c
[ 3222.385861] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000000000cf
[ 3222.385862] R13: ffffffffffffffff R14: 00007ffc6fd86d90 R15: 0000000000000001
[ 3222.385865] ---[ end trace a8d20e83a2bda2fc ]---
[ 3282.386391] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 3282.386412] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-3): P59852
[ 3282.386431] (detected by 0, t=60002 jiffies, g=454773, q=28443)
[ 3282.386435] systemd-resolve S 0 59852 59378 0x80000320
[ 3282.386441] Call Trace:
[ 3282.386457] __schedule+0x2e5/0x750
[ 3282.386465] schedule+0x49/0xd0
[ 3282.386470] schedule_hrtimeout_range_clock+0xf9/0x110
[ 3282.386479] ? __seccomp_filter+0x85/0x6b0
[ 3282.386484] schedule_hrtimeout_range+0x13/0x20
[ 3282.386488] ep_poll+0x3c8/0x410
[ 3282.386496] ? wake_up_q+0x70/0x70
[ 3282.386501] do_epoll_wait+0xb8/0xd0
[ 3282.386505] __x64_sys_epoll_wait+0x1e/0x30
[ 3282.386511] do_syscall_64+0x57/0x190
[ 3282.386517] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3282.386521] RIP: 0033:0x7f9c0f04eb77
[ 3282.386532] Code: Bad RIP value.
[ 3282.386535] RSP: 002b:00007ffc6fd86d88 EFLAGS: 00000246 ORIG_RAX: 00000000000000e8
[ 3282.386539] RAX: ffffffffffffffda RBX: 000055988d0ad3c0 RCX: 00007f9c0f04eb77
[ 3282.386541] RDX: 000000000000000e RSI: 00007ffc6fd86d90 RDI: 0000000000000004
[ 3282.386543] RBP: 00007ffc6fd86f40 R08: 00007ffc6fd86d90 R09: 000055988d0b645c
[ 3282.386545] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000000000cf
[ 3282.386547] R13: ffffffffffffffff R14: 00007ffc6fd86d90 R15: 0000000000000001
[ 3286.108450] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P59852 } 63716 jiffies s: 4773 root: 0x0/T
[ 3286.108472] rcu: blocking rcu_node structures:

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

Just to confirm and still seeing this with today's -24 package.

Revision history for this message
Seth Forshee (sforshee) wrote :

When you see this in -24, is that still the lowlatency kernel or had you switched it out for generic like you mentioned in the description?

Revision history for this message
Seth Forshee (sforshee) wrote :

Please disregard comment #2. These warnings are related to preempt rcu, which is only used in the lowlatency kernel.

This seems likely to be caused by an unbalanced rcu_read_lock()/rcu_read_unlock() happening somewhere before the task schedules.

Revision history for this message
Seth Forshee (sforshee) wrote :

My attempts to reproduce this have been unsuccessful. Can you give me an idea of how many containers you're running on the machine and what kind of workloads they are running?

Changed in linux-signed (Ubuntu):
assignee: nobody → Seth Forshee (sforshee)
status: New → Confirmed
Revision history for this message
Mark Shuttleworth (sabdfl) wrote : Re: [Bug 1872964] Re: Reliable crash in lowlatency kernel with LXD

Hi Seth

I have been deploying charmed-kubernetes on a LXD cluster. I didn't
think the kernel crash was workload-specific, but of course it might be.
It looked more likely to be ZFS + lowlatency + snaps triggering the issue.

For now I have had to move to -generic on the machine which was causing
the problem. I have ordered some more NUCs and will add -lowlatency
nodes to a cluster when they arrive, to see if different workloads
behave differently.

Mark

Revision history for this message
Seth Forshee (sforshee) wrote :

I'm not sure that it's workload-specific, it could also be an issue with a specific driver or any number of things. RCU is widely used in the kernel, so at this point I'm looking for the proverbial needle in the haystack. I am trying to uncover code paths that you might have been exercising that I was not, as I don't see any issues with the epoll_wait syscall shown in the stack trace.

I'll keep trying to reproduce. If I cannot, once you are ready to try the lowlatency kernel again I can provide a kernel with some RCU debugging options turned on which may help identify the culprit.

Thanks!

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

OK, will circle back to this bug in due course. From what you say it
might even be specific to that Nuc.

Mark

Revision history for this message
Thomas M Steenholdt (tmus) wrote :

I have an LXD Virtual Machine refusing to boot after installing the -lowlatency kernel on it. Sounds like these _might_ be related, so here's my info. I'm on a NUC too, btw.

Host: Ubuntu 20.04 LTS, LXD 4.2 snap (running -lowlatency)
Guest: Ubuntu 20.04 LTS

The guest was running without problems on the -virtual kernel, but refuses to boot after installing -lowlatency

# lxc info --show-log gw2
Name: gw2
Location: none
Remote: unix://
Architecture: x86_64
Created: 2020/06/20 22:18 UTC
Status: Stopped
Type: virtual-machine
Profiles: default
Pid: 326329
Resources:
  Processes: 0

Log:

KVM internal error. Suberror: 1
emulation failure
RAX=0000000000000a00 RBX=000000000b2639e0 RCX=000000000f3c1f40 RDX=0000000000000b20
RSI=0000000000000103 RDI=000000000b24e1c0 RBP=000000000b24e120 RSP=000000000f3c20f0
R8 =000000000c914168 R9 =0000000000000000 R10=ffffffffffffffff R11=000000000af9d800
R12=000000000b24f120 R13=000000000b24e0e0 R14=0000000000020004 R15=000000000c90d483
RIP=00000000000affff RFL=00210246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
FS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT= 000000000edee698 00000047
IDT= 000000000e80e018 00000fff
CR0=80010033 CR2=0000000000000000 CR3=000000000f001000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d00
Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Revision history for this message
Thomas M Steenholdt (tmus) wrote :

Kernel version 5.4.0-37-lowlatency, btw

Revision history for this message
Thomas M Steenholdt (tmus) wrote :

So in my case, my VM was configured with only 256MB of memory. This caused no problem on the normal -virtual kernel but caused all sorts of problems when I installed -lowlatency in the guest.

Upgrading to 384MB solved the problem reliably here, so perhaps this is not the same thing afterall.

- Sorry for the interruption

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.