NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [docker:28587]

Bug #1483770 reported by Michael Salinger on 2015-08-11
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
linux-lts-vivid (Ubuntu)
Undecided
Unassigned

Bug Description

With 14.04.2 and the vivid backported kernel, we are seeing occasional crashes (soft lockups) when using Docker.

kernel: [631576.200053] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [docker:28587]
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] Modules linked in: veth xt_nat xt_tcpudp tcp_diag inet_diag xt_addrtype xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge stp llc overlay dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c dm_crypt ppdev xen_fbfront syscopyarea sysfillrect sysimgblt fb_sys_fops serio_raw parport_pc parport isofs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] CPU: 0 PID: 28587 Comm: docker Not tainted 3.19.0-25-generic #26~14.04.1-Ubuntu
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/06/2015
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] task: ffff8800e88a3ae0 ti: ffff88002ec84000 task.ti: ffff88002ec84000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] RIP: 0010:[<ffffffff813af79d>] [<ffffffff813af79d>] format_decode+0x21d/0x3e0
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] RSP: 0018:ffff88002ec87ce8 EFLAGS: 00000246
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] RAX: ffffffff81ab03aa RBX: ffffffff813b02bc RCX: 0000000000000075
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] RDX: 0000000000000075 RSI: 0000000000000075 RDI: ffffffff81ab03ab
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] RBP: ffff88002ec87d08 R08: 000000000000000a R09: 000000000000fffe
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] R10: 0000000000000000 R11: ffff88002ec87c8e R12: ffff88002ecba000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] R13: ffff88002ecb9004 R14: ffffffff813af362 R15: ffff88002ec87c58
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] FS: 00000000015fa880(0063) GS:ffff8800efc00000(0000) knlGS:0000000000000000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] CR2: 000000c208109000 CR3: 00000000290e6000 CR4: 00000000001406f0
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] Stack:
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] ffff88002ecba000 ffffffff81ab03ab ffff88002ecb900a ffff88002ecba000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] ffff88002ec87d88 ffffffff813b20d8 ffff88002ec8ffff 0000000000000000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] 000000000000fffd 0000000000000001 ffff88002ecb9000 0000000000001000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] Call Trace:
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff813b20d8>] vsnprintf+0x58/0x590
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff81210725>] seq_vprintf+0x35/0x70
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff8121079d>] seq_printf+0x3d/0x40
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff81210ea8>] ? seq_path_root+0x68/0xf0
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff8122a35b>] show_mountinfo+0x5b/0x290
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff8120adf6>] m_show+0x16/0x20
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff8121005a>] seq_read+0xea/0x370
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff811ecc38>] __vfs_read+0x18/0x50
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff811eccf6>] vfs_read+0x86/0x140
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff811ecdf6>] SyS_read+0x46/0xb0
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff817b668d>] system_call_fastpath+0x16/0x1b
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] Code: 01 08 e9 57 ff ff ff 0f 1f 80 00 00 00 00 ba ff ff ff ff 66 89 53 04 0f b6
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [docker:28587]
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] Modules linked in: veth xt_nat xt_tcpudp tcp_diag inet_diag xt_addrtype xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge stp llc overlay dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c dm_crypt ppdev xen_fbfront syscopyarea sysfillrect sysimgblt fb_sys_fops serio_raw parport_pc parport isofs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] CPU: 0 PID: 28587 Comm: docker Tainted: G L 3.19.0-25-generic #26~14.04.1-Ubuntu
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/06/2015
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] task: ffff8800e88a3ae0 ti: ffff88002ec84000 task.ti: ffff88002ec84000
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] RIP: 0010:[<ffffffff813b24c0>] [<ffffffff813b24c0>] vsnprintf+0x440/0x590
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] RSP: 0018:ffff88002ec87d18 EFLAGS: 00000293
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] RAX: 0000000000000002 RBX: 0000000000000031 RCX: 0000000000000002
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] RDX: 0000000000000008 RSI: 00000000ffffffff RDI: ffffffff81ab03a5
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] RBP: ffff88002ec87d88 R08: 000000000000000a R09: 000000000000fffc
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] R10: 0000000000000000 R11: ffff88002ec87c8e R12: 0000000000000000
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] R13: ffff88002ec87c8e R14: 0000000a0000ffff R15: ffff88002ec87c8e
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] FS: 00000000015fa880(0063) GS:ffff8800efc00000(0000) knlGS:0000000000000000
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] CR2: 000000c208109000 CR3: 00000000290e6000 CR4: 00000000001406f0
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] Stack:
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] ffff88002ecbffff ffff88002ec87d7c 000000002ec87d80 0000000000000001
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] ffff88002ecb9000 0000000000001000 ffff88002ec87dc0 ffffffffff0a0210
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] 00000fef2ec87db8 ffff88002ecfcd80 ffff88002ecfcd80 ffff88000d55e6c8
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] Call Trace:
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff81210725>] seq_vprintf+0x35/0x70
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff8121079d>] seq_printf+0x3d/0x40
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff81210ea8>] ? seq_path_root+0x68/0xf0
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff8122a35b>] show_mountinfo+0x5b/0x290
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff8120adf6>] m_show+0x16/0x20
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff8121005a>] seq_read+0xea/0x370
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff811ecc38>] __vfs_read+0x18/0x50
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff811eccf6>] vfs_read+0x86/0x140
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff811ecdf6>] SyS_read+0x46/0xb0
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff817b668d>] system_call_fastpath+0x16/0x1b
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] Code: b6 4d c9 eb 8b 41 8b 55 00 0f b6 4d c9 83 fa 30 0f 83 2b 01 00 00 89 d0 49 03 45 10 83 c2 08 41 89 55 00 48 8b 10 e9 65 ff ff ff <41> 8b 55 00 83 fa 30 0f 83 e7 00 00 00 89 d0 49 03 45 10 83 c2
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] INFO: rcu_sched self-detected stall on CPU { 0} (t=15000 jiffies g=15980250 c=15980249 q=0)
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] Task dump for CPU 0:
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] docker R running task 0 28587 7322 0x00000008
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] ffffffff81c55fc0 ffff8800efc03be8 ffffffff810a00f6 0000000000000000
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] ffffffff81c55fc0 ffff8800efc03c08 ffffffff810a371d 0000000000000087
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] 0000000000000001 ffff8800efc03c38 ffffffff810d3db0 ffff8800efc14bc0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] Call Trace:
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] <IRQ> [<ffffffff810a00f6>] sched_show_task+0xb6/0x130
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810a371d>] dump_cpu_task+0x3d/0x50
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810d3db0>] rcu_dump_cpu_stacks+0x90/0xd0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810d7c6c>] rcu_check_callbacks+0x42c/0x670
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810a4751>] ? account_process_tick+0x61/0x180
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810dcba9>] update_process_times+0x39/0x60
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810ec0d5>] tick_sched_handle.isra.16+0x25/0x60
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810ec154>] tick_sched_timer+0x44/0x80
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810dd847>] __run_hrtimer+0x77/0x1d0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810ec110>] ? tick_sched_handle.isra.16+0x60/0x60
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810ddc27>] hrtimer_interrupt+0xe7/0x220
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8100a7de>] xen_timer_interrupt+0x2e/0x150
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff814c588a>] ? add_interrupt_randomness+0x3a/0x1e0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810cbf6e>] handle_irq_event_percpu+0x3e/0x1a0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810cf5ed>] handle_percpu_irq+0x3d/0x60
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810cb55b>] generic_handle_irq+0x2b/0x40
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8147c08f>] evtchn_2l_handle_events+0x24f/0x260
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff81078f6c>] ? __do_softirq+0x16c/0x270
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8147985f>] __xen_evtchn_do_upcall+0x4f/0x90
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8147b5e4>] xen_evtchn_do_upcall+0x34/0x50
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff817b858d>] xen_hvm_callback_vector+0x6d/0x80
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] <EOI> [<ffffffff813b02bc>] ? number.isra.2+0x35c/0x3a0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff813af6e2>] ? format_decode+0x162/0x3e0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff813b20d8>] vsnprintf+0x58/0x590
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff81210725>] seq_vprintf+0x35/0x70
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8121079d>] seq_printf+0x3d/0x40
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff81210ea8>] ? seq_path_root+0x68/0xf0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8122a35b>] show_mountinfo+0x5b/0x290
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8120adf6>] m_show+0x16/0x20
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8121005a>] seq_read+0xea/0x370
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff811ecc38>] __vfs_read+0x18/0x50
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff811eccf6>] vfs_read+0x86/0x140
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff811ecdf6>] SyS_read+0x46/0xb0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff817b668d>] system_call_fastpath+0x16/0x1b

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.19.0-25-generic 3.19.0-25.26~14.04.1
ProcVersionSignature: Ubuntu 3.19.0-25.26~14.04.1-generic 3.19.8-ckt2
Uname: Linux 3.19.0-25-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.11
Architecture: amd64
Date: Tue Aug 11 14:29:36 2015
Ec2AMI: ami-cc3b3ea4
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1a
Ec2InstanceType: c3.large
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-lts-vivid
UpgradeStatus: No upgrade log present (probably fresh install)

Michael Salinger (mjsalinger) wrote :
Michael Salinger (mjsalinger) wrote :

Anyone have a chance to look at this?

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-lts-vivid (Ubuntu):
status: New → Confirmed
Akihiro Suda (suda-kyoto) wrote :

Recently this was also reported in Docker's issue tracker: https://github.com/docker/docker/issues/19758
This seems to be a bug of overlay.

I can easily reproduce the bug by running `for f in $(seq 1 1000);do docker run -it --rm ubuntu echo $f; done` concurrently in 3 terminals.

(Still not sure whether this is related to concurrency, but I could not reproduce when ran 10k times in a single terminal)

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers