Kernel crashes from time to time when using ftrace
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Fix Released | Medium | Unassigned | | |
Bug Description
While performing some tracing using trace-cmd I came across the following OOPS:
[ 333.051723] invalid opcode: 0000 [#1] SMP
[ 333.051742] Modules linked in: drbg ansi_cprng ctr ccm xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_
[ 333.051972] edac_core soundcore mei_me mei 8250_fintek mac_hid kvm_intel ip6t_REJECT nf_reject_ipv6 kvm nf_log_ipv6 irqbypass xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_
[ 333.052206] pps_core fjes video
[ 333.052216] CPU: 1 PID: 5616 Comm: trace-cmd Not tainted 4.4.0-31-generic #50-Ubuntu
[ 333.052235] Hardware name: Dell Inc. Precision T1650/0X9M3X, BIOS A15 09/09/2013
[ 333.052254] task: ffff8804066b1b80 ti: ffff88040b474000 task.ti: ffff88040b474000
[ 333.052272] RIP: 0010:[<
[ 333.052296] RSP: 0018:ffff88040b
[ 333.052309] RAX: 0000000000000000 RBX: ffff8800d9a4ec00 RCX: ffff88040b477f18
[ 333.052326] RDX: 0000000000002000 RSI: 000000000237d690 RDI: ffff8800d9a4ec00
[ 333.052343] RBP: ffff88040b477f48 R08: 00007f89df102cf8 R09: 0000000000000021
[ 333.052360] R10: 000000000000000d R11: 0000000000000246 R12: ffff8800d9a4ec00
[ 333.052377] R13: 000000000237d690 R14: 0000000000002000 R15: 000000000237d690
[ 333.052395] FS: 00007f89df50f70
[ 333.052414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 333.052428] CR2: 0000000000a78d88 CR3: 00000003c3542000 CR4: 00000000001406e0
[ 333.052445] Stack:
[ 333.052451] ffffffff8120d165 ffffffff8120df35 00007fff364487e5 0000000000000000
[ 333.052473] 00000000eee5d05c 0000000000000000 0000000000000000 0000000000000006
[ 333.052494] 0000000000000008 0000000000002000 ffffffff8182db32 0000000000000004
[ 333.052515] Call Trace:
[ 333.052525] [<ffffffff8120d
[ 333.052538] [<ffffffff8120d
[ 333.052553] [<ffffffff8182d
[ 333.052568] Code: 8b 44 24 48 48 8b 7c 24 70 48 8b 74 24 68 48 8b 54 24 60 48 8b 4c 24 58 48 8b 44 24 50 48 8b 6c 24 20 48 81 c4 d0 00 00 00 e9 fd <ff> ff ff 80 00 00 00 00 9c 55 ff 74 24 18 55 48 89 e5 ff 74 24
[ 333.052685] RIP [<ffffffff81830
[ 333.052700] RSP <ffff88040b477f00>
All code
========
0: 8b 44 24 48 mov 0x48(%rsp),%eax
4: 48 8b 7c 24 70 mov 0x70(%rsp),%rdi
9: 48 8b 74 24 68 mov 0x68(%rsp),%rsi
e: 48 8b 54 24 60 mov 0x60(%rsp),%rdx
13: 48 8b 4c 24 58 mov 0x58(%rsp),%rcx
18: 48 8b 44 24 50 mov 0x50(%rsp),%rax
1d: 48 8b 6c 24 20 mov 0x20(%rsp),%rbp
22: 48 81 c4 d0 00 00 00 add $0xd0,%rsp
29:* e9 fd ff ff ff jmpq 0x2b <-- trapping instruction
2e: 80 00 00 addb $0x0,(%rax)
31: 00 00 add %al,(%rax)
33: 9c pushfq
34: 55 push %rbp
35: ff 74 24 18 pushq 0x18(%rsp)
39: 55 push %rbp
3a: 48 89 e5 mov %rsp,%rbp
3d: ff .byte 0xff
3e: 74 24 je 0x64
Code starting with the faulting instruction
=======
0: ff (bad)
1: ff (bad)
2: ff 80 00 00 00 00 incl 0x0(%rax)
8: 9c pushfq
9: 55 push %rbp
a: ff 74 24 18 pushq 0x18(%rsp)
e: 55 push %rbp
f: 48 89 e5 mov %rsp,%rbp
12: ff .byte 0xff
13: 74 24 je 0x39
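As a cross-check of the decode above, the offset of the byte marked `<ff>` in the `Code:` line can be recomputed by hand and compared with the target of the `jmpq`. The sketch below is only that arithmetic, using the bytes copied from the oops; the variable names are illustrative and nothing here is part of trace-cmd or the kernel tooling.

```shell
#!/bin/sh
# "Code:" bytes copied from the oops above; <..> marks the byte RIP points at.
code='8b 44 24 48 48 8b 7c 24 70 48 8b 74 24 68 48 8b 54 24 60 48 8b 4c 24 58 48 8b 44 24 50 48 8b 6c 24 20 48 81 c4 d0 00 00 00 e9 fd <ff> ff ff 80 00 00 00 00 9c 55 ff 74 24 18 55 48 89 e5 ff 74 24'

# Count the bytes that precede the marked one to get its offset.
fault_offset=0
for byte in $code; do
    case $byte in
        '<'*) break ;;
    esac
    fault_offset=$((fault_offset + 1))
done

# The jmp (opcode e9) sits at offset 0x29 in the decode, is 5 bytes long,
# and carries the little-endian rel32 "fd ff ff ff" = 0xfffffffd, i.e. -3.
jmp_offset=$((0x29))
rel32=$((0xfffffffd - 0x100000000))
jmp_target=$((jmp_offset + 5 + rel32))

printf 'faulting byte offset: 0x%x\n' "$fault_offset"
printf 'jmp target offset:    0x%x\n' "$jmp_target"
```

Both values come out as 0x2b: the faulting RIP lies inside the 5-byte jmp at 0x29, and that same offset is the jmp's own target. What this implies about how the instruction came to look like this is not established by the report.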
The way I was running trace-cmd was:
trace-cmd stream -p function -l vfs_read -F ls
But the same crash occurred if I ran 'trace-cmd record -p function -l vfs_read -F ls'.
What's interesting is that this doesn't happen every time; it occurs roughly once in every 10 runs. The fault appears to be in the mcount handler:
addr2line -e /vmlinux ffffffff818302a8
/build/
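Because the crash is intermittent, repeating the command in a loop may make it easier to reproduce. The helper below is a hypothetical sketch, not something from the report: `run_until_failure` is an invented name, and it assumes that a run which hits the oops leaves trace-cmd with a non-zero exit status, which has not been verified here.

```shell
#!/bin/sh
# run_until_failure MAX CMD [ARGS...]
# Hypothetical helper: runs CMD up to MAX times and stops at the first
# non-zero exit status, reporting which attempt failed.
# Assumption: a run that triggers the oops exits non-zero.
run_until_failure() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if ! "$@" >/dev/null 2>&1; then
            echo "attempt $attempt of $max failed"
            return 1
        fi
        attempt=$((attempt + 1))
    done
    echo "no failure in $max attempts"
    return 0
}
```

It could be invoked as root with, for example, `run_until_failure 50 trace-cmd record -p function -l vfs_read -F ls`, followed by a look at `dmesg` to confirm whether a reported failure was actually the invalid opcode oops.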
I also managed to capture a complete kernel crashdump, so if you need any other relevant information (e.g. a disassembly of the relevant function) I'm happy to provide it.
This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:
apport-collect 1605843
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.