Kernel panic - not syncing: Fatal exception: panic_on_oops
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Fix Released | High | Canonical Kernel Team |
linux (Ubuntu) | Fix Released | High | Skipper Bug Screeners |
Xenial | Fix Released | High | Stefan Bader |
Zesty | Fix Released | High | Stefan Bader |
Bug Description
SRU justification:
Impact: A race in context flushing is causing a kernel panic on the s390x architecture.
Fix: A set of three patches, all restricted to arch code. One is already upstream and the other two are pending in linux-next. Regression risk should be low, since the changes are limited to arch code and have been tested.
Testcase: see below
---
== Comment: #0 - QI YE <email address hidden> - 2017-08-02 04:11:25 ==
---Problem Description---
Ubuntu hit a kernel panic.
---uname output---
#110-Ubuntu SMP Tue Jul 18 12:56:43 UTC 2017 s390x s390x s390x GNU/Linux
---Debugger Data---
PID: 10991 TASK: 19872a0e8 CPU: 2 COMMAND: "hyperkube"
LOWCORE INFO:
-psw : 0x0004c00180000000 0x0000000000115fa6
-function : pcpu_delegate at 115fa6
-prefix : 0x7fe42000
-cpu timer: 0x7ffab2827828aa50
-clock cmp: 0xd2eb8b31445e4200
-general registers:
0x0004e001
0x0000c001
0x00000000
0x00000000
0x00000001
0x00000001
0x00000000
0x00000000
-access registers:
0x000003ff 0x7ffff910 0000000000 0000000000
0000000000 0000000000 0000000000 0000000000
0000000000 0000000000 0000000000 0000000000
0000000000 0000000000 0000000000 0000000000
-control registers:
0x00000000
0x00000000
0x00000000
0x00000000
0000000000
0000000000
0000000000
0x00000000
-floating point registers:
0x409c7e25
0000000000
0x3ff00000
0x3fef218f
0x00000000
0x000003ff
0x00000000
0000000000
#0 [8380fc78] smp_find_
#1 [8380fc90] machine_kexec at 1135d4
#2 [8380fcb8] crash_kexec at 1fbb8a
#3 [8380fd88] panic at 27d0e0
#4 [8380fe28] die at 1142cc
#5 [8380fe90] do_low_address at 12215e
#6 [8380fea8] pgm_check_handler at 7c2ab4
PSW: 0705200180000000 000002aa267e0e42 (user space)
GPRS: 0000000000000000 0000000000000000 000002aa2c4fd690 0000000000000001
Contact Information = Chee Ye / <email address hidden>
Stack trace output:
no
Oops output:
[43200.761465] docker0: port 10(vethb9132e9) entered forwarding state
[50008.560926] hrtimer: interrupt took 1698076 ns
[123483.768984] systemd[1]: apt-daily.timer: Adding 7h 34min 22.582204s random time.
[123483.930058] systemd[1]: apt-daily.timer: Adding 2h 18min 14.857162s random time.
[123484.064879] systemd[1]: apt-daily.timer: Adding 10h 46min 2.301756s random time.
[123484.824760] systemd[1]: apt-daily.timer: Adding 6h 16min 22.178655s random time.
[153113.703126] conntrack: generic helper won't handle protocol 47. Please consider loading the specific helper module.
[477085.704538] Low-address protection: 0004 ilc:2 [#1] SMP
[477085.704551] Modules linked in: xt_physdev veth xt_recent xt_comment xt_mark xt_nat ipt_MASQUERADE nf_nat_
[477085.705522] CPU: 2 PID: 10991 Comm: hyperkube Not tainted 4.4.0-87-generic #110-Ubuntu
[477085.705525] task: 000000019872a0e8 ti: 000000008380c000 task.ti: 000000008380c000
[477085.705529] User PSW : 0705200180000000 000002aa267e0e42
[477085.705532] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 EA:3
[477085.705539] 000002aa2c4fd690 000003ff7fffee38 0000000000000000 0000000000000002
[477085.705553] 0000000000029c0f 000000c42001ea00 0000000000000001 0000000000000001
[477085.705554] 000000c42001c5c8 000000c42082c1a0 000002aa2666325e 000003ff7fffed90
[477085.705578] User Code: 000002aa267e0e30: e340f0080004 lg %r4,8(%r15)
[477085.705596] Last Breaking-
[477085.705599] [<000002aa26663
[477085.705600]
[477085.705602] Kernel panic - not syncing: Fatal exception: panic_on_oops
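The flag line under "User PSW" in the oops above (R:0 T:1 IO:1 ...) is a decoding of the first PSW word. As a rough illustration of how to read it, here is a minimal Python sketch that extracts the same fields from a PSW mask. The function name `decode_psw_mask` is hypothetical and is not part of the kernel or of any tool. The bit positions follow the z/Architecture PSW layout as I understand it (bit 0 is the most significant bit), so treat this as a reading aid rather than a reference implementation.

```python
# Hypothetical helper for illustration only: decode an s390x PSW mask
# into the fields that the kernel prints in its oops output.
def decode_psw_mask(mask: int) -> dict:
    def bit(n: int) -> int:
        # s390 numbers bits from the most significant end: bit 0 is the MSB.
        return (mask >> (63 - n)) & 1

    def field(first: int, last: int) -> int:
        # Extract the inclusive bit range first..last (MSB-based numbering).
        width = last - first + 1
        return (mask >> (63 - last)) & ((1 << width) - 1)

    return {
        "R": bit(1),          # PER mask
        "T": bit(5),          # DAT mode
        "IO": bit(6),         # I/O interruption mask
        "EX": bit(7),         # external interruption mask
        "Key": field(8, 11),  # PSW key
        "M": bit(13),         # machine-check mask
        "W": bit(14),         # wait state
        "P": bit(15),         # problem state (1 = user space)
        "AS": field(16, 17),  # address-space control
        "CC": field(18, 19),  # condition code
        "PM": field(20, 23),  # program mask
        "EA": field(31, 32),  # extended/basic addressing mode
    }


# User PSW mask taken from the oops output above.
user_psw = decode_psw_mask(0x0705200180000000)
print(" ".join(f"{name}:{value}" for name, value in user_psw.items()))
```

Run against the user PSW mask from the oops, this prints the same flag line that the kernel logged. P:1 indicates problem state, which is consistent with the "(user space)" annotation in the debugger data.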
System Dump Location:
There are 4 vCPUs defined. The run queues below show hyperkube running on two CPUs (CPU 2 and CPU 3) at the time of the kernel panic. The panic may be related to TLB entry flushing across those two CPUs.
CPU 0 RUNQUEUE: 1ea5a8c00
CURRENT: PID: 0 TASK: bb1528 COMMAND: "swapper/0"
RT PRIO_ARRAY: 1ea5a8db0
[no tasks queued]
CFS RB_ROOT: 1ea5a8c98
[no tasks queued]
CPU 1 RUNQUEUE: 1ea5b9c00
CURRENT: PID: 0 TASK: 1e94162b8 COMMAND: "swapper/1"
RT PRIO_ARRAY: 1ea5b9db0
[no tasks queued]
CFS RB_ROOT: 1ea5b9c98
[120] PID: 23421 TASK: 1c9368af8 COMMAND: "PipelineService"
[120] PID: 10957 TASK: 1987336d8 COMMAND: "hyperkube"
CPU 2 RUNQUEUE: 1ea5cac00
CURRENT: PID: 10991 TASK: 19872a0e8 COMMAND: "hyperkube"
RT PRIO_ARRAY: 1ea5cadb0
[no tasks queued]
CFS RB_ROOT: 1ea5cac98
[no tasks queued]
CPU 3 RUNQUEUE: 1ea5dbc00
CURRENT: PID: 10975 TASK: 198a30000 COMMAND: "hyperkube"
RT PRIO_ARRAY: 1ea5dbdb0
[no tasks queued]
CFS RB_ROOT: 1ea5dbc98
[120] PID: 21614 TASK: 1cbee57c0 COMMAND: "IngestServiceCl"
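To make the observation about hyperkube running on two CPUs easier to check, here is a small Python sketch that pulls the current task per CPU out of a run queue listing like the one above. The listing appears to be output from the crash utility's `runq` command. The function name `current_tasks` and the embedded sample (trimmed to the CPU and CURRENT lines) are illustrative assumptions, not part of the original report or of any tool.

```python
import re

# Trimmed copy of the run queue listing above (CPU and CURRENT lines only).
RUNQ_SAMPLE = """\
CPU 0 RUNQUEUE: 1ea5a8c00
CURRENT: PID: 0 TASK: bb1528 COMMAND: "swapper/0"
CPU 1 RUNQUEUE: 1ea5b9c00
CURRENT: PID: 0 TASK: 1e94162b8 COMMAND: "swapper/1"
CPU 2 RUNQUEUE: 1ea5cac00
CURRENT: PID: 10991 TASK: 19872a0e8 COMMAND: "hyperkube"
CPU 3 RUNQUEUE: 1ea5dbc00
CURRENT: PID: 10975 TASK: 198a30000 COMMAND: "hyperkube"
"""

CPU_RE = re.compile(r"\s*CPU (\d+) RUNQUEUE:")
CURRENT_RE = re.compile(r'\s*CURRENT: PID: (\d+)\s+TASK: \S+\s+COMMAND: "([^"]*)"')


def current_tasks(runq_text: str) -> dict:
    """Map CPU number -> (pid, command) for each CURRENT line."""
    tasks = {}
    cpu = None
    for line in runq_text.splitlines():
        cpu_match = CPU_RE.match(line)
        if cpu_match:
            # Remember which CPU the following CURRENT line belongs to.
            cpu = int(cpu_match.group(1))
            continue
        current_match = CURRENT_RE.match(line)
        if current_match and cpu is not None:
            tasks[cpu] = (int(current_match.group(1)), current_match.group(2))
    return tasks


tasks = current_tasks(RUNQ_SAMPLE)
hyperkube_cpus = sorted(cpu for cpu, (_, cmd) in tasks.items() if cmd == "hyperkube")
print(hyperkube_cpus)  # CPUs whose current task is hyperkube
```

For the sample above this reports CPUs 2 and 3, with PID 10991 on CPU 2 being the task named in the oops.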
== Comment: #1 - QI YE <email address hidden> - 2017-08-02 04:20:02 ==
The problem happens randomly. No pattern has been identified yet.
It also happens on the following kernel levels:
- 4.4.0-78-generic #99
- 4.4.0-83-generic
== Comment: #2 - Heinz-Werner Seeck <email address hidden> - 2017-08-02 08:25:06 ==
@QI YE: Please describe the use case behind this problem report, and attach the dumps, dbginfo output, and sosreports. It is not clear to me which use case triggers this problem.
Many thanks in advance
== Comment: #3 - QI YE <email address hidden> - 2017-08-02 08:44:01 ==
(In reply to comment #2)
> @QI YE: Please describe the use case behind this problem report, and attach
> the dumps, dbginfo output, and sosreports. It is not clear to me which use
> case triggers this problem.
> Many thanks in advance
Heinz-Werner, what do you mean by "use case"? Could you elaborate? If you are referring to the application that caused this problem: we are running machine learning on Ubuntu on the IBM Z community cloud.
The dump file is large. Where should I upload it?
== Comment: #4 - QI YE <email address hidden> - 2017-08-02 08:50:32 ==
sosreport
affects: kernel-package (Ubuntu) → linux (Ubuntu)
Changed in ubuntu-power-systems:
importance: Undecided → Critical
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
tags: added: kernel-da-key
Changed in ubuntu-power-systems:
status: New → In Progress
Changed in ubuntu-power-systems:
status: In Progress → Incomplete
Changed in linux (Ubuntu):
status: New → Incomplete
Changed in ubuntu-power-systems:
status: Incomplete → New
Changed in linux (Ubuntu):
status: Incomplete → New
Changed in ubuntu-z-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
no longer affects: ubuntu-power-systems
Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux (Ubuntu):
status: New → In Progress
Changed in ubuntu-z-systems:
status: New → In Progress
importance: Undecided → High
description: updated
Changed in linux (Ubuntu Xenial):
assignee: nobody → Stefan Bader (smb)
importance: Undecided → High
status: New → In Progress
Changed in linux (Ubuntu Zesty):
assignee: nobody → Stefan Bader (smb)
importance: Undecided → High
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Changed in ubuntu-z-systems:
status: Incomplete → In Progress
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Zesty):
status: In Progress → Fix Committed
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released