[LTCTest][Opal][FW860.20] HMI recoverable errors failed to recover and system goes to dump state.
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
The Ubuntu-power-systems project | Fix Released | High | Canonical Kernel Team |
linux (Ubuntu) | Fix Released | High | Manoj Iyer |
Zesty | Fix Released | High | Unassigned |
Bug Description
== Comment: #0 - Pridhiviraj Paidipeddi <email address hidden> - 2017-04-17 06:08:41 ==
---Problem Description---
HMI recoverable error injection tests lead to a system checkstop followed by a system dump with the Ubuntu 17.04 OS and kernel 4.10.0-19-generic ppc64le.
Contact Information = <email address hidden>
---uname output---
#21-Ubuntu SMP Thu Apr 6 17:03:05 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = PowerNV 8284-22A
---System Hang---
System is in dumping state. After the dump finishes, the system will IPL to the OS again.
---Debugger---
A debugger is not configured
== Comment: #3 - Pridhiviraj Paidipeddi <email address hidden> - 2017-04-17 06:12:51 ==
# uname -a
#21-Ubuntu SMP Thu Apr 6 17:03:05 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
# cat /etc/os-release
NAME="Ubuntu"
VERSION="17.04 (Zesty Zapus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 17.04"
VERSION_ID="17.04"
HOME_URL="https:/
SUPPORT_URL="https:/
BUG_REPORT_URL="https:/
PRIVACY_
VERSION_
UBUNTU_
root@p8wookie:~#
== Comment: #4 - Kevin W. Rudd <email address hidden> - 2017-04-17 11:10:22 ==
== Comment: #5 - MAHESH J. SALGAONKAR <email address hidden> - 2017-04-17 13:34:03 ==
It looks like the commit below is the culprit:
=======
commit 2337d207288f163
Author: Nicholas Piggin <email address hidden>
Date: Fri Jan 27 14:24:33 2017 +1000
powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts
The branch from hmi_exception_early to hmi_exception_
a "relocatable-style" branch, because it is branching from unrelocated
exception code to beyond __end_interrupts.
Signed-off-by: Nicholas Piggin <email address hidden>
Signed-off-by: Michael Ellerman <email address hidden>
=======
With the above commit changes now hmi_exception_
-------
c000000000025f50 <hmi_exception_
c000000000025f50: 3a 01 4c 3c addis r2,r12,314
c000000000025f54: b0 01 42 38 addi r2,r2,432
c000000000025f58: a6 02 08 7c mflr r0
-------
With above commit the hmi_exception_
If we revert above commit the code jumps to c000000000025f58 (hmi_exception_
After reverting the above patch I no longer see this issue. I have rebuilt the Ubuntu kernel with that patch reverted, and you can find the kernel rpm at:
Can you please retry your tests with the above kernel and see if the issue still persists?
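To make the two addresses in the disassembly above easier to follow, here is a minimal Python sketch that decodes the three listed instruction words and replays the first two. It assumes the ELFv2 ABI convention that `r12` holds the function's own address when it is entered at its first instruction (the global entry point), so that the `addis`/`addi` pair derives the TOC pointer in `r2` from `r12`. The names `LISTING`, `decode` and `toc_after_global_entry` are illustrative only and do not come from the kernel source.

```python
import struct

MASK64 = (1 << 64) - 1

# (address, raw bytes) copied from the disassembly in this comment.
LISTING = [
    (0xC000000000025F50, "3a 01 4c 3c"),
    (0xC000000000025F54, "b0 01 42 38"),
    (0xC000000000025F58, "a6 02 08 7c"),
]


def decode(raw):
    """Decode one little-endian instruction word into (mnemonic, rt, ra, imm)."""
    word = struct.unpack("<I", bytes.fromhex(raw))[0]
    opcode = word >> 26
    rt = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:  # sign-extend the 16-bit immediate
        imm -= 0x10000
    if opcode == 15:
        return ("addis", rt, ra, imm)
    if opcode == 14:
        return ("addi", rt, ra, imm)
    if word == 0x7C0802A6:
        return ("mflr", rt, None, None)
    return ("unknown", None, None, None)


def toc_after_global_entry(r12):
    """Replay the addis/addi pair: r2 is computed purely from r12."""
    regs = {12: r12 & MASK64}
    for _, raw in LISTING[:2]:
        name, rt, ra, imm = decode(raw)
        shift = 16 if name == "addis" else 0
        regs[rt] = (regs[ra] + (imm << shift)) & MASK64
    return regs[2]


if __name__ == "__main__":
    for addr, raw in LISTING:
        print(hex(addr), decode(raw))
    # Distance between the first instruction and the mflr at ...5f58.
    print("offset:", LISTING[2][0] - LISTING[0][0])
    # r2 when r12 really holds the function's address.
    print("r2:", hex(toc_after_global_entry(LISTING[0][0])))
```

Under that assumption, entering at c000000000025f50 is only safe when `r12` already contains that address; any other value in `r12` yields a different `r2`. Entering at c000000000025f58 skips the two instructions and leaves `r2` untouched.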
== Comment: #6 - MAHESH J. SALGAONKAR <email address hidden> - 2017-04-17 23:02:31 ==
Spoke to Michael Ellerman this morning. He helped me identify the root cause and a fix. Patch below:
diff --git a/arch/
index 857bf7c5b946.
--- a/arch/
+++ b/arch/
@@ -982,7 +982,7 @@ TRAMP_REAL_
EXCEPTION_
EXCEPTION_
addi r3,r1,STACK_
- BRANCH_
+ BRANCH_
/* Windup the stack. */
/* Move original HSRR0 and HSRR1 into the respective regs */
ld r9,_MSR(r1)
== Comment: #7 - Pridhiviraj Paidipeddi <email address hidden> - 2017-04-18 01:52:03 ==
== Comment: #8 - Pridhiviraj Paidipeddi <email address hidden> - 2017-04-18 01:53:57 ==
Hi Mahesh
Tested all the HMI Recoverable errors on the patched kernel below and attached the corresponding execution logs. All tests are working fine.
#21 SMP Mon Apr 17 12:58:30 EDT 2017 ppc64le ppc64le ppc64le GNU/Linux
Thanks
== Comment: #9 - MAHESH J. SALGAONKAR <email address hidden> - 2017-04-18 06:07:56 ==
(In reply to comment #8)
> Hi Mahesh
> Tested all the HMI Recoverable errors on the below patched kernel, attached
> the corresponding executing logs. All tests are working fine.
>
> Linux p8wookie 4.10.0-
> 2017 ppc64le ppc64le ppc64le GNU/Linux
>
>
> Thanks
Thanks. Michael has posted a fix for this upstream.
http://
I will rebuild the new Ubuntu kernel with the above patch.
== Comment: #12 - Pridhiviraj Paidipeddi <email address hidden> - 2017-04-18 09:27:59 ==
(In reply to comment #11)
> >
> > https:/
>
> I have built new kernel with above patch and you can find it below path
>
>:/home2/
> generic_
Tested with this new patched kernel, all tests are working fine.
Linux p8wookie 4.10.0-
Will attach the full execution logs here.
== Comment: #13 - Pridhiviraj Paidipeddi <email address hidden> - 2017-04-18 09:29:43 ==
== Comment: #14 - MAHESH J. SALGAONKAR <email address hidden> - 2017-04-19 03:52:18 ==
(In reply to comment #12)
> (In reply to comment #11)
> > >
> > > https:/
> >
Thanks for testing. We need to mirror this to Ubuntu so that the fix patch can be included.
>
> Linux p8wookie 4.10.0-
> 2017 ppc64le ppc64le ppc64le GNU/Linux
>
> Will attach is full the execution logs here.
Changed in kernel-package (Ubuntu):
assignee: Taco Screen team (taco-screen-team) → Manoj Iyer (manjo)
tags: added: ubuntu-17.04
Changed in ubuntu-power-systems:
importance: Undecided → High
affects: kernel-package (Ubuntu) → linux (Ubuntu)
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: In Progress → Fix Released
tags: added: triage-g
Changed in linux (Ubuntu Zesty):
status: New → In Progress
Changed in linux (Ubuntu Zesty):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Zesty):
importance: Undecided → High
Changed in ubuntu-power-systems:
status: In Progress → Fix Released