Fix for flushing TM on coredump only if CPU has TM feature

Bug #1763685 reported by bugproxy
Affects | Status | Importance | Assigned to | Milestone
The Ubuntu-power-systems project | Fix Released | High | Canonical Kernel Team |
linux (Ubuntu) | Fix Released | High | Joseph Salisbury |
Xenial | Invalid | High | Joseph Salisbury |
Artful | Fix Released | High | Joseph Salisbury |

Bug Description

Problem description
======================
Fix for flushing TM on coredump only if CPU has TM feature

---Additional Hardware Info---
POWER9/POWER8/compat mode

Machine Type = P9 baremetal + VM (POWER9, POWER8, Compat mode)

---Steps to Reproduce---
 On POWER9 machines, TM may be disabled for use by VMs. If a core dump is generated in such a VM and the VM's kernel does not check for the TM feature, the kernel crashes because it executes TM instructions while core dumping. Since POWER9 can run VMs in POWER8 compatibility mode, all kernels that run in compat mode need to be patched as well.
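
The failure mode described above can be modeled in a few lines. The following is only a user-space sketch in Python, not the kernel patch: `FakeCpu`, `ProgramCheck`, `flush_tm_unguarded`, and `flush_tm_guarded` are hypothetical names, and only `tm_save_sprs` is borrowed from the backtrace in the Oops output. It assumes that touching TM state on a CPU without the TM facility traps, and that the fix amounts to testing for the feature first.

```python
class ProgramCheck(Exception):
    """Stands in for the program check exception a real CPU would raise."""


class FakeCpu:
    """Toy CPU model; has_tm mirrors whether the TM facility is available."""

    def __init__(self, has_tm: bool):
        self.has_tm = has_tm

    def tm_save_sprs(self) -> dict:
        # Assumption: TM instructions trap when the facility is unavailable.
        if not self.has_tm:
            raise ProgramCheck("TM instruction executed without TM facility")
        # Placeholder values; a real kernel would read the TM SPRs here.
        return {"tfhar": 0, "texasr": 0, "tfiar": 0}


def flush_tm_unguarded(cpu: FakeCpu) -> dict:
    # Models the buggy path: TM state is touched unconditionally during the dump.
    return cpu.tm_save_sprs()


def flush_tm_guarded(cpu: FakeCpu) -> dict:
    # Models the fix: skip the flush entirely when the CPU has no TM feature.
    if not cpu.has_tm:
        return {}
    return cpu.tm_save_sprs()
```

With `FakeCpu(has_tm=False)`, the unguarded variant raises `ProgramCheck`, while the guarded variant returns an empty result and the simulated core dump can continue.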

Stack trace output:
 na

Oops output:
 PID: 16438 TASK: c000000272f515e0 CPU: 3 COMMAND: "vma05_vdso"
 #0 [c0000002711f7050] crash_kexec at c0000000001a07e4
 #1 [c0000002711f7080] die at c000000000025278
 #2 [c0000002711f7120] _exception at c000000000025594
 #3 [c0000002711f72b0] program_check_exception at c000000000a0e1b8
 #4 [c0000002711f7330] program_check_common at c000000000006308
 Program Check [700] exception frame:
 R0: 0000000000000000 R1: c0000002711f7620 R2: c000000001274700
 R3: c000000272f51af0 R4: 800000010280b033 R5: 0000000000000000
 R6: 0000000000000100 R7: 0000000000000000 R8: 0000000000000000
 R9: 0000000200000000 R10: 0000000000000000 R11: 0000000000000000
 R12: c000000000010720 R13: c000000007b81b00 R14: 0000000000000000
 R15: 0000000000000000 R16: c0000002711f7db0 R17: 0000000000040006
 R18: c00000002ab95800 R19: 0000000000000100 R20: 0000000000000001
 R21: 0000000000000002 R22: c000000000bfc1c8 R23: c0000002711f79b8
 R24: c000000000a30480 R25: c000000000a30478 R26: 0000000000000018
 R27: 0000000000000000 R28: c00000002ab95800 R29: 0000000000000000
 R30: 0000000000000100 R31: c000000272f515e0
 NIP: c00000000005b10c MSR: 800000010288b033 OR3: c0000000000108e0
 CTR: c000000000010720 LR: c0000000000108e4 XER: 0000000020000000
 CCR: 0000000028002448 MQ: 0000000000000001 DAR: c000000275599748
 DSISR: c000000274092988 Syscall Result: 0000000000000000
 #5 [c0000002711f7620] tm_save_sprs at c00000000005b10c
 [Link Register] [c0000002711f7620] vsr_get at c0000000000108e4
 #6 [c0000002711f7770] fill_thread_core_info at c0000000003d8b44
 #7 [c0000002711f7820] fill_note_info at c0000000003d8e94
 #8 [c0000002711f78b0] elf_core_dump at c0000000003d94d4
 #9 [c0000002711f7a90] do_coredump at c0000000003dfcf4
#10 [c0000002711f7c20] get_signal_to_deliver at c0000000001061d4
#11 [c0000002711f7d10] do_signal at c00000000001beac
#12 [c0000002711f7e00] do_notify_resume at c00000000001c2cc
#13 [c0000002711f7e30] ret_from_except_lite at c00000000000a7b0
 System Call [c00] exception frame:
 R0: 00000000000000fa R1: 00003fffd0470f00 R2: 00003fffa8af7f00
 R3: 0000000000000000 R4: 0000000000004036 R5: 000000000000000b
 R6: 00003fffd0471428 R7: 0000000010000770 R8: 0000000000004036
 R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000
 R12: 0000000000000000 R13: 00003fffa8babb80 R14: 0000000000000000
 R15: 0000000000000000 R16: 0000000000000000 R17: 0000000000000000
 R18: 0000000000000000 R19: 0000000000000000 R20: 0000000000000000
 R21: 0000000000000000 R22: 0000000000000000 R23: 0000000000000000
 R24: 0000000000000000 R25: 0000000000000000 R26: 0000000000000000
 R27: 00003fffa8b9fbb8 R28: 00003fffa8ba0000 R29: 00003fffa8b9f550
 R30: 0000000000000000 R31: 0000000000000000
 NIP: 00003fffa8ad54c8 MSR: 800000000000d033 OR3: 0000000000004036
 CTR: 0000000000000000 LR: 000000001000055c XER: 0000000000000000
 CCR: 0000000042000442 MQ: 0000000000000001 DAR: 00003fffa89b2100
 DSISR: 0000000040000000 Syscall Result: 0000000000000000
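
The MSR values in the two exception frames can be decoded to see whether the TM bit was set at each point. The sketch below assumes the bit positions used by the Linux powerpc headers (MSR[TM] at bit 32, MSR[TS] suspended and transactional at bits 33 and 34, counting from the least significant bit); the helper name `decode_tm_bits` is hypothetical.

```python
# Assumed bit positions, counted from the least significant bit.
MSR_TM_LG = 32    # TM facility enabled
MSR_TS_S_LG = 33  # transaction state: suspended
MSR_TS_T_LG = 34  # transaction state: transactional


def decode_tm_bits(msr: int) -> dict:
    """Extract the TM-related bits from a 64-bit MSR value."""
    return {
        "tm_enabled": bool((msr >> MSR_TM_LG) & 1),
        "suspended": bool((msr >> MSR_TS_S_LG) & 1),
        "transactional": bool((msr >> MSR_TS_T_LG) & 1),
    }


# MSR from the Program Check [700] frame (kernel side).
kernel_frame = decode_tm_bits(0x800000010288B033)
# MSR from the System Call [c00] frame (user side).
user_frame = decode_tm_bits(0x800000000000D033)
```

Under these assumptions, the Program Check frame shows MSR[TM] set with no active transaction, while the user-space frame shows it clear. That is consistent with the kernel attempting to use TM in the core dump path rather than the task itself running a transaction.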

== Comment: #1 - Gustavo Bueno Romero <email address hidden> - 2018-04-12 17:24:21 ==
Dear maintainer, please cherry-pick the fix already available upstream, which contains the additional check that avoids the issue described here. It should apply cleanly on stable kernels:

"powerpc/tm: Flush TM only if CPU has TM feature":
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c1fa0768a8713b135848f78fd43ffc208d8ded70

Please cherry-pick the fix referenced above and apply it to the following kernels:

HWE 4.x
HWE 4.13

HWE-edge 4.15 already has the fix in place.

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-166685 severity-high targetmilestone-inin16044
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
tags: added: triage-g
Manoj Iyer (manjo)
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
Frank Heimes (fheimes)
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: New → Triaged
Changed in linux (Ubuntu Xenial):
status: New → Triaged
Changed in linux (Ubuntu Artful):
status: New → Triaged
Changed in linux (Ubuntu Xenial):
importance: Undecided → High
Changed in linux (Ubuntu Artful):
importance: Undecided → High
Joseph Salisbury (jsalisbury) wrote :

Commit c1fa0768a8713b135848f78fd43ffc208d8ded70 is already in Artful as of version: Ubuntu-4.13.0-17.20. It came in with the 4.13.5 upstream stable updates.

Commit c1fa0768a8713b135848f78fd43ffc208d8ded70 is not in upstream stable v4.4.y. This is probably because the commit needs a back port for v4.4. I performed a back port to the Xenial 4.4 kernel and built a test kernel with it.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1763685

Can you test this kernel and see if it resolves this bug?

Note, to test this kernel, you need to install both the linux-image and linux-image-extra .deb packages.

Thanks in advance!

Changed in linux (Ubuntu Artful):
status: Triaged → Fix Released
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Joseph Salisbury (jsalisbury)
status: Triaged → In Progress
Changed in linux (Ubuntu):
assignee: Canonical Kernel Team (canonical-kernel-team) → Joseph Salisbury (jsalisbury)
status: Triaged → Fix Released
Changed in ubuntu-power-systems:
status: Triaged → In Progress
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-04-23 16:27 EDT-------
Hi Joseph,

Thanks for the new kernel.

I need more time to test it since I don't have a proper P9 machine at hand right now. What's the hard deadline to test it?

Thanks.

Regards,
Gustavo

Joseph Salisbury (jsalisbury) wrote :

The last day for the current SRU cycle was on April 20th, which will release to updates on May 14th.

The last day for kernel commits for the next cycle is May 11th, which will release on June 4th, so there should be plenty of time for testing.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-04-24 10:31 EDT-------
OK. Thanks.

Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: Fix Committed → Incomplete
Joseph Salisbury (jsalisbury) wrote :

Is there any update on testing of the test kernel?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-06-14 19:56 EDT-------
(In reply to comment #13)
> Is there any update on testing of the test kernel?

Yes, I'm going to check it tomorrow, but I realized that you generated the images for Z and not for Power. Could you please generate them again? Also, could you please make the source package available?

Thanks.

Changed in ubuntu-power-systems:
status: Incomplete → In Progress
Joseph Salisbury (jsalisbury) wrote :

Now that I review this again, I don't think commit c1fa0768a8713b135848f78fd43ffc208d8ded70 is needed in Xenial proper, since it is 4.4 based. Commit c1fa0768a871 fixes the following commit:
cd63f3c ("powerpc/tm: Fix saving of TM SPRs in core dump")

However, commit cd63f3c was not added to mainline until v4.13-rc4.

Can you confirm this? If commit c1fa0768a871 is not needed in Xenial, there is nothing to test and this bug should be resolved, since it's specific to Artful and the Xenial HWE kernel (4.13 based).
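
The reasoning above reduces to a simple rule: a tree needs c1fa0768a871 only if it carries cd63f3c without it. One hedged way to check a given tree is to search its history for the two commit subjects, since backported commits usually get new hashes. The sketch below assumes a local clone and uses `git log --grep`; the function names and the `repo` and `ref` arguments are placeholders, and because `--grep` searches the whole commit message, the result is a heuristic rather than proof.

```python
import subprocess

# Commit subjects as quoted in this bug report.
BUG_SUBJECT = "powerpc/tm: Fix saving of TM SPRs in core dump"   # cd63f3c
FIX_SUBJECT = "powerpc/tm: Flush TM only if CPU has TM feature"  # c1fa0768a871


def git_log_cmd(repo: str, ref: str, subject: str) -> list:
    """Build a git command that prints at most one matching commit hash."""
    return [
        "git", "-C", repo, "log", "-1", "--format=%H",
        "--fixed-strings", "--grep=" + subject, ref,
    ]


def has_subject(repo: str, ref: str, subject: str) -> bool:
    """Return True if some commit reachable from ref mentions the subject."""
    result = subprocess.run(
        git_log_cmd(repo, ref, subject),
        capture_output=True, text=True, check=True,
    )
    return bool(result.stdout.strip())


def needs_fix(has_bug_commit: bool, has_fix_commit: bool) -> bool:
    """The fix is only required where the commit it fixes is present."""
    return has_bug_commit and not has_fix_commit
```

For a given clone and tag, the check would be `needs_fix(has_subject(repo, ref, BUG_SUBJECT), has_subject(repo, ref, FIX_SUBJECT))`.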

Changed in linux (Ubuntu Xenial):
status: In Progress → Invalid
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-06-15 15:05 EDT-------
(In reply to comment #15)
> Now that I review this again, I don't think commit
> c1fa0768a8713b135848f78fd43ffc208d8ded70 is needed in Xenial proper, since
> it is 4.4 based. Commit c1fa0768a871 fixes the following commit:
> cd63f3c ("powerpc/tm: Fix saving of TM SPRs in core dump")

Yes, if cd63f3c is not included, then c1fa0768a871 is not necessary.

> However, commit cd63f3c was not added to mainline until v4.13-rc4.

Yes, that's correct.

> Can you confirm this? If commit c1fa0768a871 is not needed in Xenial, there
> is nothing to test and this bug should be resolved since it's specific to
> Artful and Xenial HWE kernel (4.13 based).

I don't know exactly what Canonical's plan is for 4.4. But if you confirm that Xenial is based on 4.4
and does not include cd63f3c, then we are fine and there is nothing to test on 4.4-based kernels.

Is the 4.13-based HWE kernel affected? I see cd63f3c but not c1fa0768a871 in:

Ubuntu-hwe-4.13.0-42.47_16.04.1

but not sure if it's the right Ubuntu branch.

Thanks.

Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: In Progress → Fix Released
bugproxy (bugproxy)
tags: added: targetmilestone-inin1710
removed: targetmilestone-inin16044
Brad Figg (brad-figg)
tags: added: cscc