Signal 7 error when running GPFS tracing in cluster

Bug #1792195 reported by bugproxy
This bug affects 1 person
Affects                            Status         Importance   Assigned to             Milestone
The Ubuntu-power-systems project   Fix Released   Critical     Canonical Kernel Team
linux (Ubuntu)                     Fix Released   Critical     Joseph Salisbury
Bionic                             Fix Released   Critical     Joseph Salisbury
Cosmic                             Fix Released   Critical     Joseph Salisbury

Bug Description

== SRU Justification ==
IBM is requesting these commits in bionic and cosmic. These commits
also rely on commit 7acf50e4efa6, which was SRU'd in bug 1792102.

Description of bug:
The GPFS mmfsd daemon maps a shared tracing buffer (allocated by a kernel
driver using vmalloc) and then writes trace records to it from user-space
threads in parallel. When the SIGBUS occurred, the faulting virtual address
was within the mapped range, so the access did not overflow the buffer.

The root cause is that for PTEs created by a driver at mmap time (i.e., that
aren't created dynamically at fault time), it's not legitimate for
ptep_set_access_flags() to make them invalid, even temporarily. A concurrent
access while they are invalid takes a page fault that cannot be serviced,
which results in a SIGBUS.

== Fixes ==
bd0dbb73e013 ("powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid.")
f08d08f3db55 ("powerpc/mm/radix: Only need the Nest MMU workaround for R -> RW transition")

== Regression Potential ==
Low. Limited to powerpc.

== Test Case ==
A test kernel was built with these patches and tested by IBM.
IBM states the test kernel resolved the bug.

-- Problem Description --
The GPFS mmfsd daemon maps a shared tracing buffer (allocated by a kernel driver using vmalloc) and then writes trace records to it from user-space threads in parallel. When the SIGBUS occurred, the faulting virtual address was within the mapped range, so the access did not overflow the buffer.

I worked with Benjamin Herrenschmidt on the GPFS tracing kernel driver code. He suggested a workaround in the driver code that bypasses the problem, and it works.

The workaround code change is as follows:

- rc = remap_pfn_range(vma, start, pfn, PAGE_SIZE, PAGE_SHARED);
+ rc = remap_pfn_range(vma, start, pfn, PAGE_SIZE, __pgprot(pgprot_val(PAGE_SHARED)|_PAGE_DIRTY));

As Benjamin mentioned, this is a Linux kernel bug and this is just a workaround. He will give the details about the kernel bug and why this workaround works....

The root cause is that for PTEs created by a driver at mmap time (i.e., that aren't created dynamically at fault time), it's not legitimate for ptep_set_access_flags() to make them invalid, even temporarily. A concurrent access while they are invalid takes a page fault that cannot be serviced, which results in a SIGBUS.

Thankfully such PTEs shouldn't normally be the subject of a RO->RW privilege escalation.

What happens is that the GPFS driver creates the PTEs using remap_pfn_range(...,PAGE_SHARED).

PAGE_SHARED has _PAGE_ACCESSED (R) but not _PAGE_DIRTY (C) set.

Thus on the first write, we try to set C and, while doing so, hit the Nest MMU workaround in ptep_set_access_flags(), which temporarily invalidates the PTE and causes the problem described earlier.

The proposed patch will ensure we only do the Nest MMU hack when changing _PAGE_RW and not for normal R/C updates.
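
The effect of the proposed patch can be illustrated with a small user-space model. This is only a sketch, not kernel code: the bit values and the function names (`invalidates_before_fix`, `invalidates_after_fix`, `pte_model_self_check`) are hypothetical, and it assumes, as the description above implies, that before the patch every access-flag update went through the Nest MMU workaround. It also assumes the mapping already has write permission, since the driver maps the buffer as shared and writable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for this model only; the real definitions live
 * in the powerpc page-table headers and differ from these. */
#define M_PAGE_ACCESSED 0x1u /* R: referenced */
#define M_PAGE_DIRTY    0x2u /* C: changed */
#define M_PAGE_RW       0x4u /* write permission */

/* Assumed pre-patch behaviour: any access-flag change takes the
 * workaround path, so the PTE is briefly invalid. */
bool invalidates_before_fix(uint32_t old_pte, uint32_t new_pte)
{
    return (old_pte ^ new_pte) != 0;
}

/* Behaviour described for the patch: only a change of the RW bit
 * takes the workaround path; plain R/C updates do not. */
bool invalidates_after_fix(uint32_t old_pte, uint32_t new_pte)
{
    return ((old_pte ^ new_pte) & M_PAGE_RW) != 0;
}

/* Walk through the case from this report: first write to a PTE that
 * has R and RW set but not C. */
void pte_model_self_check(void)
{
    uint32_t shared = M_PAGE_ACCESSED | M_PAGE_RW;
    uint32_t after_write = shared | M_PAGE_DIRTY;

    assert(invalidates_before_fix(shared, after_write));
    assert(!invalidates_after_fix(shared, after_write));
}
```

In this model, setting C on the first write no longer opens a window in which a concurrent access can see an invalid PTE, while a genuine RO->RW upgrade still takes the workaround path.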

The workaround tested by the GPFS team consists of adding _PAGE_DIRTY to the mapping created by remap_pfn_range() to avoid the RC update fault completely.
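
Why the driver-side workaround avoids the fault can be sketched with a toy model of the protection bits. Everything here is illustrative: the `model_*` names and bit values are hypothetical stand-ins for the kernel's `pgprot_t`, `__pgprot()`, `pgprot_val()` and `PAGE_SHARED`, and the assumption that the shared protection carries R and RW (but not C) is taken from the explanation above rather than from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen for this model only. */
#define MODEL_PAGE_ACCESSED 0x1UL /* R */
#define MODEL_PAGE_DIRTY    0x2UL /* C */
#define MODEL_PAGE_RW       0x4UL

/* Stand-in for the kernel's pgprot_t wrapper type. */
typedef struct { unsigned long pgprot; } model_pgprot_t;

model_pgprot_t model_pgprot(unsigned long v)
{
    model_pgprot_t p = { v };
    return p;
}

unsigned long model_pgprot_val(model_pgprot_t p)
{
    return p.pgprot;
}

/* Model of PAGE_SHARED as described in this report: R set, C not set. */
model_pgprot_t model_page_shared(void)
{
    return model_pgprot(MODEL_PAGE_ACCESSED | MODEL_PAGE_RW);
}

/* Model of the workaround: the same protection with C pre-set. */
model_pgprot_t model_page_shared_dirty(void)
{
    return model_pgprot(model_pgprot_val(model_page_shared()) | MODEL_PAGE_DIRTY);
}

/* A first write has to update the PTE only if C is not already set. */
bool first_write_updates_pte(model_pgprot_t prot)
{
    return (model_pgprot_val(prot) & MODEL_PAGE_DIRTY) == 0;
}

void workaround_model_check(void)
{
    assert(first_write_updates_pte(model_page_shared()));
    assert(!first_write_updates_pte(model_page_shared_dirty()));
}
```

With C already present in the mapping, the first write has nothing to update, so ptep_set_access_flags() is never asked to touch the PTE on that path.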

This is fixed by these:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bd0dbb73e01306a1060e56f81e5fe287be936477

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f08d08f3db55452d31ba4a37c702da6245876b96

Since DD1 support is still in (ie, 2bf1071a8d50928a4ae366bb3108833166c2b70c is not applied) the second doesn't apply cleanly. Did you want that attached?

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-171273 severity-high targetmilestone-inin1804
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1792195/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
affects: ubuntu → linux (Ubuntu)
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.19 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.19-rc4

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in ubuntu-power-systems:
importance: Undecided → High
Changed in linux (Ubuntu):
status: Incomplete → Triaged
assignee: Canonical Kernel Team (canonical-kernel-team) → Joseph Salisbury (jsalisbury)
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: New → Triaged
Changed in linux (Ubuntu):
status: Triaged → In Progress
Changed in linux (Ubuntu Bionic):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I tried to apply 2bf1071a8d50928a4ae366bb3108833166c2b70c to Bionic, but it looks like a backport is needed.

It appears 2bf1071a8d5 is needed as a prereq for f08d08f3db5545. Do you happen to already have a backport of 2bf1071a8d5 for Bionic?

Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: Triaged → Incomplete
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-10-09 13:31 EDT-------
We can recreate this without GPFS using a modified version of the tests here:
https://github.com/NVIDIA/gdrcopy

Working on the DD1 removal backport.

bugproxy (bugproxy) wrote : backported dd1 plus patch

------- Comment on attachment From <email address hidden> 2018-10-10 03:32 EDT-------

This patch has the dd1 removal (2bf1071a8d50928a4ae366bb3108833166c2b70c) plus adds 9e9626ed3a4affe7fe0e17e98c357849ad299e50 to that for context and additional fix.

I tested this with the patches and it looks good for this one and for LP 1792102.

Changed in ubuntu-power-systems:
status: Incomplete → In Progress
Changed in ubuntu-power-systems:
importance: High → Critical
Manoj Iyer (manjo)
Changed in linux (Ubuntu):
importance: Medium → Critical
Changed in linux (Ubuntu Bionic):
importance: Medium → Critical
Joseph Salisbury (jsalisbury) wrote :

Thanks for the backport!

I built a test kernel with commits f08d08f3db5545 and your backport of 2bf1071a8d50928. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1792195

Can you test this kernel and see if it resolves this bug?

Note about installing test kernels:
• If the test kernel is prior to 4.15 (Bionic), you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15 (Bionic) or newer, you need to install the linux-modules, linux-modules-extra and linux-image-unsigned .deb packages.

Thanks in advance!

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-10-10 13:30 EDT-------
I gave that a try and still saw the problem:
Linux version 4.15.0-36-generic (jsalisbury@kathleen) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #40~lp1792195 SMP Wed Oct 10 13:19:58 UTC 2018

Does that also include bd0dbb73e01306a1060e56f81e5fe287be936477?

Joseph Salisbury (jsalisbury) wrote :

You are correct, commit bd0dbb73e01306a1060e56f81e5fe287be936477 was not in that test kernel. It looks like it needs some backporting as well. I'll do that and build a v2 test kernel.

Joseph Salisbury (jsalisbury) wrote :

I built a v2 test kernel. This test kernel now has the following four commits:

f08d08f3db55 ("powerpc/mm/radix: Only need the Nest MMU workaround for R -> RW transition")
bd0dbb73e013 ("powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid.")
2bf1071a8d50 ("powerpc/64s: Remove POWER9 DD1 support")
7acf50e4efa6 ("Revert "powerpc/powernv: Increase memory block size to 1GB on radix"")

Commit 7acf50e4efa6 was requested in bug 1792102, but I added it to this test kernel as well.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1792195

Can you test this kernel and see if it resolves this bug?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-10 16:00 EDT-------
Thanks! I'm building that too, to see if that was the issue. But that patch applies with some (-2, -25) fuzz for me - I tried on master-next, too, and see the same. I'm at fd01374000c8 (-36.39) and I just have the patches from this bug and from LP 1792102....

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-10 17:49 EDT-------
The commit list looks good. The port for 2bf1071a8d50 also included 749a027 ("powerpc/64s: Fix DT CPU features Power9 DD2.1 logic") as a prerequisite and additional fix.

The download directory there is empty, though, so I can't get to the test kernel.

Joseph Salisbury (jsalisbury) wrote :

Sorry, it looks like my copy failed. The files should be there now.

http://kernel.ubuntu.com/~jsalisbury/lp1792195

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-10 21:28 EDT-------
Thanks, looks good! Over an hour of runtime, the failure recreates in a minute or so.

Joseph Salisbury (jsalisbury) wrote :
description: updated
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-11 12:27 EDT-------
Just to clarify the previous comment - the bug is fixed with the v2 kernel. I did multiple runs without hitting the problem.

Joseph Salisbury (jsalisbury) wrote :

Thanks for the update. I'm also building a Launchpad PPA with these changes, so that anyone else who wants to test that way can do so. I'll post a link when the PPA is ready.

Joseph Salisbury (jsalisbury) wrote :
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-15 02:32 EDT-------
A belated update - the ppa kernel looks good, too, I did try that out and verified that both issues are fixed with it. Thank you!

Joseph Salisbury (jsalisbury) wrote :

With regard to commit 2bf1071a8d509:

The SRU team would prefer not to see such a large portion of code removed after release. One can never be sure that nothing still makes use of some part of it.

I believe 2bf1071a8d5 is needed as a prereq for f08d08f3db5545. Do you think 2bf1071a8d5 will be required to resolve this bug? I can investigate as well to see if f08d08f3db5545 can be backported to not require 2bf1071a8d5.

Joseph Salisbury (jsalisbury) wrote :

I built another test kernel, without commit 2bf1071a8d5. This test kernel has the following three commits:

7acf50e4efa6 ("Revert "powerpc/powernv: Increase memory block size to 1GB on radix"")
f08d08f3db55 ("powerpc/mm/radix: Only need the Nest MMU workaround for R -> RW transition")
bd0dbb73e013 ("powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid.")

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1792195

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-15 17:55 EDT-------
This kernel also looks good - I did a couple of 10+ minute runs against it without any problems, which shows the problem is fixed.

I think that leaving out DD1 is ok - but at some point if it's still there it may make some other backport much harder - but we can also just deal with that when needed.

The backport removing DD1 also included the fix for powerpc/64s: dt_cpu_ftrs fix POWER9 DD2.2 and above (9e9626e) - but maybe I should open a bug to get that into the next SRU?

description: updated
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-cosmic' to 'verification-done-cosmic'. If the problem still exists, change the tag 'verification-needed-cosmic' to 'verification-failed-cosmic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-cosmic
Mike Ranweiler (mranweil) wrote :

I tested this bug against the -proposed kernel and it looks good - the problem looks fixed.

user@deb3qwsp1:~/gdrcopy$ cat /proc/version
Linux version 4.15.0-39-generic (buildd@bos02-ppc64el-016) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #42-Ubuntu SMP Tue Oct 23 15:41:45 UTC 2018
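
For anyone scripting this check, a release string like the one above can be parsed and compared against a known version. The following is a minimal sketch: the `kver` struct and the function names are hypothetical helpers, and it assumes the line starts with `Linux version` followed by an Ubuntu-style `major.minor.patch-abi` release. Note that a test kernel can carry the patches while reporting an older ABI number (the lp1792195 builds in this report identified themselves as 4.15.0-36), so a comparison like this only makes sense for archive kernels.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for an Ubuntu-style kernel release "X.Y.Z-ABI". */
struct kver {
    int major, minor, patch, abi;
};

/* Parse the leading part of a /proc/version line.
 * Returns 1 on success, 0 if the line does not match. */
int parse_proc_version(const char *line, struct kver *out)
{
    return sscanf(line, "Linux version %d.%d.%d-%d",
                  &out->major, &out->minor, &out->patch, &out->abi) == 4;
}

/* Field-by-field comparison: negative, zero or positive, like strcmp(). */
int kver_cmp(const struct kver *a, const struct kver *b)
{
    if (a->major != b->major) return a->major - b->major;
    if (a->minor != b->minor) return a->minor - b->minor;
    if (a->patch != b->patch) return a->patch - b->patch;
    return a->abi - b->abi;
}

/* Check the string quoted in this comment against 4.15.0-39. */
void kver_self_check(void)
{
    struct kver running;
    struct kver wanted = { 4, 15, 0, 39 };

    assert(parse_proc_version(
        "Linux version 4.15.0-39-generic (buildd@bos02-ppc64el-016)",
        &running));
    assert(kver_cmp(&running, &wanted) >= 0);
}
```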

tags: added: verification-done-bionic
removed: verification-needed-bionic
Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Committed
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-39.42

---------------
linux (4.15.0-39.42) bionic; urgency=medium

  * linux: 4.15.0-39.42 -proposed tracker (LP: #1799411)

  * Linux: insufficient shootdown for paging-structure caches (LP: #1798897)
    - mm: move tlb_table_flush to tlb_flush_mmu_free
    - mm/tlb: Remove tlb_remove_table() non-concurrent condition
    - mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
    - [Config] CONFIG_HAVE_RCU_TABLE_INVALIDATE=y

  * Ubuntu18.04: GPU total memory is reduced (LP: #1792102)
    - Revert "powerpc/powernv: Increase memory block size to 1GB on radix"

  * arm64: snapdragon: reduce boot noise (LP: #1797154)
    - [Config] arm64: snapdragon: DRM_MSM=m
    - [Config] arm64: snapdragon: SND*=m
    - [Config] arm64: snapdragon: disable ARM_SDE_INTERFACE
    - [Config] arm64: snapdragon: disable DRM_I2C_ADV7511_CEC
    - [Config] arm64: snapdragon: disable VIDEO_ADV7511, VIDEO_COBALT

  * [Bionic] CPPC bug fixes (LP: #1796949)
    - ACPI / CPPC: Update all pr_(debug/err) messages to log the susbspace id
    - cpufreq: CPPC: Don't set transition_latency
    - ACPI / CPPC: Fix invalid PCC channel status errors

  * regression in 'ip --family bridge neigh' since linux v4.12 (LP: #1796748)
    - rtnetlink: fix rtnl_fdb_dump() for ndmsg header

  * screen displays abnormally on the lenovo M715 with the AMD GPU (Radeon Vega
    8 Mobile, rev ca, 1002:15dd) (LP: #1796786)
    - drm/amd/display: Fix takover from VGA mode
    - drm/amd/display: early return if not in vga mode in disable_vga
    - drm/amd/display: Refine disable VGA

  * arm64: snapdragon: WARNING: CPU: 0 PID: 1 arch/arm64/kernel/setup.c:271
    reserve_memblock_reserved_regions (LP: #1797139)
    - SAUCE: arm64: Fix /proc/iomem for reserved but not memory regions

  * The front MIC can't work on the Lenovo M715 (LP: #1797292)
    - ALSA: hda/realtek - Fix the problem of the front MIC on the Lenovo M715

  * Keyboard backlight sysfs sometimes is missing on Dell laptops (LP: #1797304)
    - platform/x86: dell-smbios: Correct some style warnings
    - platform/x86: dell-smbios: Rename dell-smbios source to dell-smbios-base
    - platform/x86: dell-smbios: Link all dell-smbios-* modules together
    - [Config] CONFIG_DELL_SMBIOS_SMM=y, CONFIG_DELL_SMBIOS_WMI=y

  * rpi3b+: ethernet not working (LP: #1797406)
    - lan78xx: Don't reset the interface on open

  * 87cdf3148b11 was never backported to 4.15 (LP: #1795653)
    - xfrm: Verify MAC header exists before overwriting eth_hdr(skb)->h_proto

  * [Ubuntu18.04][Power9][DD2.2]package installation segfaults inside debian
    chroot env in P9 KVM guest with HTM enabled (kvm) (LP: #1792501)
    - KVM: PPC: Book3S HV: Fix guest r11 corruption with POWER9 TM workarounds

  * Provide mode where all vCPUs on a core must be the same VM (LP: #1792957)
    - KVM: PPC: Book3S HV: Provide mode where all vCPUs on a core must be the same
      VM

  * fscache: bad refcounting in fscache_op_complete leads to OOPS (LP: #1797314)
    - SAUCE: fscache: Fix race in decrementing refcount of op->npages

  * CVE-2018-9363
    - Bluetooth: hidp: buffer overflow in hidp_process_report

  * CVE-20...


Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.18.0-11.12

---------------
linux (4.18.0-11.12) cosmic; urgency=medium

  * linux: 4.18.0-11.12 -proposed tracker (LP: #1799445)

  * arm64: snapdragon: WARNING: CPU: 0 PID: 1 arch/arm64/kernel/setup.c:271
    reserve_memblock_reserved_regions (LP: #1797139)
    - SAUCE: arm64: Fix /proc/iomem for reserved but not memory regions

  * arm64: snapdragon: WARNING: CPU: 0 PID: 1 at drivers/irqchip/irq-gic.c:1016
    gic_irq_domain_translate (LP: #1797143)
    - SAUCE: arm64: dts: msm8916: camms: fix gic_irq_domain_translate warnings

  * The front MIC can't work on the Lenovo M715 (LP: #1797292)
    - ALSA: hda/realtek - Fix the problem of the front MIC on the Lenovo M715

  * Provide mode where all vCPUs on a core must be the same VM (LP: #1792957)
    - KVM: PPC: Book3S HV: Provide mode where all vCPUs on a core must be the same
      VM

  * fscache: bad refcounting in fscache_op_complete leads to OOPS (LP: #1797314)
    - SAUCE: fscache: Fix race in decrementing refcount of op->npages

  * hns3: autoneg settings get lost on down/up (LP: #1797654)
    - net: hns3: Fix for information of phydev lost problem when down/up

  * not able to unwind the stack from within __kernel_clock_gettime in the Linux
    vDSO (LP: #1797963)
    - powerpc/vdso: Correct call frame information

  * Signal 7 error when running GPFS tracing in cluster (LP: #1792195)
    - powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid.
    - powerpc/mm/radix: Only need the Nest MMU workaround for R -> RW transition

  * Support Edge Gateway's WIFI LED (LP: #1798330)
    - SAUCE: mwifiex: Switch WiFi LED state according to the device status

  * Support Edge Gateway's Bluetooth LED (LP: #1798332)
    - SAUCE: Bluetooth: Support for LED on Edge Gateways

  * kvm doesn't work on 36 physical bits systems (LP: #1798427)
    - KVM: x86: fix L1TF's MMIO GFN calculation

  * CVE-2018-15471
    - xen-netback: fix input validation in xenvif_set_hash_mapping()

  * regression in 'ip --family bridge neigh' since linux v4.12 (LP: #1796748)
    - rtnetlink: fix rtnl_fdb_dump() for ndmsg header

 -- Stefan Bader <email address hidden> Tue, 23 Oct 2018 18:59:15 +0200

Changed in linux (Ubuntu Cosmic):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → In Progress
Changed in linux (Ubuntu):
status: In Progress → Fix Released
Changed in ubuntu-power-systems:
status: In Progress → Fix Released
Brad Figg (brad-figg)
tags: added: cscc