Kernel can be oopsed using remap_file_pages

Bug #1558120 reported by Colin Ian King on 2016-03-16
This bug affects 2 people
Affects          Importance  Assigned to
linux (Ubuntu)   High        Colin Ian King
Wily             High        Unassigned
Xenial           High        Unassigned

Bug Description

[SRU][WILY][XENIAL]

[JUSTIFICATION]
Running stress-ng --remap 4 will trip an oops on the remap.

The bug was introduced by the mm/mmap.c changes in patch d15bd6cdbb1c2080fb1fca0035e5af1994f4d14f ("UBUNTU: SAUCE: AUFS"). The AUFS patch introduced a subtle bug into remap_file_pages: the call to do_mmap_pgoff can change vma->vm_file, so the subsequent vma_fput(vma) operates on the wrong file. We should instead fput the original file.

[FIX]
fput the original file rather than the vma->vm_file. Without the fix, stress-ng --remap 4 produces an oops within a few seconds; with the fix it is rock solid.
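
To illustrate the reference-counting mistake, here is a minimal userspace sketch. It is not the kernel patch: struct file, struct vma, model_do_mmap, remap_buggy and remap_fixed are hypothetical stand-ins invented for this example, and the model assumes only that the mmap step leaves the VMA pointing at a different file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct file and vm_area_struct. */
struct file { int refcount; };
struct vma  { struct file *vm_file; };

/* Assumed model of do_mmap_pgoff(): afterwards the VMA refers to another file. */
static void model_do_mmap(struct vma *vma, struct file *replacement)
{
    vma->vm_file = replacement;
}

/* Buggy pattern: take a reference, then drop it via vma->vm_file afterwards. */
void remap_buggy(struct vma *vma, struct file *replacement)
{
    assert(vma != NULL && vma->vm_file != NULL);
    vma->vm_file->refcount++;          /* hold the file across the mmap step */
    model_do_mmap(vma, replacement);
    vma->vm_file->refcount--;          /* wrong: vm_file may have changed */
}

/* Fixed pattern: remember the original file and drop the reference on that. */
void remap_fixed(struct vma *vma, struct file *replacement)
{
    assert(vma != NULL && vma->vm_file != NULL);
    struct file *file = vma->vm_file;  /* original file, saved up front */
    file->refcount++;
    model_do_mmap(vma, replacement);
    file->refcount--;                  /* balanced against the increment above */
}
```

In this model, starting with two files that each hold one reference, remap_buggy leaves the original file with a leaked reference and the replacement with one reference too few, while remap_fixed leaves both counts unchanged.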

[REGRESSION POTENTIAL]
This only changes remap_file_pages, a deprecated system call that is rarely used and that user space applications should avoid anyhow.

--------------------------------------------------------------------

While faffing around with the deprecated system call remap_file_pages, I was able to trigger an oops that can be reproduced every time.

uname -a
Linux lenovo 4.4.0-13-generic #29-Ubuntu SMP Fri Mar 11 19:31:18 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

[ 27.298469] mmap: stress-ng-remap (4061) uses deprecated remap_file_pages() syscall. See Documentation/vm/remap_file_pages.txt.
[ 28.956497] BUG: unable to handle kernel NULL pointer dereference at 0000000000000228
[ 28.956555] IP: [<ffffffff811a94f8>] shmem_fault+0x38/0x1e0
[ 28.956594] PGD aded1067 PUD add32067 PMD 0
[ 28.956625] Oops: 0000 [#1] SMP
[ 28.956649] Modules linked in: nls_iso8859_1 drbg ansi_cprng xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables binfmt_misc zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) uvcvideo intel_rapl x86_pkg_temp_thermal intel_powerclamp videobuf2_vmalloc coretemp videobuf2_memops crct10dif_pclmul videobuf2_v4l2 crc32_pclmul videobuf2_core v4l2_common snd_hda_codec_hdmi videodev aesni_intel snd_hda_codec_realtek snd_hda_codec_generic media aes_x86_64 lrw snd_seq_midi gf128mul glue_helper ablk_helper snd_seq_midi_event cryptd snd_hda_intel snd_hda_codec snd_hda_core
[ 28.957162] snd_hwdep snd_rawmidi joydev input_leds arc4 serio_raw rtl8192ce rtl_pci rtl8192c_common snd_pcm rtlwifi snd_seq mac80211 thinkpad_acpi nvram cfg80211 snd_seq_device mei_me mei lpc_ich snd_timer shpchp snd soundcore mac_hid kvm_intel kvm irqbypass parport_pc ppdev lp parport autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mmc_block i915 psmouse i2c_algo_bit drm_kms_helper e1000e ahci syscopyarea libahci sdhci_pci sysfillrect sysimgblt sdhci ptp fb_sys_fops pps_core drm wmi fjes video
[ 28.957570] CPU: 2 PID: 4061 Comm: stress-ng-remap Tainted: P O 4.4.0-13-generic #29-Ubuntu
[ 28.957623] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET31WW (1.11 ) 05/24/2012
[ 28.957666] task: ffff8800add2ee00 ti: ffff8800adf7c000 task.ti: ffff8800adf7c000
[ 28.957707] RIP: 0010:[<ffffffff811a94f8>] [<ffffffff811a94f8>] shmem_fault+0x38/0x1e0
[ 28.957754] RSP: 0000:ffff8800adf7fd38 EFLAGS: 00010246
[ 28.957780] RAX: ffff880194f06900 RBX: 0000000000000000 RCX: 0000000000000054
[ 28.957820] RDX: 0000000000000000 RSI: ffff8800adf7fda8 RDI: ffff8800a990f0c8
[ 28.957860] RBP: ffff8800adf7fd98 R08: 0000000000000000 R09: ffff8800adf7fe68
[ 28.957899] R10: 0000000000000000 R11: 00003ffffffff000 R12: ffff8800a990f0c8
[ 28.957939] R13: ffff8800adf7fe68 R14: ffff8800adf0de90 R15: 00007f83ba57b000
[ 28.957979] FS: 00007f83bc46c740(0000) GS:ffff88019e280000(0000) knlGS:0000000000000000
[ 28.958024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 28.958056] CR2: 0000000000000228 CR3: 00000000ade92000 CR4: 00000000001406e0
[ 28.958096] Stack:
[ 28.958109] ffff8800aafb3840 00000200adf7fd68 ffff8800adfaf108 ffff8800adfaf190
[ 28.958158] ffffffff81a25e80 ffff8800adfaf190 0000000000000000 00000000b7865150
[ 28.958206] 0000000000000000 ffff8800a990f0c8 ffff8800adf7fe68 ffff8800adf0de90
[ 28.958254] Call Trace:
[ 28.958273] [<ffffffff811ba900>] __do_fault+0x50/0xe0
[ 28.958305] [<ffffffff811be33b>] handle_mm_fault+0xf8b/0x1820
[ 28.958339] [<ffffffff81221e52>] ? __dentry_kill+0x162/0x1e0
[ 28.958374] [<ffffffff8122b6a4>] ? mntput+0x24/0x40
[ 28.958405] [<ffffffff8106a537>] __do_page_fault+0x197/0x400
[ 28.958439] [<ffffffff8106a7c2>] do_page_fault+0x22/0x30
[ 28.958472] [<ffffffff8181eef8>] page_fault+0x28/0x30
[ 28.958501] Code: 41 54 53 49 89 fc 48 83 ec 40 c7 45 ac 00 02 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 d8 31 c0 48 8b 87 a0 00 00 00 48 8b 58 20 <48> 83 bb 28 02 00 00 00 0f 85 98 00 00 00 48 8b 43 30 48 8d 56
[ 28.958726] RIP [<ffffffff811a94f8>] shmem_fault+0x38/0x1e0

How to reproduce:

git clone git://kernel.ubuntu.com/cking/stress-ng
cd stress-ng
make clean; make
./stress-ng --remap 8 -t 20
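
For readers without stress-ng to hand, the following is a rough sketch of the kind of call sequence involved. It is not the stress-ng source: remap_reverse_pages is a hypothetical helper written for this example, and a single run of it is not claimed to trigger the oops. It maps shared anonymous pages, tags each one, and uses remap_file_pages() to reverse their order.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `pages` shared anonymous pages, tag each page
 * with its index, then reverse the page order with remap_file_pages().
 * Returns 0 if the reversed layout is observed, -1 on any failure
 * (including systems where the syscall is unavailable).
 */
int remap_reverse_pages(size_t pages)
{
    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0 || pages == 0 || pages > 255)
        return -1;

    size_t len = pages * (size_t)psz;
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Tag the first byte of each page with the page's index. */
    for (size_t i = 0; i < pages; i++)
        buf[i * (size_t)psz] = (unsigned char)i;

    int rc = 0;

    /* Point virtual page i at backing page (pages - 1 - i); prot must be 0. */
    for (size_t i = 0; i < pages; i++) {
        if (remap_file_pages(buf + i * (size_t)psz, (size_t)psz, 0,
                             pages - 1 - i, 0) < 0) {
            rc = -1;
            break;
        }
    }

    /* Check that each virtual page now shows the tag of its mirror page. */
    for (size_t i = 0; rc == 0 && i < pages; i++) {
        if (buf[i * (size_t)psz] != (unsigned char)(pages - 1 - i))
            rc = -1;
    }

    munmap(buf, len);
    return rc;
}
```

Calling remap_reverse_pages(4) from several processes in a loop approximates the load described above, under the assumption that concurrency is what exposes the problem.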
---
ApportVersion: 2.20-0ubuntu3
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/pcmC0D0p: king 2522 F...m pulseaudio
 /dev/snd/controlC0: king 2522 F.... pulseaudio
CurrentDesktop: Unity
DistroRelease: Ubuntu 16.04
EcryptfsInUse: Yes
HibernationDevice: RESUME=UUID=bdef26b7-e88c-4196-97a3-b6d47447ce86
InstallationDate: Installed on 2015-11-04 (135 days ago)
InstallationMedia: Ubuntu 15.04 "Vivid Vervet" - Release amd64 (20150422)
MachineType: LENOVO 2320CTO
Package: linux (not installed)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-13-generic root=UUID=324e5943-0fda-445d-a814-d3a80ff92ab8 ro quiet splash nomdmonddf nomdmonisw vt.handoff=7
ProcVersionSignature: Ubuntu 4.4.0-13.29-generic 4.4.5
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-13-generic N/A
 linux-backports-modules-4.4.0-13-generic N/A
 linux-firmware 1.156
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
Tags: xenial
Uname: Linux 4.4.0-13-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip libvirtd lpadmin lxd plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 05/24/2012
dmi.bios.vendor: LENOVO
dmi.bios.version: G2ET31WW (1.11 )
dmi.board.asset.tag: Not Available
dmi.board.name: 2320CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvrG2ET31WW(1.11):bd05/24/2012:svnLENOVO:pn2320CTO:pvrThinkPadX230:rvnLENOVO:rn2320CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 2320CTO
dmi.product.version: ThinkPad X230
dmi.sys.vendor: LENOVO

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1558120

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

apport information

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: apport-collected xenial
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Colin Ian King (colin-king) wrote :

The merge base e50f9acafe11781b0163d27172d33faf824caf7b is bad.
This means the bug has been fixed between e50f9acafe11781b0163d27172d33faf824caf7b and [d207274815a45631952c342a3b6959e79913195b].

Changed in linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
Colin Ian King (colin-king) wrote :

Bizarre as it seems, disabling the kernel module tainting stops this occurring.

 static inline void add_taint_module(struct module *mod, unsigned flag,
                                    enum lockdep_ok lockdep_ok)
 {
- add_taint(flag, lockdep_ok);
- mod->taints |= (1U << flag);
+ //add_taint(flag, lockdep_ok);
+ //mod->taints |= (1U << flag);
 }

Colin Ian King (colin-king) wrote :

Sanity-checked the above: it seems that some debug in the mmap path serializes the do_mmap call, which is the reason I failed to hit the bug with the above change. So ignore comment #45.

Colin Ian King (colin-king) wrote :

Reverting dee5220cf56dd1b998c74514d0f023b55e4eb425 ("UBUNTU: SAUCE: AUFS") allows me to run stress-ng --remap 8 without any crashes.

description: updated
description: updated
Changed in linux (Ubuntu Wily):
status: New → Fix Committed
Changed in linux (Ubuntu Xenial):
status: New → Fix Committed
Changed in linux (Ubuntu):
status: Confirmed → Fix Released
status: Fix Released → Fix Committed
Changed in linux (Ubuntu Wily):
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
importance: Undecided → High
Kamal Mostafa (kamalmostafa) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-wily' to 'verification-done-wily'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-wily
tags: added: verification-needed-xenial
Kamal Mostafa (kamalmostafa) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

Colin Ian King (colin-king) wrote :

It appears that a subsequent fix for this from the aufs developer has landed:

https://github.com/sfjro/aufs4-linux/commit/2d530d0b039ca2b1280598cf8b350d6c6552f7b8

I think we should drop my fix and pick up the official aufs fix instead. I'll re-test with this and then re-submit the patch once I am satisfied with it.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-23.41

---------------
linux (4.4.0-23.41) xenial; urgency=low

  [ Kamal Mostafa ]

  * Release Tracking Bug
    - LP: #1582431

  * zfs: disable module checks for zfs when cross-compiling (LP: #1581127)
    - [Packaging] disable zfs module checks when cross-compiling

  * Xenial update to v4.4.10 stable release (LP: #1580754)
    - Revert "UBUNTU: SAUCE: (no-up) ACPICA: Dispatcher: Update thread ID for
      recursive method calls"
    - Revert "UBUNTU: SAUCE: nbd: ratelimit error msgs after socket close"
    - Revert: "powerpc/tm: Check for already reclaimed tasks"
    - RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
    - ipvs: handle ip_vs_fill_iph_skb_off failure
    - ipvs: correct initial offset of Call-ID header search in SIP persistence
      engine
    - ipvs: drop first packet to redirect conntrack
    - mfd: intel-lpss: Remove clock tree on error path
    - nbd: ratelimit error msgs after socket close
    - ata: ahci_xgene: dereferencing uninitialized pointer in probe
    - mwifiex: fix corner case association failure
    - CNS3xxx: Fix PCI cns3xxx_write_config()
    - clk-divider: make sure read-only dividers do not write to their register
    - soc: rockchip: power-domain: fix err handle while probing
    - clk: rockchip: free memory in error cases when registering clock branches
    - clk: meson: Fix meson_clk_register_clks() signature type mismatch
    - clk: qcom: msm8960: fix ce3_core clk enable register
    - clk: versatile: sp810: support reentrance
    - clk: qcom: msm8960: Fix ce3_src register offset
    - lpfc: fix misleading indentation
    - ath9k: ar5008_hw_cmn_spur_mitigate: add missing mask_m & mask_p
      initialisation
    - mac80211: fix statistics leak if dev_alloc_name() fails
    - tracing: Don't display trigger file for events that can't be enabled
    - MD: make bio mergeable
    - Minimal fix-up of bad hashing behavior of hash_64()
    - mm, cma: prevent nr_isolated_* counters from going negative
    - mm/zswap: provide unique zpool name
    - ARM: EXYNOS: Properly skip unitialized parent clock in power domain on
    - ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernel
    - xen: Fix page <-> pfn conversion on 32 bit systems
    - xen/balloon: Fix crash when ballooning on x86 32 bit PAE
    - xen/evtchn: fix ring resize when binding new events
    - HID: wacom: Add support for DTK-1651
    - HID: Fix boot delay for Creative SB Omni Surround 5.1 with quirk
    - Input: zforce_ts - fix dual touch recognition
    - proc: prevent accessing /proc/<PID>/environ until it's ready
    - mm: update min_free_kbytes from khugepaged after core initialization
    - batman-adv: fix DAT candidate selection (must use vid)
    - batman-adv: Check skb size before using encapsulated ETH+VLAN header
    - batman-adv: Fix broadcast/ogm queue limit on a removed interface
    - batman-adv: Reduce refcnt of removed router when updating route
    - writeback: Fix performance regression in wb_over_bg_thresh()
    - MAINTAINERS: Remove asterisk from EFI directory names
    - x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
    - ARM: cpuidle: Pass on arm_cpuidle_s...

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Colin Ian King (colin-king) wrote :

Tested on Xenial, 4.4.0-23-generic #41-Ubuntu, stress-ng --remap 8 can run without tripping the issue.

tags: added: verification-done-xenial
removed: verification-needed-xenial
Colin Ian King (colin-king) wrote :

Tested on Wily, 4.2.0-37-generic #43-Ubuntu, stress-ng --remap 8 can run without tripping the issue.

tags: added: verification-done-wily
removed: verification-needed-wily
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.2.0-38.45

---------------
linux (4.2.0-38.45) wily; urgency=low

  [ Kamal Mostafa ]

  * CVE-2016-1583 (LP: #1588871)
    - ecryptfs: fix handling of directory opening
    - SAUCE: proc: prevent stacking filesystems on top
    - SAUCE: ecryptfs: forbid opening files without mmap handler
    - SAUCE: sched: panic on corrupted stack end

 -- Andy Whitcroft <email address hidden> Wed, 08 Jun 2016 22:10:39 +0100

Changed in linux (Ubuntu Wily):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-24.43

---------------
linux (4.4.0-24.43) xenial; urgency=low

  [ Kamal Mostafa ]

  * CVE-2016-1583 (LP: #1588871)
    - ecryptfs: fix handling of directory opening
    - SAUCE: proc: prevent stacking filesystems on top
    - SAUCE: ecryptfs: forbid opening files without mmap handler
    - SAUCE: sched: panic on corrupted stack end

  * arm64: statically link rtc-efi (LP: #1583738)
    - [Config] Link rtc-efi statically on arm64

 -- Kamal Mostafa <email address hidden> Fri, 03 Jun 2016 10:02:16 -0700

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released