Kernel OOPS in ocfs2_fallocate()

Bug #1006012 reported by Jörg Lübbert
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   High         Unassigned
Precise          Fix Released   Undecided    Unassigned

Bug Description

== SRU Justification ==

A NULL pointer dereference is triggered by ocfs2_fallocate(), which
invokes __ocfs2_change_file_space() with NULL as the file argument.
This may result in filesystem corruption.

== Fix ==

This patch has already been submitted upstream and has been added to
the -mm tree (https://lkml.org/lkml/2012/6/20/686). The fix is a
simple NULL check in the __ocfs2_change_file_space() function.
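
As an illustration of the kind of NULL check described here, the following is a minimal user-space sketch, not the kernel patch itself. The report does not quote the statement that the upstream patch guards, so the flags test, the MOCK_O_SYNC value, and the mock_file, mock_handle and mock_change_file_space names are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in flag value; not taken from kernel headers. */
#define MOCK_O_SYNC 0x101000u

/* Hypothetical stand-ins for the kernel structures involved. */
struct mock_file { unsigned int f_flags; };
struct mock_handle { int h_sync; };

/*
 * Sketch of a helper that may be called with file == NULL.
 * It marks the handle as synchronous only when a file is present
 * and was opened with the (mock) sync flag. Always returns 0.
 */
static int mock_change_file_space(const struct mock_file *file,
                                  struct mock_handle *handle)
{
    assert(handle != NULL);
    handle->h_sync = 0;

    /* The guard: dereference file only after checking it is not NULL. */
    if (file != NULL && (file->f_flags & MOCK_O_SYNC))
        handle->h_sync = 1;

    return 0;
}
```

Calling mock_change_file_space(NULL, &handle) then returns normally with h_sync left at 0, instead of reading through a NULL pointer.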

== Impact ==

Possible filesystem corruption when using the fallocate operation.

== Test Case ==

After setting up an ocfs2 node, mount a filesystem and simply execute:

 $ fallocate -l 1600m /mnt/test
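
For reference, the same request can be issued from a program. The sketch below calls fallocate(2) with mode 0, which is assumed here to be what the fallocate utility does for the command above. The preallocate and file_size helper names are made up for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for fallocate() in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Preallocate len bytes for path using fallocate(2) with mode 0.
 * Creates the file if it does not exist.
 * Returns 0 on success, or the errno value on failure.
 */
static int preallocate(const char *path, off_t len)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return errno;

    int rc = 0;
    if (fallocate(fd, 0, 0, len) != 0)
        rc = errno;

    close(fd);
    return rc;
}

/* Return the size of path in bytes, or -1 if stat() fails. */
static off_t file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? st.st_size : (off_t)-1;
}
```

Assuming "1600m" means 1600 MiB, the test case corresponds to preallocate("/mnt/test", (off_t)1600 * 1024 * 1024). On a filesystem without fallocate support the call returns EOPNOTSUPP.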

=======================================================================

My system freezes whenever I access a clean (fsck'ed) ocfs2 volume onto which I extracted some emails:

This is the error:

May 29 18:47:14 mail2 kernel: [ 65.604413] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
May 29 18:47:14 mail2 kernel: [ 65.604751] IP: [<ffffffffa02e169a>] __ocfs2_change_file_space+0x5da/0x710 [ocfs2]
May 29 18:47:14 mail2 kernel: [ 65.605092] PGD 0
May 29 18:47:14 mail2 kernel: [ 65.605238] Oops: 0000 [#1] SMP
May 29 18:47:14 mail2 kernel: [ 65.605462] CPU 0
May 29 18:47:14 mail2 kernel: [ 65.605554] Modules linked in: ocfs2 quota_tree pcnet32 vmblock(O) vmsync(O) vmhgfs(O) ip_vs nf_conntrack libcrc32c ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs dm_round_robin ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ext2 ppdev vmw_balloon psmouse dm_multipath serio_raw parport_pc vmci(O) shpchp i2c_piix4 mac_hid lp parport mptsas mptscsih vmxnet3 mptbase scsi_transport_sas floppy vmxnet(O)
May 29 18:47:14 mail2 kernel: [ 65.609622]
May 29 18:47:14 mail2 kernel: [ 65.609724] Pid: 3805, comm: deliver Tainted: G O 3.2.0-24-generic #38-Ubuntu VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
May 29 18:47:14 mail2 kernel: [ 65.610322] RIP: 0010:[<ffffffffa02e169a>] [<ffffffffa02e169a>] __ocfs2_change_file_space+0x5da/0x710 [ocfs2]
May 29 18:47:14 mail2 kernel: [ 65.610709] RSP: 0018:ffff88003cdbbe48 EFLAGS: 00010246
May 29 18:47:14 mail2 kernel: [ 65.610900] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88003d4e2c00
May 29 18:47:14 mail2 kernel: [ 65.611137] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
May 29 18:47:14 mail2 kernel: [ 65.611542] RBP: ffff88003cdbbec8 R08: 4000000000000000 R09: ffff88003d4e2c00
May 29 18:47:14 mail2 kernel: [ 65.611944] R10: ffff880036f38030 R11: 0000000000000001 R12: ffff88003b8b9000
May 29 18:47:14 mail2 kernel: [ 65.612348] R13: ffff880029a6aff8 R14: ffff880029a6b098 R15: 0000000000000184
May 29 18:47:14 mail2 kernel: [ 65.612761] FS: 00007fc91036a700(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
May 29 18:47:14 mail2 kernel: [ 65.613349] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 29 18:47:14 mail2 kernel: [ 65.613717] CR2: 0000000000000038 CR3: 000000003c234000 CR4: 00000000000006f0
May 29 18:47:14 mail2 kernel: [ 65.614149] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 29 18:47:14 mail2 kernel: [ 65.614582] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 29 18:47:14 mail2 kernel: [ 65.614982] Process deliver (pid: 3805, threadinfo ffff88003cdba000, task ffff8800367144d0)
May 29 18:47:14 mail2 kernel: [ 65.615578] Stack:
May 29 18:47:14 mail2 kernel: [ 65.615853] 000000000000fc03 0000000000969da9 ffff880036f38030 0000000000000000
May 29 18:47:14 mail2 kernel: [ 65.616687] 0000000000000184 4030582a00000001 ffff880029a6af18 ffff88003cdbbed8
May 29 18:47:14 mail2 kernel: [ 65.617514] 0000000000000000 ffff880029a0f3a8 0000000030ad64a1 ffff88003c2c4b00
May 29 18:47:14 mail2 kernel: [ 65.625257] Call Trace:
May 29 18:47:14 mail2 kernel: [ 65.625595] [<ffffffffa02e1849>] ocfs2_fallocate+0x79/0x80 [ocfs2]
May 29 18:47:14 mail2 kernel: [ 65.626005] [<ffffffff81176712>] do_fallocate+0xf2/0x160
May 29 18:47:14 mail2 kernel: [ 65.626368] [<ffffffff811767cb>] sys_fallocate+0x4b/0x70
May 29 18:47:14 mail2 kernel: [ 65.626744] [<ffffffff81664d82>] system_call_fastpath+0x16/0x1b
May 29 18:47:14 mail2 kernel: [ 65.627116] Code: 55 68 4c 89 ee 49 89 55 78 48 8b 55 c8 49 89 45 60 49 89 45 70 4c 89 d7 e8 f4 4a 00 00 85 c0 89 c3 4c 8b 55 90 78 22 48 8b 7d 98 <f7> 47 38 00 10 10 00 74 05 41 80 4a 14 01 4c 89 d6 4c 89 e7 e8
May 29 18:47:14 mail2 kernel: [ 65.636086] RIP [<ffffffffa02e169a>] __ocfs2_change_file_space+0x5da/0x710 [ocfs2]
May 29 18:47:14 mail2 kernel: [ 65.636748] RSP <ffff88003cdbbe48>
May 29 18:47:14 mail2 kernel: [ 65.637059] CR2: 0000000000000038
May 29 18:47:14 mail2 kernel: [ 65.637433] ---[ end trace cdb37187fc6b43af ]---
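
One way to read the oops above: CR2 (the faulting address) is 0x38 and RDI is 0, which is what a read of a structure member at offset 0x38 through a NULL pointer would produce. The bytes at the marked position in the Code: line, f7 47 38 00 10 10 00, decode on x86-64 as a 32-bit test of the value at 0x38(%rdi) against an immediate. The sketch below only checks that arithmetic. The struct mock_file type and its padding are hypothetical and do not reproduce the real layout of the kernel's struct file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: only the 0x38 offset is taken from the oops. */
struct mock_file {
    unsigned char pad[0x38];
    unsigned int f_flags;
};

/* Bytes from the "Code:" line, starting at the marked <f7>. */
static const unsigned char faulting_insn[7] = {
    0xf7, 0x47, 0x38, 0x00, 0x10, 0x10, 0x00
};

/* Address touched when reading a member at offset from base. */
static uintptr_t member_address(uintptr_t base, size_t offset)
{
    return base + offset;
}

/* Decode a little-endian 32-bit immediate from four bytes. */
static uint32_t le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

With a NULL base, member_address(0, offsetof(struct mock_file, f_flags)) gives 0x38, matching CR2, and le32(faulting_insn + 3) gives 0x101000. That immediate equals the value of O_SYNC on x86 Linux, which suggests the faulting statement was a flags test on the file pointer, although the report itself does not say so.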

And this is probably the upstream reference:

fallocate() was oopsing on ocfs2 because we were passing in a
NULL file pointer.

Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>
---
 fs/ocfs2/file.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 061591a..8f30e74 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2012,7 +2012,7 @@ static long ocfs2_fallocate(struct file *file, int mode, loff_t offset,
  sr.l_start = (s64)offset;
  sr.l_len = (s64)len;

- return __ocfs2_change_file_space(NULL, inode, offset, cmd, &sr,
+ return __ocfs2_change_file_space(file, inode, offset, cmd, &sr,
       change_size);
 }

--
1.7.7.6

Source: http://oss.oracle.com/pipermail/ocfs2-devel/2012-January/008464.html
---
AlsaDevices:
 total 0
 crw-rw---T 1 root audio 116, 1 Mai 29 18:50 seq
 crw-rw---T 1 root audio 116, 33 Mai 29 18:50 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.0.1-0ubuntu6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=b3794f2e-be61-47f9-9de4-6c30dfcb534f
InstallationMedia: Ubuntu-Server 12.04 LTS "Precise Pangolin" - Alpha amd64 (20120325)
Lsusb:
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: VMware, Inc. VMware Virtual Platform
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-24-generic root=/dev/mapper/base-root ro recovery nomodeset
ProcVersionSignature: Ubuntu 3.2.0-24.38-generic 3.2.16
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-24-generic N/A
 linux-backports-modules-3.2.0-24-generic N/A
 linux-firmware 1.79
RfKill: Error: [Errno 2] No such file or directory
Tags: precise
Uname: Linux 3.2.0-24-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

dmi.bios.date: 04/15/2011
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: 6.00
dmi.board.name: 440BX Desktop Reference Platform
dmi.board.vendor: Intel Corporation
dmi.board.version: None
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 1
dmi.chassis.vendor: No Enclosure
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd04/15/2011:svnVMware,Inc.:pnVMwareVirtualPlatform:pvrNone:rvnIntelCorporation:rn440BXDesktopReferencePlatform:rvrNone:cvnNoEnclosure:ct1:cvrN/A:
dmi.product.name: VMware Virtual Platform
dmi.product.version: None
dmi.sys.vendor: VMware, Inc.

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1006012

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: precise
Jörg Lübbert (j-luebbert) wrote : AcpiTables.txt

apport information

tags: added: apport-collected
description: updated
Jörg Lübbert (j-luebbert) wrote : BootDmesg.txt

apport information

Jörg Lübbert (j-luebbert) wrote : CurrentDmesg.txt

apport information

Jörg Lübbert (j-luebbert) wrote : IwConfig.txt

apport information

Jörg Lübbert (j-luebbert) wrote : Lspci.txt

apport information

Jörg Lübbert (j-luebbert) wrote : ProcCpuinfo.txt

apport information

Jörg Lübbert (j-luebbert) wrote : ProcInterrupts.txt

apport information

Jörg Lübbert (j-luebbert) wrote : ProcModules.txt

apport information

Jörg Lübbert (j-luebbert) wrote : UdevDb.txt

apport information

Jörg Lübbert (j-luebbert) wrote : UdevLog.txt

apport information

Jörg Lübbert (j-luebbert) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.4 kernel [1] (not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking the yellow pencil icon next to the tag at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-quantal/

Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Incomplete
tags: added: needs-upstream-testing
tags: added: kernel-bug-exists-upstream
removed: needs-upstream-testing
Jörg Lübbert (j-luebbert) wrote :

Bug still exists in 3.4.0-030400-generic amd64 2012-05-21

Jörg Lübbert (j-luebbert) wrote :

Tested with a custom kernel carrying the modification from http://oss.oracle.com/pipermail/ocfs2-devel/2012-January/008464.html. The crash no longer happens. Side effects are unknown because of differences in the source code.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Luis Henriques (henrix) wrote :

It looks like the patch you refer to in your report hasn't been accepted upstream, probably because it seems to be incorrect (although I'm not familiar with the ocfs2 code).

I've tried to fix the problem and built a test kernel that I would like to ask you to try and see if it solves the problem. You can download the test kernel here:

http://people.canonical.com/~henrix/lp1006012-ocfs2-fix/

This URL also contains the patch I've used in this kernel.

If this kernel works for you, I'll try to push it into mainline.

(Note: I guess you did this before trying the kernel you referred to in comment #15, but I would recommend backing up your ocfs2 filesystem before mounting it with the test kernel.)

Luis Henriques (henrix) wrote :

Jörg, did you have a chance to test the kernel in comment #16? Are you still able to reproduce this issue?

Jörg Lübbert (j-luebbert) wrote :

Unfortunately, I haven't had the chance to try your patch yet. With the Oracle-provided patch, I was able to extract the tarball, and the system has been running stably ever since. It will take me some hours to replicate the earlier scenario (creating a similar tarball, a similar system, etc.). I hope to find time on one of the coming weekends.

Luis Henriques (henrix) wrote :

I've managed to reproduce the issue, and the patched kernel fixes it.

The patch has already been submitted (and accepted) upstream:

https://lkml.org/lkml/2012/6/20/686

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Luis Henriques (henrix)
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.5.0-3.3

---------------
linux (3.5.0-3.3) quantal-proposed; urgency=low

  [ Andy Whitcroft ]

  * [Config] enable CONFIG_MEMTEST=y
    - LP: #1004535
  * [Config] config-check: add support for a cut operation
  * [Config] enforcer -- switch to cut where appropriate

  [ Leann Ogasawara ]

  * Rebase to v3.5-rc5
  * [Config] Updateconfigs after rebase to v3.5-rc5

  [ Luis Henriques ]

  * SAUCE: ocfs2: Fix NULL pointer dereferrence in
    __ocfs2_change_file_space
    - LP: #1006012

  [ Seth Forshee ]

  * SAUCE: (drop after 3.5) drm/i915: ignore pipe select bit when checking
    for LVDS register initialization
    - LP: #1012800

  [ Upstream Kernel Changes ]

  * rebase to v3.5-rc5
    - LP: #1013183
    - LP: #1017017
    - LP: #884652
 -- Leann Ogasawara <email address hidden> Mon, 02 Jul 2012 06:41:58 -0700

Changed in linux (Ubuntu):
status: Triaged → Fix Released
Luis Henriques (henrix) wrote :

Running the test case defined above no longer triggers the NULL pointer dereference with kernel 3.2.0-27.43. Tagging as verified for Precise.

tags: added: verification-done-precise
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.2.0-27.43

---------------
linux (3.2.0-27.43) precise-proposed; urgency=low

  [ Andy Whitcroft ]

  * No change upload to fix .ddeb generation in the PPA.

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1020016

linux (3.2.0-27.42) precise-proposed; urgency=low

  [Luis Henriques]

  * Release Tracking Bug
    - LP: #1020016

  [ Chris J Arges ]

  * PACKAGING: add .gnu_debuglink sections to .ko files
    - LP: #669641

  [ Ike Panhc ]

  * [Config] Updateconfigs
    - LP: #1008345

  [ Luis Henriques ]

  * SAUCE: (upstreamed) [media] ene_ir: Fix driver initialisation
    - LP: #1014800
  * SAUCE: ocfs2: Fix NULL pointer dereferrence in
    __ocfs2_change_file_space
    - LP: #1006012

  [ Rob Herring ]

  * SAUCE: net: calxedaxgmac: enable rx cut-thru mode
    - LP: #1008345
  * SAUCE: EDAC: Add support for the highbank platform memory
    - LP: #1008345
  * SAUCE: EDAC: add support for highbank platform L2 cache ecc
    - LP: #1008345

  [ Seth Forshee ]

  * (pre-stable): bcma: add ext PA workaround for BCM4331 and BCM43431
    - LP: #925577

  [ Takashi Iwai ]

  * SAUCE: ALSA: hda - Fix power-map regression for HP dv6 & co
    - LP: #1013183

  [ Tim Gardner ]

  * [Config] Enable CONFIG_CGROUPS for highbank
    - LP: #1014692

  [ Upstream Kernel Changes ]

  * Revert "net: maintain namespace isolation between vlan and real device"
    - LP: #1013723
  * x86/amd: Re-enable CPU topology extensions in case BIOS has disabled it
    - LP: #1009087
  * hwmon: (k10temp) Add support for AMD Trinity CPUs
    - LP: #1009086
  * hwmon: (fam15h_power) Increase output resolution
    - LP: #1009086
  * Input: wacom - use BTN_TOOL_FINGER to indicate touch device type
    - LP: #1009435
  * Input: wacom - use switch statement for wacom_tpc_irq()
    - LP: #1009435
  * Input: wacom - isolate input registration
    - LP: #1009435
  * Input: wacom - wireless monitor framework
    - LP: #1009435
  * Input: wacom - create inputs when wireless connect
    - LP: #1009435
  * Input: wacom - wireless battery status
    - LP: #1009435
  * Input: wacom - check for allocation failure in probe()
    - LP: #1009435
  * Input: wacom - add basic Intuos5 support
    - LP: #1009435
  * Input: wacom - add Intuos5 Touch Ring/ExpressKey support
    - LP: #1009435
  * Input: wacom - add Intuos5 Touch Ring LED support
    - LP: #1009435
  * Input: wacom - add Intuos5 multitouch sensor support
    - LP: #1009435
  * iommu/amd: Add workaround for event log erratum
    - LP: #1013723
  * MIPS: BCM63XX: Add missing include for bcm63xx_gpio.h
    - LP: #1013723
  * cifs: Include backup intent search flags during searches {try #2)
    - LP: #1013723
  * sunrpc: fix loss of task->tk_status after rpc_delay call in
    xprt_alloc_slot
    - LP: #1013723
  * exofs: Fix CRASH on very early IO errors.
    - LP: #1013723
  * cifs: fix oops while traversing open file list (try #4)
    - LP: #1013723
  * Fix dm-multipath starvation when scsi host is busy
    - LP: #1013723
  * ixp4xx: fix compilation by adding gpiolib support
    - LP: #1013723
  * drm/i915: properly handle interlaced bit for sdvo dtd conversion
    - LP: #1013723
...

Changed in linux (Ubuntu Precise):
status: New → Fix Released
Adam Conrad (adconrad) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
