[patch] fix improper merge with overlayfs.

Bug #1122094 reported by Roman Valov on 2013-02-11
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Medium
Andy Whitcroft

Bug Description

It seems that during merge of this commit:
http://git.kernel.org/?p=linux/kernel/git/mszeredi/vfs.git;a=commitdiff;h=305101a16a69df2df4edc6204fe699c88aaf66c8
into quantal kernel:
http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-quantal.git;a=commitdiff;h=24c353a2ee733ace8bd74f35fb48d792e4b22ff7
there was a poor conflict resolution.

The point is that in vanilla 3.5 kernel the call to '__dentry_open' inside 'nameidata_to_filp' was expanded into separate call to 'do_dentry_open' and result post-processing. In the merged patch 'vfs_open' substitutes 'do_dentry_open' (instead of '__dentry_open') call resulting into post-processing done twice (in '__dentry_open' and 'nameidata_to_filp' itself). It leads to situation where filp is freed (after 'put_filp' in '__dentry_open') but further code assumes it's still alive even if error occurs.

Under heavy loads my system resulting effect is race condition where assumed to be alive filp has been already put and alloc'ed by other thread. Finally that leads to various kernel panics.

Attached patch expand 'vfs_open' call inside 'nameidata_to_filp' to fit behavior in vanilla kernel if 'i_op->open' is not provided.

Roman Valov (reddot) wrote :

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1122094

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

affects: linux-meta (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
status: New → Incomplete
Roman Valov (reddot) wrote :

Is it possible to skip apport-collect because of diagnosing and resolving already done and solution has been provided?

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: patch
Joseph Salisbury (jsalisbury) wrote :

Can you provide some information on the status of the patch with regards to getting it merged upstream? Has it been sent upstream, what sort of feedback has it received, is it getting applied to a subsystem maintainer's tree, etc?

People affected by this bug are probably wondering why the kernel team doesn't just apply the patch and fix it. The reason is that the kernel team is reluctant (not opposed) to apply any patch to a stable kernel that is not from upstream. Applying patches that don't come from upstream add greatly to the support of the kernel as other upstream patches may touch the same area as the non-upstream patch and may prevent them from applying cleanly.

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key quantal
Roman Valov (reddot) wrote :

Joseph, please refer to the commit link on ubuntu kernel's git at the bug description. That's the point where the problem was introduced. Thus neither upstream vanilla 3.5 kernel, nor mszeredi's overlayfs kernel doesn't have reported problem. The merge conflict pointed at bug description was introduced due to mszeredi's overlayfs v13 was based on kernel 3.4. And both, mszeredi's overlayfs v13 patch and 3.4 to 3.5 patch modify 'nameidata_to_filp' behavior. Next version of overlayfs v14 is based on kernel 3.6, where open.c was greatly modified and there is no 'nameidata_to_filp' function.

Your arguments about kernel patching are reasonable. So I will provide links to various variants of open.c and ask you to investigate how 'nameidata_to_filp' is working at different kernels:

----

vanilla v3.4: http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=blob;f=fs/open.c;h=5720854156dbd61e28598da83a529294401eb119;hb=76e10d158efb6d4516018846f60c2ab5501900bc

call sequence: nameidata_to_filp -> __dentry_open -> all dirty stuff here

----

overlayfs v13: http://git.kernel.org/?p=linux/kernel/git/mszeredi/vfs.git;a=blob;f=fs/open.c;h=3e132ba8576030460673c1b75f8b4fdbee4bb930;hb=refs/heads/overlayfs.v13

call sequence: nameidata_to_filp -> vfs_open -> __dentry_open -> all dirty stuff here

----

vanilla v3.5: http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=blob;f=fs/open.c;h=1540632d8387fe98a51d0193201346acb18ae70e;hb=28a33cbc24e4256c143dce96c7d93bf423229f92

call sequence: nameidata_to_filp [post-proc] -> do_dentry_open -> most of dirty stuff here

----

ubuntu quantal: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-quantal.git;a=blob;f=fs/open.c;h=7be104a4e0aecbd3ecd7df20a2d3f7825a795af7;hb=HEAD

call sequence: nameidata_to_filp [post-proc] -> vfs_open -> __dentry_open [post-proc] -> do_dentry_open -> most of dirty stuff here

----

Well, in v3.4 and overlayfs v13 "all dirty stuff" was done at the '__dentry_open'. In v3.5 that "all dirty stuff" was split into "most of dirty stuff" in the 'do_dentry_open' and other part of "all dirty stuff" (post-processing) was left at the '__dentry_open'. Besides this post-processing stuff was duplicated at 'nameidata_to_filp' and 'nameidata_to_filp' calls only 'do_dentry_open', not '__dentry_open'.

So at vanilla v3.5 'nameidata_to_filp' now calls only 'do_dentry_open' and performs post-processing by itself. In quantal's kernel (merged from v3.5 and overlayfs v13) post-processing done at 'nameidata_to_filp' like in vanilla kernel. But also post-processing done at '__dentry_open' called by 'vfs_open' ported from overlayfs v13.

----

I hope this information and investigating on provided points will be enough to accept the patch.

Andy Whitcroft (apw) on 2013-02-13
Changed in linux (Ubuntu):
assignee: nobody → Andy Whitcroft (apw)
status: Confirmed → In Progress
Andy Whitcroft (apw) wrote :

It does look suspect at first looking. I am somewhat supprised the OS works at all given this appears to affect the non-overlayfs path. Will review and test.

Andy Whitcroft (apw) wrote :

@reddot -- I have gone back to the merge and it does seem wrong indeed. I have independantly merged it in light of your report and I believe my patch matches yours. I have applied this to the quantal kernel source and produced test kernels as below:

    http://people.canonical.com/~apw/lp1122094-quantal/

Could you test those in your stress environment and confirm if this kernel also works for you. If you could report any testing back here that would be great. If this passes testing for you, we can talk about pushing it to the main kernel repo.

Roman Valov (reddot) wrote :

Andy, at the moment I've run both i386 and amd64 kernels in my environment for about 40 hours with no regressions. However I'm going to left them run for weekend. So, let's wait till Monday.

Roman Valov (reddot) wrote :

Well, provided kernels has been robustly working till the moment. With respect to the fact that unpatched kernels were producing panic in average 30 minutes I assume these kernels as stable.

Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed' to 'verification-done'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-quantal
Roman Valov (reddot) on 2013-03-05
tags: added: verification-done-quantal
removed: verification-needed-quantal

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regresssions.

Launchpad Janitor (janitor) wrote :
Download full text (13.8 KiB)

This bug was fixed in the package linux - 3.5.0-26.42

---------------
linux (3.5.0-26.42) quantal-proposed; urgency=low

  [Steve Conklin]

  * Release Tracking Bug
    - LP: #1152715

  [ Andy Whitcroft ]

  * ubuntu: overlayfs -- fix missmerge of vfs_open changes
    - LP: #1122094, #1147678

linux (3.5.0-26.40) quantal-proposed; urgency=low

  [Brad Figg]

  * Release Tracking Bug
    - LP: #1133429

  [ Andy Whitcroft ]

  * ubuntu: overlayfs -- fix missmerge of vfs_open changes
    - LP: #1122094

  [ Ian Campbell ]

  * SAUCE: xen/netback: shutdown the ring if it contains garbage.
    - LP: #1117325
    - CVE-2013-0216
  * SAUCE: netback: correct netbk_tx_err to handle wrap around.
    - LP: #1117325
    - CVE-2013-0216
  * SAUCE: xen/netback: don't leak pages on failure in
    xen_netbk_tx_check_gop.
    - LP: #1117331
    - CVE-2013-0217
  * SAUCE: xen/netback: free already allocated memory on failure in
    xen_netbk_get_requests
    - LP: #1117331
    - CVE-2013-0217

  [ Jan Beulich ]

  * SAUCE: xen-pciback: rate limit error messages from
    xen_pcibk_enable_msi{, x}()
    - LP: #1117336
    - CVE-2013-0231

  [ Tim Gardner ]

  * [Config] CONFIG_SATA_AHCI=m
    - LP: #1056563
  * SAUCE: rt2x00: rt2x00pci_regbusy_read() - only print register access
    failure once
    - LP: #1128840

  [ Upstream Kernel Changes ]

  * Revert "USB: Handle warm reset failure on empty port."
    - LP: #1131944
  * xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS
    guests.
    - LP: #1102374
    - CVE-2013-0190
  * virtio-blk: Don't free ida when disk is in use
    - LP: #1119885
  * ioat: Fix DMA memory sync direction correct flag
    - LP: #1119885
  * PCI: pciehp: Use per-slot workqueues to avoid deadlock
    - LP: #1119885
  * PCI/AER: pci_get_domain_bus_and_slot() call missing required
    pci_dev_put()
    - LP: #1119885
  * xen/grant-table: correctly initialize grant table version 1
    - LP: #1119885
  * serial:ifx6x60:Delete SPI timer when shut down port
    - LP: #1119885
  * tty: 8250_dw: Fix inverted arguments to serial_out in IRQ handler
    - LP: #1119885
  * drm/i915: Invalidate the relocation presumed_offsets along the slow
    path
    - LP: #1119885
  * ARM: 7627/1: Predicate preempt logic on PREEMP_COUNT not PREEMPT alone
    - LP: #1119885
  * staging: vt6656: Fix inconsistent structure packing
    - LP: #1119885
  * 8250/16?50: Add support for Broadcom TruManage redirected serial port
    - LP: #1119885
  * KVM: PPC: Emulate dcbf
    - LP: #1119885
  * staging: wlan-ng: Fix clamping of returned SSID length
    - LP: #1119885
  * USB: option: blacklist network interface on ONDA MT8205 4G LTE
    - LP: #1119885
  * USB: option: add TP-LINK HSUPA Modem MA180
    - LP: #1119885
  * ALSA: hda - Fix mute led for another HP machine
    - LP: #1096789, #1119885
  * usb: dwc3: gadget: fix ep->maxburst for ep0
    - LP: #1119885
  * ACPI / cpuidle: Fix NULL pointer issues when cpuidle is disabled
    - LP: #1119885
  * ACPI / processor: Get power info before updating the C-states
    - LP: #1119885
  * ARM: DMA: Fix struct page iterator in dma_cache_maint() to work with
    sparsemem
    - LP: #1119885
  * evm: checki...

Changed in linux (Ubuntu):
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers