XFS: memory allocation deadlock in kmem_alloc (mode:0x8250)

Bug #1382333 reported by Rafael David Tinoco on 2014-10-17
This bug affects 4 people
Affects            Importance   Assigned to
linux (Ubuntu)     Undecided    Unassigned
Trusty             Undecided    Rafael David Tinoco
Utopic             Undecided    Unassigned

Bug Description

=== SRU Justification ===

Impact: XFS can hang when no contiguous memory pages are available for allocation.
Fix: upstream patch (b3f03bac8132207a20286d5602eda64500c19724).
Testcase:
 - buddyinfo shows a lack of contiguous blocks available for allocation (fragmented memory)
 - create 1 million files in a single directory (see the attached script for an example)
 - observe the message: XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
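
The first testcase step can be checked with a small helper. This is a minimal sketch, not part of the SRU: it assumes the usual /proc/buddyinfo layout ("Node N, zone NAME" followed by free block counts for order 0 upward) and a 4 KiB page size, so that a 64 KiB request corresponds to order 4. The function names are illustrative.

```python
from pathlib import Path


def parse_buddyinfo(text):
    """Return {(node, zone): [free block count per order, starting at order 0]}."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected layout: "Node 0, zone Normal c0 c1 c2 ..."
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(c) for c in parts[4:]]
    return zones


def free_blocks_at_or_above(counts, order):
    """Count free blocks whose order is at least `order`."""
    return sum(counts[order:])


if __name__ == "__main__":
    path = Path("/proc/buddyinfo")
    if path.exists():
        for (node, zone), counts in parse_buddyinfo(path.read_text()).items():
            # Order 4 = 16 contiguous pages = 64 KiB, assuming 4 KiB pages.
            n = free_blocks_at_or_above(counts, 4)
            print(f"node {node} zone {zone}: {n} free blocks of order >= 4")
```

A zone that reports few or no free blocks of order 4 and above is fragmented in the sense this testcase needs.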

=== Original Description ===

The following situation was brought to my attention:

http://tracker.ceph.com/issues/6301

The Precise kernel does not have the XFS fix for the kmem_alloc deadlock, and users are hitting this problem.

Output example:

"""
INFO: task ceph-osd:17047 blocked for more than 120 seconds.
[153972.073476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[153972.076322] ceph-osd D ffff880869a28a60 0 17047 5423 0x00000000
[153972.076324] ffff880869a28750 0000000000000002 ffff880867788ee8 ffff8807e4e47500
[153972.079259] ffff880517addfd8 ffff880517addfd8 ffff880517addfd8 ffff880869a28750
[153972.082244] 0000000000000004 ffff880517addd48 ffff880517addd50 7fffffffffffffff
[153972.085278] Call Trace:
[153972.088310] [<ffffffff81410f4a>] ? schedule_timeout+0x1ca/0x270
[153972.091371] [<ffffffff8100abf1>] ? native_sched_clock+0x11/0x70
[153972.094386] [<ffffffff81070cda>] ? try_to_wake_up+0x1ea/0x270
[153972.097358] [<ffffffff81412623>] ? wait_for_completion+0xa3/0x120
[153972.100322] [<ffffffff81070d60>] ? try_to_wake_up+0x270/0x270
[153972.103292] [<ffffffff811a3702>] ? do_coredump+0x1b2/0xee0
[153972.106212] [<ffffffff811a3899>] ? do_coredump+0x349/0xee0
[153972.109085] [<ffffffff8134e0c4>] ? skb_queue_tail+0x24/0x60
[153972.111954] [<ffffffff813eb73a>] ? unix_dgram_sendmsg+0x5aa/0x640
[153972.114839] [<ffffffff81053049>] ? get_signal_to_deliver+0x199/0x5a0
[153972.117743] [<ffffffff81002353>] ? do_signal+0x63/0x8c0
[153972.120632] [<ffffffff81052030>] ? do_send_sig_info+0x60/0x90
[153972.123521] [<ffffffff81002c38>] ? do_notify_resume+0x88/0xa0
[153972.126400] [<ffffffff81414d6a>] ? int_signal+0x12/0x17
[153972.299643] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[153972.868782] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[153973.038189] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[153974.309978] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
"""
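
To see how often the warning shows up in a saved kernel log, something like the following may help. It is only a sketch: the regular expression is written against the message text quoted above, and the function name is illustrative.

```python
import re

# Matches lines such as:
#   XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
DEADLOCK_RE = re.compile(
    r"XFS: possible memory allocation deadlock in (\w+) \(mode:(0x[0-9a-fA-F]+)\)"
)


def find_deadlock_warnings(log_text):
    """Return one (function_name, mode) tuple per warning found in log_text."""
    return [
        (m.group(1), int(m.group(2), 16))
        for m in DEADLOCK_RE.finditer(log_text)
    ]
```

Passing in the text of a saved dmesg dump gives a list whose length is the number of warnings logged.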

Fix, already included in Utopic, is upstream commit: b3f03bac8132207a20286d5602eda64500c19724

Author: Dave Chinner <email address hidden>
Date: Tue Dec 3 23:50:57 2013 +1100

    xfs: xfs_dir2_block_to_sf temp buffer allocation fails

    If we are using a large directory block size, and memory becomes
    fragmented, we can get memory allocation failures trying to
    kmem_alloc(64k) for a temporary buffer. However, there is no need
    for a directory buffer sized allocation, as the end result ends up
    in the inode literal area. This is, at most, slightly less than 2k
    of space, and hence we don't need an allocation larger than that
    for a temporary buffer.

    Signed-off-by: Dave Chinner <email address hidden>
    Reviewed-by: Ben Myers <email address hidden>
    Signed-off-by: Ben Myers <email address hidden>
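
The size difference described in the commit message can be worked through as page arithmetic. This is a rough sketch assuming a 4 KiB page size (not stated in this report) and looking only at how many physically contiguous pages a request of a given size needs. The helper names are illustrative.

```python
def pages_needed(size_bytes, page_size=4096):
    """Number of pages required to hold size_bytes (ceiling division)."""
    return -(-size_bytes // page_size)


def allocation_order(size_bytes, page_size=4096):
    """Smallest order n such that 2**n contiguous pages hold size_bytes."""
    pages = pages_needed(size_bytes, page_size)
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


if __name__ == "__main__":
    for label, size in (("64 KiB", 64 * 1024), ("2 KiB", 2 * 1024)):
        print(f"{label}: {pages_needed(size)} page(s), order {allocation_order(size)}")
```

Under that assumption a 64 KiB buffer needs 16 contiguous pages (order 4), which fragmented memory may not be able to supply, while a buffer of at most 2 KiB fits within a single page.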

Changed in linux (Ubuntu):
status: New → Confirmed
assignee: nobody → Rafael David Tinoco (inaddy)

I've made available a temporary hotfix for this issue:

https://launchpad.net/~inaddy/+archive/ubuntu/lp1382333
(it should be available within a few hours)

Anyone who suffers from this issue is welcome to give me feedback on the hotfix.

Thank you very much

Rafael Tinoco

Chris J Arges (arges) on 2014-10-17
Changed in linux (Ubuntu Trusty):
assignee: nobody → Rafael David Tinoco (inaddy)
status: New → Confirmed
Changed in linux (Ubuntu Utopic):
assignee: Rafael David Tinoco (inaddy) → Chris J Arges (arges)
assignee: Chris J Arges (arges) → nobody
status: Confirmed → Fix Released

The attached script is an example of how to make kmem_alloc fail (it uses /var/lib/ceph as the filesystem, which can be changed).
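
For readers without access to the attachment, the file-creation step can be approximated as below. This is not the attached script, only a hypothetical sketch: the function name, file naming scheme, and command-line handling are assumptions. Whether it triggers the warning depends on memory fragmentation and, per the upstream commit message, on a large directory block size.

```python
import os
import sys


def create_empty_files(directory, count, prefix="file"):
    """Create `count` empty files in a single directory and return the count."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        # Mode "w" creates the file, or truncates it if it already exists.
        with open(os.path.join(directory, f"{prefix}{i:07d}"), "w"):
            pass
    return count


if __name__ == "__main__":
    # Usage (hypothetical): python3 make_files.py /path/on/xfs 1000000
    if len(sys.argv) == 3:
        create_empty_files(sys.argv[1], int(sys.argv[2]))
```

Run it against a directory on the XFS filesystem under test, not against a production data directory.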

description: updated

This patch was tested and seems to fix the issue. Waiting for the SRU (already sent to the kernel-team mailing list).

Thank you!

Rafael Tinoco

tags: added: cfs
tags: added: cts
removed: cfs
Andy Whitcroft (apw) on 2014-10-28
Changed in linux (Ubuntu Trusty):
status: Confirmed → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-40.69

---------------
linux (3.13.0-40.69) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - re-used previous tracking bug

  [ Upstream Kernel Changes ]

  * regmap: fix kernel hang on regmap_bulk_write with zero val_count.

linux (3.13.0-40.68) trusty; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1388943
  * SAUCE: DEP8 test to run our regression tests
    - LP: #1385330
  * SAUCE: The very first thing we should do when testing is make sure we
    are testing the correct kernel
    - LP: #1385330

  [ dann frazier ]

  * [Config] Disable CONFIG_IPMI_SI_PROBE_DEFAULTS on armhf and arm64
    - LP: #1388952

  [ Duc Dang ]

  * SAUCE: (no-up) [PCIE] APM X-Gene: Remove debug messages in MSI
    interrupt handler path.
    - LP: #1382244
  * SAUCE: (no-up) PCI: X-Gene: Fix max payload size and phantom function
    configuration
    - LP: #1386261

  [ McAulay, Alistair ]

  * SAUCE: drm/i915: Rework GPU reset sequence to match driver load & thaw
    - LP: #1384469

  [ Timo Aaltonen ]

  * SAUCE: i915_bdw: Fix cherry-pick typo
    - LP: #1384469

  [ Upstream Kernel Changes ]

  * Revert "mac80211: disable uAPSD if all ACs are under ACM"
    - LP: #1381234
  * Revert "iwlwifi: dvm: don't enable CTS to self"
    - LP: #1381234
  * Revert "lzo: properly check for overruns"
    - LP: #1387886
  * drm/i915: provide interface for audio driver to query cdclk
    - LP: #1381168
  * regulatory: add NUL to alpha2
    - LP: #1381234
  * percpu: fix pcpu_alloc_pages() failure path
    - LP: #1381234
  * percpu: perform tlb flush after pcpu_map_pages() failure
    - LP: #1381234
  * cgroup: reject cgroup names with '\n'
    - LP: #1381234
  * vfs: add d_is_dir()
    - LP: #1381234
  * CIFS: Fix directory rename error
    - LP: #1381234
  * usb: phy: twl4030-usb: Fix lost interrupts after ID pin goes down
    - LP: #1381234
  * rtlwifi: rtl8192cu: Add new ID
    - LP: #1381234
  * CIFS: Fix wrong restart readdir for SMB1
    - LP: #1381234
  * CIFS: Fix wrong filename length for SMB2
    - LP: #1381234
  * ahci: Add Device IDs for Intel 9 Series PCH
    - LP: #1381234
  * ata_piix: Add Device IDs for Intel 9 Series PCH
    - LP: #1381234
  * USB: zte_ev: fix removed PIDs
    - LP: #1381234
  * USB: ftdi_sio: add support for NOVITUS Bono E thermal printer
    - LP: #1381234
  * USB: sierra: avoid CDC class functions on "68A3" devices
    - LP: #1381234
  * USB: sierra: add 1199:68AA device ID
    - LP: #1381234
  * iommu/arm-smmu: fix programming of SMMU_CBn_TCR for stage 1
    - LP: #1381234
  * iommu/arm-smmu: remove pgtable_page_{c,d}tor()
    - LP: #1381234
  * usb: gadget: fusb300_udc.h: Fix typo in include guard
    - LP: #1381234
  * usb: phy: tegra: Avoid use of sizeof(void)
    - LP: #1381234
  * arm64: use irq_set_affinity with force=false when migrating irqs
    - LP: #1381234
  * block: Fix dev_t minor allocation lifetime
    - LP: #1381234
  * usb: dwc3: core: fix order of PM runtime calls
    - LP: #1381234
  * usb: dwc3: core: fix ordering for PHY suspend
    - LP: #1381234
  * usb: dwc3: omap: fix ordering for runtime pm calls
    - LP: #1381234
  * ...

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released