XFS: memory allocation deadlock in kmem_alloc (mode:0x8250)
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| linux (Ubuntu) | | Undecided | Unassigned | |
| Trusty | | Undecided | Rafael David Tinoco | |
| Utopic | | Undecided | Unassigned | |
Bug Description
=== SRU Justification ===
Impact: XFS can hang when no contiguous memory pages are available to satisfy an allocation.
Fix: upstream patch (b3f03bac813220
Testcase:
- confirm via buddyinfo that there is a lack of contiguous blocks to be allocated (fragmented memory)
- create 1 million files in a single directory (see the attached script for an example)
- observe the message: XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
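The first testcase step can be checked with a short script. This is a minimal sketch, not part of the original report: it assumes the usual /proc/buddyinfo layout (one line per node and zone, followed by free-block counts for order 0 upward), and the function names and the sample text are hypothetical.

```python
import re


def parse_buddyinfo(text):
    """Map (node, zone) -> list of free-block counts, where index = order."""
    zones = {}
    for line in text.splitlines():
        m = re.match(r"Node\s+(\d+),\s+zone\s+(\S+)\s+(.*)", line)
        if m:
            counts = [int(n) for n in m.group(3).split()]
            zones[(int(m.group(1)), m.group(2))] = counts
    return zones


def free_blocks_at_or_above(counts, order):
    """Number of free blocks able to satisfy an allocation of the given order."""
    return sum(counts[order:])


# Made-up sample for illustration; NOT taken from the affected machine.
SAMPLE = """\
Node 0, zone      DMA      1      1      0      0      2      1      1      0      1      1      3
Node 0, zone   Normal   9120   4310     87      3      0      0      0      0      0      0      0
"""

if __name__ == "__main__":
    try:
        with open("/proc/buddyinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on systems without /proc
    for (node, zone), counts in parse_buddyinfo(text).items():
        free = free_blocks_at_or_above(counts, 4)
        print(f"node {node} zone {zone}: free blocks of order >= 4: {free}")
```

A zone that reports zero free blocks at order 4 and above, as the "Normal" zone does in the sample, cannot immediately satisfy a 64k contiguous request with 4 KiB pages.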
=== Original Description ===
The following situation was brought to my attention:
http://
The Precise kernel does not have the XFS fix for the kmem_alloc deadlock, and users are hitting this problem.
Output example:
"""
INFO: task ceph-osd:17047 blocked for more than 120 seconds.
[153972.073476] "echo 0 > /proc/sys/
[153972.076322] ceph-osd D ffff880869a28a60 0 17047 5423 0x00000000
[153972.076324] ffff880869a28750 0000000000000002 ffff880867788ee8 ffff8807e4e47500
[153972.079259] ffff880517addfd8 ffff880517addfd8 ffff880517addfd8 ffff880869a28750
[153972.082244] 0000000000000004 ffff880517addd48 ffff880517addd50 7fffffffffffffff
[153972.085278] Call Trace:
[153972.088310] [<ffffffff81410
[153972.091371] [<ffffffff8100a
[153972.094386] [<ffffffff81070
[153972.097358] [<ffffffff81412
[153972.100322] [<ffffffff81070
[153972.103292] [<ffffffff811a3
[153972.106212] [<ffffffff811a3
[153972.109085] [<ffffffff8134e
[153972.111954] [<ffffffff813eb
[153972.114839] [<ffffffff81053
[153972.117743] [<ffffffff81002
[153972.120632] [<ffffffff81052
[153972.123521] [<ffffffff81002
[153972.126400] [<ffffffff81414
[153972.299643] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[153972.868782] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[153973.038189] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[153974.309978] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
"""
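The mode value in these messages is a GFP flag mask. The sketch below decodes it under the assumption that the bit values match include/linux/gfp.h in 3.x-era kernels; the values changed in later kernels, so treat the table as an assumption to verify against the running kernel's source. The function name is hypothetical.

```python
# Assumed bit values, as in include/linux/gfp.h of 3.x-era kernels.
# Only a subset is listed; other kernels may use different values.
GFP_BITS = {
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x200: "__GFP_NOWARN",
    0x8000: "__GFP_ZERO",
}


def decode_gfp(mode):
    """Return the names of the known flags set in mode, lowest bit first."""
    names = [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]
    leftover = mode & ~sum(GFP_BITS)  # bits not covered by the table above
    if leftover:
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    for mode in (0x250, 0x8250):
        print(hex(mode), " | ".join(decode_gfp(mode)))
```

Under that assumption, 0x250 is an allocation that may sleep and do I/O but must not re-enter the filesystem (no __GFP_FS), with failure warnings suppressed; 0x8250 is the same request with zeroed memory.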
The fix, already included in Utopic, is upstream commit b3f03bac8132207:
Author: Dave Chinner <email address hidden>
Date: Tue Dec 3 23:50:57 2013 +1100
xfs: xfs_dir2_
If we are using a large directory block size, and memory becomes
fragmented, we can get memory allocation failures trying to
kmem_alloc(64k) for a temporary buffer. However, there is no need
for a directory buffer sized allocation, as the end result ends up
in the inode literal area. This is, at most, slightly less than 2k
of space, and hence we don't need an allocation larger than that
for a temporary buffer.
Signed-off-by: Dave Chinner <email address hidden>
Reviewed-by: Ben Myers <email address hidden>
Signed-off-by: Ben Myers <email address hidden>
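To see why shrinking the buffer helps, compare how many physically contiguous pages each request needs. This is a rough illustration assuming 4 KiB pages; it ignores slab allocator details, and the helper name is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages (e.g. x86_64)


def alloc_order(size, page_size=PAGE_SIZE):
    """Smallest order n such that 2**n contiguous pages hold size bytes."""
    pages = -(-size // page_size)      # ceiling division
    return (pages - 1).bit_length()    # round up to a power of two


if __name__ == "__main__":
    for size in (64 * 1024, 2 * 1024):
        order = alloc_order(size)
        print(f"{size} bytes -> order {order} ({2 ** order} contiguous pages)")
```

A 64k buffer needs an order-4 block (16 contiguous pages), which fragmented memory may be unable to provide, while a buffer of under 2k fits within a single page.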
| Changed in linux (Ubuntu): | |
| status: | New → Confirmed |
| assignee: | nobody → Rafael David Tinoco (inaddy) |
| Rafael David Tinoco (inaddy) wrote : | #1 |
| Changed in linux (Ubuntu Trusty): | |
| assignee: | nobody → Rafael David Tinoco (inaddy) |
| status: | New → Confirmed |
| Changed in linux (Ubuntu Utopic): | |
| assignee: | Rafael David Tinoco (inaddy) → Chris J Arges (arges) |
| assignee: | Chris J Arges (arges) → nobody |
| status: | Confirmed → Fix Released |
| Rafael David Tinoco (inaddy) wrote : | #3 |
Attached is a script showing how to make kmem_alloc fail (it uses /var/lib/ceph as the filesystem, but this can be changed).
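The attached script is not reproduced on this page. As a rough stand-in, the sketch below creates a large number of empty files in one directory; it is not the original script, the function name and file naming scheme are hypothetical, and it only exercises the bug when the target directory sits on an XFS filesystem with fragmented memory.

```python
import os
import tempfile


def create_files(directory, count):
    """Create count empty files in directory and return the number created."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        # Opening and closing the file is enough to add a directory entry.
        with open(os.path.join(directory, f"file-{i:07d}"), "w"):
            pass
    return count


if __name__ == "__main__":
    # Demo run against a throwaway directory with a small count.
    # For the real testcase, point this at a directory on the XFS
    # filesystem under test and raise the count to 1_000_000.
    with tempfile.TemporaryDirectory() as tmp:
        create_files(tmp, 1000)
        print(len(os.listdir(tmp)), "files created in", tmp)
```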
| description: | updated |
| Rafael David Tinoco (inaddy) wrote : | #4 |
This patch was tested and seems to fix the issue. Waiting for the SRU (already sent to the kernel-team mailing list).
Thank you!
Rafael Tinoco
| tags: | added: cfs |
| tags: | added: cts removed: cfs |
| Changed in linux (Ubuntu Trusty): | |
| status: | Confirmed → Fix Committed |
| Brad Figg (brad-figg) wrote : | #5 |
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https:/
| tags: | added: verification-needed-trusty |
| Launchpad Janitor (janitor) wrote : | #6 |
This bug was fixed in the package linux - 3.13.0-40.69
---------------
linux (3.13.0-40.69) trusty; urgency=low
[ Luis Henriques ]
* Release Tracking Bug
- re-used previous tracking bug
[ Upstream Kernel Changes ]
* regmap: fix kernel hang on regmap_bulk_write with zero val_count.
linux (3.13.0-40.68) trusty; urgency=low
[ Brad Figg ]
* Release Tracking Bug
- LP: #1388943
* SAUCE: DEP8 test to run our regression tests
- LP: #1385330
* SAUCE: The very first thing we should do when testing is make sure we
are testing the correct kernel
- LP: #1385330
[ dann frazier ]
* [Config] Disable CONFIG_
- LP: #1388952
[ Duc Dang ]
* SAUCE: (no-up) [PCIE] APM X-Gene: Remove debug messages in MSI
interrupt handler path.
- LP: #1382244
* SAUCE: (no-up) PCI: X-Gene: Fix max payload size and phantom function
configuration
- LP: #1386261
[ McAulay, Alistair ]
* SAUCE: drm/i915: Rework GPU reset sequence to match driver load & thaw
- LP: #1384469
[ Timo Aaltonen ]
* SAUCE: i915_bdw: Fix cherry-pick typo
- LP: #1384469
[ Upstream Kernel Changes ]
* Revert "mac80211: disable uAPSD if all ACs are under ACM"
- LP: #1381234
* Revert "iwlwifi: dvm: don't enable CTS to self"
- LP: #1381234
* Revert "lzo: properly check for overruns"
- LP: #1387886
* drm/i915: provide interface for audio driver to query cdclk
- LP: #1381168
* regulatory: add NUL to alpha2
- LP: #1381234
* percpu: fix pcpu_alloc_pages() failure path
- LP: #1381234
* percpu: perform tlb flush after pcpu_map_pages() failure
- LP: #1381234
* cgroup: reject cgroup names with '\n'
- LP: #1381234
* vfs: add d_is_dir()
- LP: #1381234
* CIFS: Fix directory rename error
- LP: #1381234
* usb: phy: twl4030-usb: Fix lost interrupts after ID pin goes down
- LP: #1381234
* rtlwifi: rtl8192cu: Add new ID
- LP: #1381234
* CIFS: Fix wrong restart readdir for SMB1
- LP: #1381234
* CIFS: Fix wrong filename length for SMB2
- LP: #1381234
* ahci: Add Device IDs for Intel 9 Series PCH
- LP: #1381234
* ata_piix: Add Device IDs for Intel 9 Series PCH
- LP: #1381234
* USB: zte_ev: fix removed PIDs
- LP: #1381234
* USB: ftdi_sio: add support for NOVITUS Bono E thermal printer
- LP: #1381234
* USB: sierra: avoid CDC class functions on "68A3" devices
- LP: #1381234
* USB: sierra: add 1199:68AA device ID
- LP: #1381234
* iommu/arm-smmu: fix programming of SMMU_CBn_TCR for stage 1
- LP: #1381234
* iommu/arm-smmu: remove pgtable_
- LP: #1381234
* usb: gadget: fusb300_udc.h: Fix typo in include guard
- LP: #1381234
* usb: phy: tegra: Avoid use of sizeof(void)
- LP: #1381234
* arm64: use irq_set_affinity with force=false when migrating irqs
- LP: #1381234
* block: Fix dev_t minor allocation lifetime
- LP: #1381234
* usb: dwc3: core: fix order of PM runtime calls
- LP: #1381234
* usb: dwc3: core: fix ordering for PHY suspend
- LP: #1381234
* usb: dwc3: omap: fix ordering for runtime pm calls
- LP: #1381234
* ...
| Changed in linux (Ubuntu Trusty): | |
| status: | Fix Committed → Fix Released |


I've made available a temporary hotfix for this issue:
https://launchpad.net/~inaddy/+archive/ubuntu/lp1382333
(available a few hours from now)
Anyone who suffers from this issue is welcome to give me feedback on the hotfix.
Thank you very much
Rafael Tinoco