Fix OOM errors

Bug #1712598 reported by Paolo Pisati
This bug affects 1 person
Affects                  Status         Importance   Assigned to   Milestone
linux-raspi2 (Ubuntu)    Invalid        Undecided    Unassigned
  Xenial                 Fix Released   Undecided    Unassigned

Bug Description

SRU:
The original report for this bug is:

http://bugs.launchpad.net/bugs/1655842

However, since the raspi2 kernel has a well-isolated solution that neither affects nor applies to the other kernels mentioned there, I decided to open this bug and use it as the reference, instead of polluting the above LP bug with details and information that don't belong there.

Impact:
People are reporting OOM errors on Raspberry Pi 2/3, in particular when running
KDE or chromium, and the problem disappears when they go back to a kernel earlier
than 4.4.0-1044.51.
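
As a rough aid for triaging reports, the version boundary above can be checked mechanically. The sketch below is only an illustration: it assumes the regression starts at 4.4.0-1044.51 (as users report) and ends before 4.4.0-1072.80 (the upload whose changelog in this bug lists "Fix OOM errors"). The helper names are hypothetical, and the comparison is a plain numeric tuple comparison, not full Debian version ordering.

```python
import re

# Boundaries taken from this bug report; treating them as a closed/open
# interval is an assumption made for this sketch.
FIRST_REPORTED_BAD = "4.4.0-1044.51"
FIRST_WITH_FIX = "4.4.0-1072.80"


def version_key(version):
    """Split e.g. '4.4.0-1044.51' on '.' and '-' into a tuple of ints."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


def in_regression_window(version):
    """True if the version falls in the assumed affected range."""
    low = version_key(FIRST_REPORTED_BAD)
    high = version_key(FIRST_WITH_FIX)
    return low <= version_key(version) < high
```

This only works for purely numeric linux-raspi2 version strings like the ones in this report; anything with suffixes or epochs would need a proper Debian version comparison.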

This is actually fallout from a previous attempt to fix a memory corruption in the USB stack that was triggered during boot when the mmc was mounted:

BugLink: http://bugs.launchpad.net/bugs/1665280

While trying to 'fix' the above problem, some patches that deal with OOM and memory pressure situations were reverted, which led to this situation. By reverting those reverts we fix the OOM errors reported by users (see comments #90, #91, #92 and #93 in the original LP bug), while the memory corruption problem no longer shows up.

Fix:
Test a kernel with the following reverts reverted:

080aca8 Revert "mm: consider compaction feedback also for costly allocation"
486bab1 Revert "mm, oom, compaction: prevent from should_compact_retry looping for ever for costly orders"
7b84469 Revert "mm, oom: protect !costly allocations some more for !CONFIG_COMPACTION"
19724e4 Revert "mm, oom: prevent premature OOM killer invocation for high order request"
4b8b650 Revert "PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend"
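
To cross-check the list above against a changelog, one can derive the subject lines that reverting each of these commits would carry. The sketch below assumes git's default revert subject, `Revert "<original subject>"`; the function names are hypothetical and nothing here touches a real repository.

```python
import re

# The revert list from this report, copied verbatim.
REVERTS = """\
080aca8 Revert "mm: consider compaction feedback also for costly allocation"
486bab1 Revert "mm, oom, compaction: prevent from should_compact_retry looping for ever for costly orders"
7b84469 Revert "mm, oom: protect !costly allocations some more for !CONFIG_COMPACTION"
19724e4 Revert "mm, oom: prevent premature OOM killer invocation for high order request"
4b8b650 Revert "PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend"
"""

LINE_RE = re.compile(r'^(?P<sha>[0-9a-f]{7,40}) (?P<subject>Revert ".*")$')


def parse_reverts(text):
    """Return (abbreviated sha, subject) pairs, one per non-empty line."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised line: {line!r}")
        pairs.append((match.group("sha"), match.group("subject")))
    return pairs


def reapply_subject(subject):
    """Subject a default git revert of a commit titled `subject` would get."""
    return f'Revert "{subject}"'


def original_subject(subject):
    """Strip one level of Revert "..." to recover the patch being restored."""
    prefix, suffix = 'Revert "', '"'
    if not (subject.startswith(prefix) and subject.endswith(suffix)):
        raise ValueError(f"not a revert subject: {subject!r}")
    return subject[len(prefix):-len(suffix)]
```

For example, `[reapply_subject(s) for _, s in parse_reverts(REVERTS)]` yields five `Revert "Revert "...""` subjects, which is the form to look for in the package changelog when confirming that all five patches were restored.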

How to test:
People have reported out-of-memory errors while running KDE and/or chromium, so the best way to reproduce this problem is to install KDE, open ~20 konqueror windows, start chromium, and open ~10 tabs on different web sites. If the kernel doesn't oops, the fix is working.
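
After running the workload above, the kernel log can be scanned for OOM killer activity instead of relying on visual inspection. The following is a minimal sketch, assuming the log has been saved as text (for instance from `dmesg`) and that failures show up as the kernel's "invoked oom-killer" message; the function names are hypothetical, and task names containing spaces are not handled.

```python
import re

# The kernel logs "<task> invoked oom-killer: ... order=<n> ..." when the
# OOM killer runs; only the task name and allocation order are extracted.
OOM_RE = re.compile(
    r"(?P<task>\S+) invoked oom-killer: .*?\border=(?P<order>\d+)"
)


def find_oom_events(log_text):
    """Return a list of (task, order) for every OOM killer invocation."""
    events = []
    for line in log_text.splitlines():
        match = OOM_RE.search(line)
        if match:
            events.append((match.group("task"), int(match.group("order"))))
    return events


def looks_fixed(log_text):
    """True if the captured log shows no OOM killer invocation at all."""
    return not find_oom_events(log_text)
```

A clean result only says that no OOM kill was logged during that particular run; it is worth repeating the workload a few times before marking the fix as verified.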


Paolo Pisati (p-pisati) wrote :
tags: added: patch
Changed in linux-raspi2 (Ubuntu Xenial):
status: New → Fix Committed
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Paolo Pisati (p-pisati)
tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-raspi2 - 4.4.0-1074.82

---------------
linux-raspi2 (4.4.0-1074.82) xenial; urgency=low

  * linux-raspi2: 4.4.0-1074.82 -proposed tracker (LP: #1716618)

  [ Ubuntu: 4.4.0-96.119 ]

  * linux: 4.4.0-96.119 -proposed tracker (LP: #1716613)
  * kernel panic -not syncing: Fatal exception: panic_on_oops (LP: #1708399)
    - s390/mm: no local TLB flush for clearing-by-ASCE IDTE
    - SAUCE: s390/mm: fix local TLB flushing vs. detach of an mm address space
    - SAUCE: s390/mm: fix race on mm->context.flush_mm
  * CVE-2017-1000251
    - Bluetooth: Properly check L2CAP config option output buffer length

linux-raspi2 (4.4.0-1073.81) xenial; urgency=low

  * linux-raspi2: 4.4.0-1073.81 -proposed tracker (LP: #1715653)

  [ Ubuntu: 4.4.0-95.118 ]

  * linux: 4.4.0-95.118 -proposed tracker (LP: #1715651)
  * Xenial update to 4.4.78 stable release broke Address Sanitizer
    (LP: #1715636)
    - mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes

linux-raspi2 (4.4.0-1072.80) xenial; urgency=low

  * linux-raspi2: 4.4.0-1072.80 -proposed tracker (LP: #1713464)

  * Include Broadcom GPL modules in Xenial Kernel (LP: #1665783)
    - [config] update config for master changes

  * Fix OOM errors (LP: #1712598)
    - Revert "Revert "mm: consider compaction feedback also for costly
      allocation""
    - Revert "Revert "mm, oom, compaction: prevent from should_compact_retry
      looping for ever for costly orders""
    - Revert "Revert "mm, oom: protect !costly allocations some more for
      !CONFIG_COMPACTION""
    - Revert "Revert "mm, oom: prevent premature OOM killer invocation for high
      order request""
    - Revert "Revert "PM / wakeirq: Fix dedicated wakeirq for drivers not using
      autosuspend""

  [ Ubuntu: 4.4.0-94.117 ]

  * linux: 4.4.0-94.117 -proposed tracker (LP: #1713462)
  * mwifiex causes kernel oops when AP mode is enabled (LP: #1712746)
    - SAUCE: net/wireless: do not dereference invalid pointer
    - SAUCE: mwifiex: do not dereference invalid pointer
  * Backport more recent Broadcom bnxt_en driver (LP: #1711056)
    - SAUCE: bnxt_en_bpo: Import bnxt_en driver version 1.8.1
    - SAUCE: bnxt_en_bpo: Drop distro out-of-tree detection logic
    - SAUCE: bnxt_en_bpo: Remove unnecessary compile flags
    - SAUCE: bnxt_en_bpo: Move config settings to Kconfig
    - SAUCE: bnxt_en_bpo: Remove PCI_IDs handled by the regular driver
    - SAUCE: bnxt_en_bpo: Rename the backport driver to bnxt_en_bpo
    - bnxt_en_bpo: [Config] Enable CONFIG_BNXT_BPO=m
  * HID: multitouch: Support ALPS PTP Stick and Touchpad devices (LP: #1712481)
    - HID: multitouch: Support PTP Stick and Touchpad device
    - SAUCE: HID: multitouch: Support ALPS PTP stick with pid 0x120A
  * igb: Support using Broadcom 54616 as PHY (LP: #1712024)
    - SAUCE: igb: add support for using Broadcom 54616 as PHY
  * IPR driver causes multipath to fail paths/stuck IO on Medium Errors
    (LP: #1682644)
    - scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION
  * accessing /dev/hvc1 with stress-ng on Ubuntu xenial causes crash
    (LP: #1711401)
    - tty/hvc: Use IRQF_SHARED for OPAL hvc consoles
  * memory-hotplug test...

Changed in linux-raspi2 (Ubuntu Xenial):
status: Fix Committed → Fix Released
Paolo Pisati (p-pisati)
Changed in linux-raspi2 (Ubuntu):
status: New → Invalid