KASan: out of bounds access in isolate_migratepages_range

Bug #1572562 reported by Gavin Guo on 2016-04-20
Affects         Importance  Assigned to
linux (Ubuntu)  Medium      Gavin Guo
Trusty          Medium      Gavin Guo

Bug Description

[Impact]
On the v3.13.0-76 kernel with KASan backported, the following error
message can be observed while running the kernel-build stress test[1]:
"./parallel-73670.sh -r 2 -k 40"
This builds 40 kernels at the same time, for 2 rounds.

The bad access happens when page->mapping->flags is read while
page->mapping points to an anon_vma that has already been freed
in the do_exit path.
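
As a rough illustration of where the read lands: for anonymous pages the kernel stores the anon_vma pointer in page->mapping with the low bit (PAGE_MAPPING_ANON) set, so code that treats the value as a struct address_space pointer reads at a fixed offset past the tagged anon_vma address. The sketch below is plain user-space C that only redoes the arithmetic with the object and access addresses from the KASan report in this description. The helper names are hypothetical, and the resulting offset of flags (0x78) is an inference from those two addresses, not a value taken from the kernel headers.

```c
#include <stdint.h>
#include <stdbool.h>

/* Low bit of page->mapping marks an anonymous page (kernel: PAGE_MAPPING_ANON). */
#define PAGE_MAPPING_ANON 1ULL

/* Addresses copied from the KASan report. */
#define REPORT_OBJECT_ADDR 0xffff880279cc7658ULL /* anon_vma slab object */
#define REPORT_ACCESS_ADDR 0xffff880279cc76d1ULL /* faulting 8-byte read */

/* Hypothetical helper: value page->mapping would hold for this anon_vma. */
static inline uint64_t tagged_mapping(uint64_t anon_vma_addr)
{
    return anon_vma_addr | PAGE_MAPPING_ANON;
}

/* Hypothetical helper: true if a mapping value carries the anon tag. */
static inline bool mapping_is_anon(uint64_t mapping)
{
    return (mapping & PAGE_MAPPING_ANON) != 0;
}

/* Hypothetical helper: distance of the read from the tagged pointer.
 * On this kernel build that would be offsetof(struct address_space, flags),
 * which is an assumption derived from the report, not from the headers. */
static inline uint64_t inferred_flags_offset(void)
{
    return REPORT_ACCESS_ADDR - tagged_mapping(REPORT_OBJECT_ADDR);
}
```

The odd access address (ending in d1) is consistent with the anon tag being carried into the dereference: the read is 0x79 bytes past the start of the object, that is, 0x78 bytes past the tagged pointer.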

==================================================================
BUG: KASan: out of bounds access in isolate_migratepages_range+0x663/0xb30 at addr ffff880279cc76d1
Read of size 8 by task cc1/27473
=============================================================================
BUG anon_vma (Not tainted): kasan: bad access detected
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: Allocated in anon_vma_prepare+0x189/0x250 age=7323 cpu=16 pid=31029
        __slab_alloc+0x4f8/0x560
        kmem_cache_alloc+0x18b/0x1e0
        anon_vma_prepare+0x189/0x250
        do_wp_page+0x837/0xb10
        handle_mm_fault+0x884/0x1160
        __do_page_fault+0x218/0x750
        do_page_fault+0x1a/0x70
        page_fault+0x28/0x30
INFO: Freed in __put_anon_vma+0x69/0xe0 age=8588 cpu=4 pid=29418
        __slab_free+0x2ab/0x3f0
        kmem_cache_free+0x1c1/0x200
        __put_anon_vma+0x69/0xe0
        unlink_anon_vmas+0x2a8/0x320
        free_pgtables+0x50/0x1c0
        exit_mmap+0xca/0x1e0
        mmput+0x82/0x1b0
        do_exit+0x391/0x1060
        do_group_exit+0x86/0x130
        SyS_exit_group+0x1d/0x20
        system_call_fastpath+0x1a/0x1f
INFO: Slab 0xffffea0009e73100 objects=43 used=30 fp=0xffff880279cc67a8 flags=0x2ffff0000004080
INFO: Object 0xffff880279cc7658 @offset=13912 fp=0xffff880279cc7c38

Bytes b4 ffff880279cc7648: 10 00 00 00 5b 17 00 00 ef 25 6b 03 01 00 00 00 ....[....%k.....
Object ffff880279cc7658: 58 76 cc 79 02 88 ff ff 00 00 00 00 00 00 00 00 Xv.y............
Object ffff880279cc7668: 00 00 00 00 5a 5a 5a 5a 70 76 cc 79 02 88 ff ff ....ZZZZpv.y....
Object ffff880279cc7678: 70 76 cc 79 02 88 ff ff 01 00 00 00 03 00 00 00 pv.y............
Object ffff880279cc7688: 58 76 cc 79 02 88 ff ff b8 2a 20 31 02 88 ff ff Xv.y.....* 1....
CPU: 8 PID: 27473 Comm: cc1 Tainted: G B 3.13.0-76-generic #120hf00073670v20160120b0h5d3e6ab
Hardware name: Cisco Systems Inc UCSC-C220-M3L/UCSC-C220-M3L, BIOS C220M3.2.0.3.0.080120140402 08/01/2014
 ffffea0009e73100 ffff880736bbf750 ffffffff81a6e195 ffff8804e881b840
 ffff880736bbf780 ffffffff81244c1d ffff8804e881b840 ffffea0009e73100
 ffff880279cc7658 ffffea001aa99c98 ffff880736bbf7a8 ffffffff8124ad66
Call Trace:
 [<ffffffff81a6e195>] dump_stack+0x45/0x56
 [<ffffffff81244c1d>] print_trailer+0xfd/0x170
 [<ffffffff8124ad66>] object_err+0x36/0x40
 [<ffffffff8124cd29>] kasan_report_error+0x1e9/0x3a0
 [<ffffffff8125d9f8>] ? memcg_check_events+0x28/0x380
 [<ffffffff81221c2d>] ? rmap_walk+0x32d/0x340
 [<ffffffff8124d390>] kasan_report+0x40/0x50
 [<ffffffff81205ee3>] ? isolate_migratepages_range+0x663/0xb30
 [<ffffffff8124c019>] __asan_load8+0x69/0xa0
 [<ffffffff81205ee3>] isolate_migratepages_range+0x663/0xb30
 [<ffffffff811dc5e7>] ? zone_watermark_ok+0x57/0x70
 [<ffffffff812067c6>] compact_zone+0x416/0x700
 [<ffffffff81206b45>] compact_zone_order+0x95/0x100
 [<ffffffff81207002>] try_to_compact_pages+0x102/0x1a0
 [<ffffffff811e21e6>] __alloc_pages_direct_compact+0x96/0x290
 [<ffffffff811e2d5e>] __alloc_pages_nodemask+0x97e/0xc40
 [<ffffffff8123ce24>] alloc_pages_vma+0xb4/0x200
 [<ffffffff812572ca>] do_huge_pmd_anonymous_page+0x13a/0x490
 [<ffffffff8120f072>] ? do_numa_page+0x192/0x200
 [<ffffffff81210c07>] handle_mm_fault+0x267/0x1160
 [<ffffffff81a7d028>] __do_page_fault+0x218/0x750
 [<ffffffff8121aead>] ? do_mmap_pgoff+0x47d/0x500
 [<ffffffff811fd699>] ? vm_mmap_pgoff+0xa9/0xd0
 [<ffffffff81a7d57a>] do_page_fault+0x1a/0x70
 [<ffffffff81a785a8>] page_fault+0x28/0x30
Memory state around the buggy address:
 ffff880279cc7580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff880279cc7600: fc fc fc fc fc fc fc fc fc fc fc 00 00 00 00 00
>ffff880279cc7680: 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
                                                 ^
 ffff880279cc7700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff880279cc7780: fc fc fc fc fc fc fc fc fc fc 00 00 00 00 00 00
==================================================================
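
The "Memory state around the buggy address" dump can be decoded by hand. KASan keeps one shadow byte per 8 bytes of memory, so each 16-byte row covers 128 bytes. A shadow byte of 00 means the whole 8-byte granule is addressable; fc is read here as a slab redzone marker, which is assumed from upstream KASan and may be labelled differently in this backport. A minimal user-space sketch of the lookup, with hypothetical helper names and the row contents copied from the dump:

```c
#include <stdint.h>
#include <stddef.h>

#define SHADOW_SCALE_SHIFT 3   /* one shadow byte describes 8 bytes of memory */
#define ROW_BYTES 16           /* shadow bytes printed per row in the report */

/* Row starting at ffff880279cc7680, copied from the report. */
static const uint64_t row_base = 0xffff880279cc7680ULL;
static const uint8_t row_shadow[ROW_BYTES] = {
    0x00, 0x00, 0x00, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc,
    0xfc, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc,
};

/* Index of the shadow byte within the row that covers addr.
 * Only valid for addresses inside this row's 128-byte window. */
static inline size_t shadow_index(uint64_t addr)
{
    return (size_t)((addr - row_base) >> SHADOW_SCALE_SHIFT);
}

/* Non-zero if the whole 8-byte granule containing addr is addressable. */
static inline int granule_addressable(uint64_t addr)
{
    return row_shadow[shadow_index(addr)] == 0x00;
}
```

For the faulting address ffff880279cc76d1 the index is 10, which is an fc byte, matching the caret in the dump. The addressable bytes in the dump run from ffff880279cc7658 to ffff880279cc7697, so the read lands well past the end of the object.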

gavin@rotom:~/ddebs/ddebs-3.13.0-76.120hf00073670v20160120b0h5d3e6ab$ addr2line 0xffffffff81205ee3 -e usr/lib/debug/boot/vmlinux-3.13.0-76-generic -fi
constant_test_bit
/home/gavin/ubuntu-trusty-amd64/arch/x86/include/asm/bitops.h:313
mapping_balloon
/home/gavin/ubuntu-trusty-amd64/include/linux/pagemap.h:69
__is_movable_balloon_page
/home/gavin/ubuntu-trusty-amd64/include/linux/balloon_compaction.h:131
balloon_page_movable
/home/gavin/ubuntu-trusty-amd64/include/linux/balloon_compaction.h:156
isolate_migratepages_range
/home/gavin/ubuntu-trusty-amd64/mm/compaction.c:554

>8------------------8<
/home/gavin/ubuntu-trusty-amd64/arch/x86/include/asm/bitops.h:313
310 static __always_inline int constant_test_bit(long nr, const volatile unsigned long *addr)
311 {
312         return ((1UL << (nr & (BITS_PER_LONG-1))) &
313                 (addr[nr >> _BITOPS_LONG_SHIFT])) != 0;
314 }
>8------------------8<
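
The helper quoted above explains why KASan reports a "Read of size 8": it loads the whole unsigned long word that holds bit nr and only then masks out the bit. The following is a user-space restatement of the same expression, assuming a 64-bit unsigned long (BITS_PER_LONG of 64 and a shift of 6); the function name is hypothetical and the bit numbers in the usage note are arbitrary, not the kernel's AS_BALLOON_MAP value.

```c
#include <stdint.h>

#define BITS_PER_LONG 64        /* assumption: x86-64 */
#define BITOPS_LONG_SHIFT 6     /* log2(BITS_PER_LONG) */

/* Same logic as the kernel helper, restated for user space:
 * select the 64-bit word holding bit nr, load it, then mask the bit. */
static inline int test_bit_sketch(long nr, const volatile uint64_t *addr)
{
    return ((UINT64_C(1) << (nr & (BITS_PER_LONG - 1))) &
            addr[nr >> BITOPS_LONG_SHIFT]) != 0;
}
```

With a two-word array whose first word has only bit 29 set, test_bit_sketch(29, words) is 1 and test_bit_sketch(30, words) is 0; in both cases the full first word is loaded, which is the 8-byte access the sanitizer intercepts.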
Related upstream mailing list discussion:
- mm: compaction: buffer overflow in isolate_migratepages_range
  https://lkml.org/lkml/2014/8/9/162
- [PATCH v3 1/4] mm/balloon_compaction: redesign ballooned pages management
  http://www.spinics.net/lists/linux-mm/msg79249.html

[Fix]
- The first patch is the fix itself: it moves the PageBalloon
  check to page->_mapcount.
d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management")
- The second patch removes the isolation check when
  CONFIG_BALLOON_COMPACTION is not defined.
4d88e6f7d5ff ("mm/balloon_compaction: fix deflation when compaction is disabled")
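
The idea behind the first commit can be sketched as follows: instead of dereferencing page->mapping, the balloon check compares page->_mapcount against a reserved sentinel, so it never touches memory that belongs to a possibly freed anon_vma. This is a user-space model only; the struct and function names are hypothetical, and the sentinel value -256 is assumed here (believed to match upstream's PAGE_BALLOON_MAPCOUNT_VALUE, but not verified against the backport).

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed sentinel; believed to match upstream PAGE_BALLOON_MAPCOUNT_VALUE. */
#define BALLOON_MAPCOUNT_VALUE (-256)

/* Hypothetical, heavily reduced stand-in for struct page. */
struct page_model {
    void *mapping;   /* may point at a freed anon_vma; never dereferenced here */
    int   mapcount;  /* stands in for page->_mapcount */
};

/* Balloon check in the style of the fix: reads only the page itself. */
static inline bool page_is_balloon(const struct page_model *page)
{
    return page->mapcount == BALLOON_MAPCOUNT_VALUE;
}
```

Because the check no longer follows the mapping pointer, an anonymous page whose anon_vma was freed in the exit path is simply classified as "not a balloon page" without any out-of-bounds read.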

[Test Case]
Run the following command on the Trusty kernel
(Ubuntu-3.13.0-86.130) with KASan backported. The error messages
above should no longer appear in dmesg.
"./parallel-73670.sh -r 2 -k 40"
This builds 40 kernels at the same time, for 2 rounds.

Reference:
[1]. http://kernel.ubuntu.com/git/gavinguo/stress-test.git/

Gavin Guo (mimi0213kimo) wrote :

The following 2 patches are the solution to the bug:

4d88e6f7d5ff mm/balloon_compaction: fix deflation when compaction is disabled
d6d86c0a7f8d mm/balloon_compaction: redesign ballooned pages management

description: updated

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1572562

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Gavin Guo (mimi0213kimo) on 2016-04-20
description: updated
Changed in linux (Ubuntu):
assignee: nobody → Gavin Guo (mimi0213kimo)
Gavin Guo (mimi0213kimo) on 2016-05-16
description: updated
Changed in linux (Ubuntu Trusty):
status: New → Fix Committed
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Kamal Mostafa (kamalmostafa) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Gavin Guo (mimi0213kimo) wrote :

I tried to reproduce the bug with the following commands:

"./parallel-73670.sh -r 2 -k 40" [1]

No error messages appear in dmesg with the proposed kernel 3.13.0-89.136 with KASan backported[2].

Reference:

[1]. http://kernel.ubuntu.com/git/gavinguo/stress-test.git/
[2]. http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00073670_compaction313_sru

tags: added: verification-done-trusty
removed: verification-needed-trusty
Gavin Guo (mimi0213kimo) wrote :

Stress test on v3.13.0-89.136 with KASan backported.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-91.138

---------------
linux (3.13.0-91.138) trusty; urgency=medium

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1595991

  [ Upstream Kernel Changes ]

  * netfilter: x_tables: validate e->target_offset early
    - LP: #1555338
    - CVE-2016-3134
  * netfilter: x_tables: make sure e->next_offset covers remaining blob
    size
    - LP: #1555338
    - CVE-2016-3134
  * netfilter: x_tables: fix unconditional helper
    - LP: #1555338
    - CVE-2016-3134
  * netfilter: x_tables: don't move to non-existent next rule
    - LP: #1595350
  * netfilter: x_tables: validate targets of jumps
    - LP: #1595350
  * netfilter: x_tables: add and use xt_check_entry_offsets
    - LP: #1595350
  * netfilter: x_tables: kill check_entry helper
    - LP: #1595350
  * netfilter: x_tables: assert minimum target size
    - LP: #1595350
  * netfilter: x_tables: add compat version of xt_check_entry_offsets
    - LP: #1595350
  * netfilter: x_tables: check standard target size too
    - LP: #1595350
  * netfilter: x_tables: check for bogus target offset
    - LP: #1595350
  * netfilter: x_tables: validate all offsets and sizes in a rule
    - LP: #1595350
  * netfilter: x_tables: don't reject valid target size on some
    architectures
    - LP: #1595350
  * netfilter: arp_tables: simplify translate_compat_table args
    - LP: #1595350
  * netfilter: ip_tables: simplify translate_compat_table args
    - LP: #1595350
  * netfilter: ip6_tables: simplify translate_compat_table args
    - LP: #1595350
  * netfilter: x_tables: xt_compat_match_from_user doesn't need a retval
    - LP: #1595350
  * netfilter: x_tables: do compat validation via translate_table
    - LP: #1595350
  * netfilter: x_tables: introduce and use xt_copy_counters_from_user
    - LP: #1595350

linux (3.13.0-90.137) trusty; urgency=low

  [ Kamal Mostafa ]

  * Release Tracking Bug
    - LP: #1595693

  [ Serge Hallyn ]

  * SAUCE: add a sysctl to disable unprivileged user namespace unsharing
    - LP: #1555338, #1595350

linux (3.13.0-89.136) trusty; urgency=low

  [ Kamal Mostafa ]

  * Release Tracking Bug
    - LP: #1591315

  [ Kamal Mostafa ]

  * [debian] getabis: Only git add $abidir if running in local repo
    - LP: #1584890
  * [debian] getabis: Fix inconsistent compiler versions check
    - LP: #1584890

  [ Stefan Bader ]

  * SAUCE: powerpc/powernv: Fix incomplete backport of 8117ac6
    - LP: #1589910

  [ Tim Gardner ]

  * [Config] Remove arc4 from nic-modules
    - LP: #1582991

  [ Upstream Kernel Changes ]

  * KVM: x86: move steal time initialization to vcpu entry time
    - LP: #1494350
  * lpfc: Fix premature release of rpi bit in bitmask
    - LP: #1580560
  * lpfc: Correct loss of target discovery after cable swap.
    - LP: #1580560
  * mm/balloon_compaction: redesign ballooned pages management
    - LP: #1572562
  * mm/balloon_compaction: fix deflation when compaction is disabled
    - LP: #1572562
  * bridge: Fix the way to find old local fdb entries in br_fdb_changeaddr
    - LP: #1581585
  * bridge: notify user space after fdb update
    - LP: #1581585
  * ALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS
   ...


Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
Joseph Salisbury (jsalisbury) wrote :

The commit that fixes this bug may have introduced bug 1598197.

Changed in linux (Ubuntu Trusty):
assignee: nobody → Gavin Guo (mimi0213kimo)
Changed in linux (Ubuntu):
status: Confirmed → Fix Released
importance: Undecided → Medium
Changed in linux (Ubuntu Trusty):
importance: Undecided → Medium