
"BUG: Bad page state in process" when running on EC2

Bug #1052275 reported by Matt Wilson on 2012-09-18

This bug affects 6 people

Affects               Status      Importance   Assigned to   Milestone
linux-ec2 (Ubuntu)    Confirmed   Undecided    Unassigned

Bug Description

After running for some time, several m1.large 64-bit instances started repeatedly hitting this BUG_ON():

[525758.322281] BUG: Bad page state in process pdnsd pfn:1d1a6f
[525758.322290] page:ffff88000b26f848 flags:800000000000087c count:2 mapcount:0 mapping:ffff8800d2da0860 index:99
[525758.322294] Pid: 731, comm: pdnsd Not tainted 2.6.32-346-ec2 #51-Ubuntu
[525758.322296] Call Trace:
[525758.322305] [<ffffffff810b39c0>] bad_page+0xd0/0x130
[525758.322307] [<ffffffff810b48aa>] prep_new_page+0x1aa/0x1c0
[525758.322310] [<ffffffff810b3d75>] ? zone_watermark_ok+0x25/0xe0
[525758.322312] [<ffffffff810b4a2b>] get_page_from_freelist+0x16b/0x550
[525758.322315] [<ffffffff810b5586>] __alloc_pages_nodemask+0xd6/0x180
[525758.322319] [<ffffffff810cc37d>] do_anonymous_page+0x21d/0x540
[525758.322321] [<ffffffff810cee87>] handle_mm_fault+0x427/0x4f0
[525758.322333] [<ffffffff814b5fe7>] do_page_fault+0x147/0x390
[525758.322335] [<ffffffff814b3d28>] page_fault+0x28/0x30
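
When comparing several of these reports, it can help to split the "page:" line into its fields. The following is only a minimal sketch: `parse_bad_page` and `suspicious_fields` are hypothetical helper names, not kernel or Launchpad tooling. It assumes the 2.6.32-era line format shown above (with `index` printed in hex) and assumes that a page handed out by the allocator is expected to have count 0, mapcount 0, and a NULL mapping. The flag bits are left undecoded because their layout depends on the kernel configuration.

```python
import re

# Assumed format of the "page:" line printed alongside "Bad page state".
PAGE_RE = re.compile(
    r"page:(?P<page>[0-9a-f]+) flags:(?P<flags>[0-9a-f]+) "
    r"count:(?P<count>-?\d+) mapcount:(?P<mapcount>-?\d+) "
    r"mapping:(?P<mapping>[0-9a-f]+|\(null\)) index:(?P<index>[0-9a-f]+)"
)

SAMPLE = ("[525758.322290] page:ffff88000b26f848 flags:800000000000087c "
          "count:2 mapcount:0 mapping:ffff8800d2da0860 index:99")


def parse_bad_page(line):
    """Return the fields of a 'page:' line as integers, or None if no match."""
    m = PAGE_RE.search(line)
    if m is None:
        return None
    mapping = m.group("mapping")
    return {
        "page": int(m.group("page"), 16),
        "flags": int(m.group("flags"), 16),
        "count": int(m.group("count")),
        "mapcount": int(m.group("mapcount")),
        "mapping": 0 if mapping == "(null)" else int(mapping, 16),
        "index": int(m.group("index"), 16),  # assumption: printed as hex
    }


def suspicious_fields(info):
    """List fields that are non-zero although a free page should have them clear."""
    return [name for name in ("count", "mapcount", "mapping") if info[name] != 0]


if __name__ == "__main__":
    info = parse_bad_page(SAMPLE)
    print(suspicious_fields(info))
```

For the line in this report, the helper flags `count` and `mapping`, which suggests the page was still referenced by a page-cache mapping when it was taken from the free list.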

One instance ultimately hit a GPF:
[525758.336588] general protection fault: 0000 [#1] SMP
[525758.336598] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[525758.336601] CPU 1
[525758.336603] Modules linked in: ipv6 raid0 md_mod
[525758.336610] Pid: 731, comm: pdnsd Tainted: G B 2.6.32-346-ec2 #51-Ubuntu
[525758.336613] RIP: e030:[<ffffffff810b4ab6>] [<ffffffff810b4ab6>] get_page_from_freelist+0x1f6/0x550
[525758.336623] RSP: e02b:ffff8801dce4bce8 EFLAGS: 00010096
[525758.336625] RAX: ffffffff816b1570 RBX: ffffffff816b1480 RCX: 0000000000000040
[525758.336628] RDX: dead000000100100 RSI: 0000000000000000 RDI: 0000000000000005
[525758.336630] RBP: ffff8801dce4bdb8 R08: 0000000000010ffa R09: 0000000000000000
[525758.336633] R10: 0000000000000005 R11: 0000000000000000 R12: ffff88000b26f848
[525758.336636] R13: 0000000000000001 R14: dead000000200200 R15: 0000000000000002
[525758.336642] FS: 00007f4b928ee700(0000) GS:ffff880002e7e000(0000) knlGS:0000000000000000
[525758.336645] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[525758.336647] CR2: 00007f4b900e9ff8 CR3: 00000001debdf000 CR4: 0000000000002660
[525758.336650] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[525758.336653] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
[525758.336656] Process pdnsd (pid: 731, threadinfo ffff8801dce4a000, task ffff8801dce40300)
[525758.336659] Stack:
[525758.336660] ffff8801dcdaf0c0 00000002dcdaf0c0 0000000000000000 000000000000a3c0
[525758.336665] <0> ffff880100000041 ffff8801dcdaf0c0 ffffffffdce4be28 0000000100000000
[525758.336670] <0> 0000000300000040 0000000000000000 ffffffff816b6088 ffffffff816b34c0
[525758.336677] Call Trace:
[525758.336682] [<ffffffff810b5586>] __alloc_pages_nodemask+0xd6/0x180
[525758.336687] [<ffffffff810cc37d>] do_anonymous_page+0x21d/0x540
[525758.336690] [<ffffffff810cee87>] handle_mm_fault+0x427/0x4f0
[525758.336695] [<ffffffff814b5fe7>] do_page_fault+0x147/0x390
[525758.336698] [<ffffffff814b3d28>] page_fault+0x28/0x30
[525758.336701] Code: 84 b0 00 00 00 4b 8d 44 ef 05 48 c1 e0 04 4c 8b 64 18 08 49 83 ec 28 49 8b 44 24 30 49 8b 54 24 28 49 be 00 02 20 00 00 00 ad de <48> 89 42 08 48 89 10 48 b8 00 01 10 00 00 00 ad de 49 89 44 24
[525758.336747] RIP [<ffffffff810b4ab6>] get_page_from_freelist+0x1f6/0x550
[525758.336752] RSP <ffff8801dce4bce8>
[525758.336757] ---[ end trace 371c569b99678b87 ]---
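
A note on reading the register dump: RDX and R14 hold dead000000100100 and dead000000200200, which match the x86_64 list-poison values (LIST_POISON1 and LIST_POISON2 in include/linux/poison.h). The faulting bytes `<48> 89 42 08` decode to a 64-bit store of RAX to the address RDX+8, so the write goes through a poisoned, non-canonical pointer. That is consistent with a list entry being unlinked a second time, although the trace alone does not prove it. The sketch below scans an oops for such values. `find_list_poison` is a hypothetical helper name, and the constants are an assumption taken from the values seen in this dump.

```python
import re

# Assumed x86_64 values, matching the registers in this report.
POISON_POINTER_DELTA = 0xDEAD000000000000
LIST_POISON1 = POISON_POINTER_DELTA + 0x00100100
LIST_POISON2 = POISON_POINTER_DELTA + 0x00200200

# Matches general-purpose register dumps such as "RDX: dead000000100100".
# RIP/RSP lines carry a segment prefix and are intentionally not matched.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}): ([0-9a-f]{16})\b")

SAMPLE = """\
[525758.336628] RDX: dead000000100100 RSI: 0000000000000000 RDI: 0000000000000005
[525758.336636] R13: 0000000000000001 R14: dead000000200200 R15: 0000000000000002
"""


def find_list_poison(oops_text):
    """Map register name -> poison constant name for registers holding list poison."""
    hits = {}
    for name, value in REG_RE.findall(oops_text):
        number = int(value, 16)
        if number == LIST_POISON1:
            hits[name] = "LIST_POISON1"
        elif number == LIST_POISON2:
            hits[name] = "LIST_POISON2"
    return hits


if __name__ == "__main__":
    for reg, poison in find_list_poison(SAMPLE).items():
        print(reg, "holds", poison)
```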