iostat: Cannot open /proc/stat: Cannot allocate memory

Bug #1319244 reported by Stephane Bakhos on 2014-05-14
This bug affects 5 people
Affects                Status        Importance  Assigned to   Milestone
linux (Ubuntu)         Fix Released  Medium      Stefan Bader
linux (Ubuntu) Trusty  Fix Released  Medium      Unassigned

Bug Description

break-fix: 59a32e2ce5eb809967cac4e718bc527beca83c59 f74373a5cc7a0155d232c4e999648c7a95435bb2
break-fix: 59a32e2ce5eb809967cac4e718bc527beca83c59 058504edd02667eef8fac9be27ab3ea74332e9b4

Intermittently I get errors stating that various programs cannot allocate memory when reading /proc/stat.

I have 2 servers that are affected by the issue.
This is probably a duplicate of bug 1226172.
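
For illustration, the failure named in the title can be checked with a trivial program; any tool that open()s /proc/stat (top, iostat, ...) hits the same ENOMEM when the kernel cannot allocate the buffer for the file. A minimal sketch, not from the original report:

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
        int fd = open("/proc/stat", O_RDONLY);
        if (fd < 0) {
                /* On an affected system this intermittently fails with
                 * ENOMEM ("Cannot allocate memory") because the kernel
                 * could not satisfy the order-4 buffer allocation. */
                fprintf(stderr, "/proc/stat: %s\n", strerror(errno));
                return 1;
        }
        close(fd);
        return 0;
}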

[1764593.455329] top: page allocation failure: order:4, mode:0x1040d0
[1764593.455335] CPU: 3 PID: 10323 Comm: top Tainted: GF O 3.11.0-18-generic #32-Ubuntu
[1764593.455338] Hardware name: Intel Corporation S2400SC/S2400SC, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[1764593.455340] 0000000000000000 ffff88042670b9d0 ffffffff816e8571 00000000001040d0
[1764593.455345] ffff88042670ba58 ffffffff81144ae0 00000000ffffffff ffff880a3fffcb48
[1764593.455348] 000000000000000e ffff88042670ba28 ffffffff00000040 0000000000000007
[1764593.455351] Call Trace:
[1764593.455361] [<ffffffff816e8571>] dump_stack+0x45/0x56
[1764593.455367] [<ffffffff81144ae0>] warn_alloc_failed+0xf0/0x140
[1764593.455371] [<ffffffff81148a15>] __alloc_pages_nodemask+0x725/0x920
[1764593.455376] [<ffffffff81184589>] alloc_pages_current+0xa9/0x160
[1764593.455379] [<ffffffff81143bfe>] __get_free_pages+0xe/0x50
[1764593.455384] [<ffffffff8118d75e>] kmalloc_order_trace+0x2e/0xa0
[1764593.455388] [<ffffffff811c9960>] ? single_open+0xb0/0xb0
[1764593.455391] [<ffffffff8118ee19>] __kmalloc+0x159/0x180
[1764593.455393] [<ffffffff811c9960>] ? single_open+0xb0/0xb0
[1764593.455398] [<ffffffff81213c10>] stat_open+0x50/0xc0
[1764593.455401] [<ffffffff8120a700>] proc_reg_open+0x70/0x110
[1764593.455405] [<ffffffff811a5d4b>] do_dentry_open+0x1eb/0x280
[1764593.455408] [<ffffffff8120a690>] ? proc_reg_release+0x70/0x70
[1764593.455411] [<ffffffff811a5e69>] vfs_open+0x49/0x50
[1764593.455415] [<ffffffff811b5e13>] do_last+0x1c3/0xe50
[1764593.455418] [<ffffffff8118e5d8>] ? kmem_cache_alloc_trace+0x108/0x130
[1764593.455421] [<ffffffff811b6b61>] path_openat+0xc1/0x630
[1764593.455426] [<ffffffff811689d9>] ? handle_mm_fault+0x299/0x670
[1764593.455429] [<ffffffff811b7eda>] do_filp_open+0x3a/0x90
[1764593.455433] [<ffffffff816ef78e>] ? _raw_spin_lock+0xe/0x20
[1764593.455437] [<ffffffff811c4847>] ? __alloc_fd+0xa7/0x130
[1764593.455440] [<ffffffff811a704c>] do_sys_open+0x12c/0x270
[1764593.455445] [<ffffffff81074989>] ? __set_current_blocked+0x49/0x60
[1764593.455448] [<ffffffff811a71ae>] SyS_open+0x1e/0x20
[1764593.455452] [<ffffffff816f841d>] system_call_fastpath+0x1a/0x1f
[1764593.455454] Mem-Info:
[1764593.455455] Node 0 DMA per-cpu:
[1764593.455458] CPU 0: hi: 0, btch: 1 usd: 0
[1764593.455459] CPU 1: hi: 0, btch: 1 usd: 0
[1764593.455461] CPU 2: hi: 0, btch: 1 usd: 0
[1764593.455462] CPU 3: hi: 0, btch: 1 usd: 0
[1764593.455463] CPU 4: hi: 0, btch: 1 usd: 0
[1764593.455465] CPU 5: hi: 0, btch: 1 usd: 0
[1764593.455466] CPU 6: hi: 0, btch: 1 usd: 0
[1764593.455468] CPU 7: hi: 0, btch: 1 usd: 0
[1764593.455469] Node 0 DMA32 per-cpu:
[1764593.455471] CPU 0: hi: 186, btch: 31 usd: 0
[1764593.455472] CPU 1: hi: 186, btch: 31 usd: 0
[1764593.455474] CPU 2: hi: 186, btch: 31 usd: 0
[1764593.455475] CPU 3: hi: 186, btch: 31 usd: 0
[1764593.455477] CPU 4: hi: 186, btch: 31 usd: 0
[1764593.455478] CPU 5: hi: 186, btch: 31 usd: 0
[1764593.455480] CPU 6: hi: 186, btch: 31 usd: 0
[1764593.455481] CPU 7: hi: 186, btch: 31 usd: 0
[1764593.455482] Node 0 Normal per-cpu:
[1764593.455484] CPU 0: hi: 186, btch: 31 usd: 0
[1764593.455485] CPU 1: hi: 186, btch: 31 usd: 0
[1764593.455487] CPU 2: hi: 186, btch: 31 usd: 0
[1764593.455488] CPU 3: hi: 186, btch: 31 usd: 0
[1764593.455490] CPU 4: hi: 186, btch: 31 usd: 0
[1764593.455491] CPU 5: hi: 186, btch: 31 usd: 0
[1764593.455492] CPU 6: hi: 186, btch: 31 usd: 0
[1764593.455494] CPU 7: hi: 186, btch: 31 usd: 0
[1764593.455495] Node 1 Normal per-cpu:
[1764593.455497] CPU 0: hi: 186, btch: 31 usd: 0
[1764593.455498] CPU 1: hi: 186, btch: 31 usd: 0
[1764593.455500] CPU 2: hi: 186, btch: 31 usd: 0
[1764593.455501] CPU 3: hi: 186, btch: 31 usd: 0
[1764593.455503] CPU 4: hi: 186, btch: 31 usd: 0
[1764593.455504] CPU 5: hi: 186, btch: 31 usd: 30
[1764593.455505] CPU 6: hi: 186, btch: 31 usd: 0
[1764593.455507] CPU 7: hi: 186, btch: 31 usd: 0
[1764593.455511] active_anon:4739030 inactive_anon:2482840 isolated_anon:0
[1764593.455511] active_file:6228818 inactive_file:5945293 isolated_file:0
[1764593.455511] unevictable:1459 dirty:61 writeback:0 unstable:0
[1764593.455511] free:318692 slab_reclaimable:462577 slab_unreclaimable:149699
[1764593.455511] mapped:25851 shmem:2934 pagetables:27313 bounce:0
[1764593.455511] free_cma:0
[1764593.455514] Node 0 DMA free:15848kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15932kB managed:15848kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1764593.455519] lowmem_reserve[]: 0 2897 40132 40132
[1764593.455523] Node 0 DMA32 free:505356kB min:3244kB low:4052kB high:4864kB active_anon:682760kB inactive_anon:959288kB active_file:336884kB inactive_file:205164kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3049268kB managed:2970684kB mlocked:0kB dirty:0kB writeback:0kB mapped:2720kB shmem:8kB slab_reclaimable:261396kB slab_unreclaimable:16056kB kernel_stack:120kB pagetables:952kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1764593.455527] lowmem_reserve[]: 0 0 37234 37234
[1764593.455530] Node 0 Normal free:615816kB min:41704kB low:52128kB high:62556kB active_anon:11679740kB inactive_anon:6085128kB active_file:9771528kB inactive_file:8399428kB unevictable:3380kB isolated(anon):0kB isolated(file):0kB present:38797312kB managed:38127692kB mlocked:3380kB dirty:192kB writeback:0kB mapped:29824kB shmem:5620kB slab_reclaimable:724768kB slab_unreclaimable:205508kB kernel_stack:8296kB pagetables:66784kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1764593.455535] lowmem_reserve[]: 0 0 0 0
[1764593.455538] Node 1 Normal free:137748kB min:45144kB low:56428kB high:67716kB active_anon:6593620kB inactive_anon:2886944kB active_file:14806860kB inactive_file:15176580kB unevictable:2456kB isolated(anon):0kB isolated(file):0kB present:41943040kB managed:41273172kB mlocked:2456kB dirty:52kB writeback:0kB mapped:70860kB shmem:6108kB slab_reclaimable:864144kB slab_unreclaimable:377232kB kernel_stack:2912kB pagetables:41516kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1764593.455543] lowmem_reserve[]: 0 0 0 0
[1764593.455545] Node 0 DMA: 0*4kB 1*8kB (U) 0*16kB 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15848kB
[1764593.455557] Node 0 DMA32: 70707*4kB (UEMR) 17882*8kB (UEMR) 4937*16kB (UEMR) 13*32kB (UEMR) 2*64kB (R) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 505420kB
[1764593.455567] Node 0 Normal: 137539*4kB (UEM) 7098*8kB (UEM) 252*16kB (UEMR) 23*32kB (UMR) 10*64kB (UMR) 13*128kB (UER) 5*256kB (UER) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 616828kB
[1764593.455578] Node 1 Normal: 33118*4kB (UEM) 505*8kB (UEM) 39*16kB (UEM) 27*32kB (UEM) 7*64kB (M) 2*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 138704kB
[1764593.455589] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1764593.455591] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1764593.455592] 13157020 total pagecache pages
[1764593.455593] 979097 pages in swap cache
[1764593.455595] Swap cache stats: add 13258848, delete 12279729, find 15382022/16560210
[1764593.455596] Free swap = 45316860kB
[1764593.455597] Total swap = 56722668kB
[1764593.597232] wcaxx 0000:08:00.0: Underrun detected by hardware. Latency at max of 20ms.
[1764593.766212] wcaxx 0000:08:00.0: Underrun detected by hardware. Latency at max of 20ms.
[1764593.766286] 20971519 pages RAM
[1764593.766287] 374670 pages reserved
[1764593.766289] 10669900 pages shared
[1764593.766290] 11731430 pages non-shared

ProblemType: Bug
DistroRelease: Ubuntu 13.10
Package: linux-image 3.11.0.20.21
ProcVersionSignature: Ubuntu 3.11.0-18.32-generic 3.11.10.4
Uname: Linux 3.11.0-18-generic x86_64
ApportVersion: 2.12.5-0ubuntu2.2
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Date: Wed May 14 00:12:36 2014
HibernationDevice: RESUME=UUID=d678f2d9-0912-4fe6-97ca-e1685a5b5c8e
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: Intel Corporation S2400SC
MarkForUpload: True
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.11.0-18-generic root=UUID=0e0470c8-fe13-4b7f-ace7-e59d91e04b07 ro
RelatedPackageVersions:
 linux-restricted-modules-3.11.0-18-generic N/A
 linux-backports-modules-3.11.0-18-generic N/A
 linux-firmware 1.116.2
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: Upgraded to saucy on 2013-11-12 (182 days ago)
dmi.bios.date: 12/23/2013
dmi.bios.vendor: Intel Corp.
dmi.bios.version: SE5C600.86B.02.02.0002.122320131210
dmi.board.asset.tag: ....................
dmi.board.name: S2400SC
dmi.board.vendor: Intel Corporation
dmi.board.version: G18552-403
dmi.chassis.asset.tag: ....................
dmi.chassis.type: 23
dmi.chassis.vendor: ...............................
dmi.chassis.version: p4308cpmhen
dmi.modalias: dmi:bvnIntelCorp.:bvrSE5C600.86B.02.02.0002.122320131210:bd12/23/2013:svnIntelCorporation:pnS2400SC:pvr....................:rvnIntelCorporation:rnS2400SC:rvrG18552-403:cvn...............................:ct23:cvrp4308cpmhen:
dmi.product.name: S2400SC
dmi.product.version: ....................
dmi.sys.vendor: Intel Corporation

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Stefan Bader (smb) wrote :

It looks like there are two things at play:
1. Something in the way the system is set up puts the DMA/DMA32 pools under higher pressure.
2. Reading /proc/stat ends up requesting a 64kB memory area (an order-4 allocation) from the DMA32 pool on node 0.

While there are still two free elements in the 64kB slot, both are in the emergency reserve (the (R) blocks in the log above), which stat_open did not ask for. So while there is not much that can easily be done about the memory pressure in general, it looks like stat_open could very well be more conservative about its memory requirements. First, the calculation reserves 128 bytes for each possible CPU, while the actual output only produces one line of overall stats plus one for each online CPU; so it seems to make sense to base the allocation on the number of online CPUs. Second, the upper limit for the buffer is set to the maximum that kmalloc can provide. However, it is known that anything beyond order-3 allocations (32kB) is costly and prone to fail under memory pressure. And a smaller buffer is not the end of the world for the stat output either: since the proc file uses the seq_file functions, it would only cause more read calls instead of filling all stats into one single buffer.
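
A minimal sketch of what a more conservative stat_open() in fs/proc/stat.c could look like along those lines, assuming the 3.11-era code layout (buffer sized from num_possible_cpus() and capped at KMALLOC_MAX_SIZE); this illustrates the idea and is not necessarily the exact patch in the test kernels below:

static int stat_open(struct inode *inode, struct file *file)
{
        /* base the estimate on online CPUs instead of possible CPUs */
        size_t size = 1024 + 128 * num_online_cpus();
        struct seq_file *m;
        char *buf;
        int res;

        /* minimum size to display an interrupt count: 2 bytes each */
        size += 2 * nr_irqs;

        /* cap at an order-3 (32kB with 4kB pages) allocation rather
         * than KMALLOC_MAX_SIZE; anything larger is costly and prone
         * to fail under fragmentation */
        if (size > (PAGE_SIZE << 3))
                size = PAGE_SIZE << 3;

        buf = kmalloc(size, GFP_KERNEL);
        if (!buf)
                return -ENOMEM;

        res = single_open(file, show_stat, NULL);
        if (!res) {
                m = file->private_data;
                m->buf = buf;
                m->size = size;
        } else {
                kfree(buf);
        }
        return res;
}

As the follow-up discussion further down shows, the size cap is the risky half of this change: with single_open() the whole output must fit into this one buffer, so the cap is only safe while the real output stays below 32kB.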

Changed in linux (Ubuntu):
importance: Undecided → Medium
assignee: nobody → Stefan Bader (smb)
Stefan Bader (smb) wrote :

This would be the proposed way of fixing the issue. I took a current 3.11 kernel with the patch applied and prepared packages, which can be found at http://people.canonical.com/~smb/lp1319244/
Could you please try how this works on your system and report back here with the results? Thanks.

Stephane Bakhos (sbakhos) wrote :

I've installed the kernel on one of the servers that is affected. It might take a few days before there are any results though.

Stephane Bakhos (sbakhos) wrote :

On a freshish reboot (42 minutes of uptime) with some of the QEMU VMs started:

root@lg-host2:~# cat /proc/buddyinfo
Node 0, zone DMA 0 0 0 1 1 1 1 0 1 1 3
Node 0, zone DMA32 6 4 9 7 7 7 5 3 4 1 702
Node 0, zone Normal 47259 43506 37987 31332 23517 15376 8779 4479 2145 0 0
Node 1, zone Normal 377 112 54 335 86 19 21 8 3 12 4008
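
(As an aid for reading these tables: each /proc/buddyinfo column counts the free blocks of one order, from order 0 on the left up to order 10 on the right, where an order-n block is 2^n pages. With 4kB pages the failing order-4 request above is 16 pages = 64kB, so the bug can only trigger once the columns from the fifth one onward run dry in the zone being allocated from, here DMA32 on node 0.)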

tags: added: patch
Stephane Bakhos (sbakhos) wrote :

The main use of the system is as a QEMU/KVM host; currently 14 VMs are running, plus some other processes.

Stefan Bader (smb) wrote :

Any news about how the test kernels fared?

Stephane Bakhos (sbakhos) wrote :

I don't think we've had enough uptime with it yet, as the buddyinfo still looks like the bug wouldn't trigger.

May 22:
cat /proc/buddyinfo
Node 0, zone DMA 0 0 0 1 1 1 1 0 1 1 3
Node 0, zone DMA32 84335 20475 5469 995 125 78 23 2 1 1 0
Node 0, zone Normal 116375 4708 229 27 41 12 13 2 0 0 0
Node 1, zone Normal 185773 42193 5699 1997 1017 510 235 85 24 0 0

May 26 (uptime 11 days)
# cat /proc/buddyinfo
Node 0, zone DMA 0 0 0 1 1 1 1 0 1 1 3
Node 0, zone DMA32 81329 25081 6593 1272 210 13 0 0 0 0 0
Node 0, zone Normal 334689 3247 3 0 0 0 0 0 0 0 0
Node 1, zone Normal 84556 109611 7577 1366 381 127 55 6 2 0 0

Slowly getting there...

For reference, this is the buddyinfo on an unpatched server with the same hardware (32 days of uptime):
Node 0, zone DMA 0 1 0 1 1 1 1 0 1 1 3
Node 0, zone DMA32 60621 32499 9670 10 0 0 1 0 0 0 0
Node 0, zone Normal 11045 7738 471 33 10 10 0 0 0 0 0
Node 1, zone Normal 113751 37130 5882 202 12 0 0 0 0 0 0

Stephane Bakhos (sbakhos) wrote :

So far, after 21 days of uptime, the issue hasn't come back.

# cat /proc/buddyinfo
Node 0, zone DMA 0 0 0 1 1 1 1 0 1 1 3
Node 0, zone DMA32 9544 583 7936 174 3 1 1 0 0 0 0
Node 0, zone Normal 36669 197 1 0 0 0 0 0 0 0 0
Node 1, zone Normal 103431 13022 845 105 20 4 2 1 1 0 0

Unfortunately I'll have to bring the computer down tomorrow morning, as there is a planned power outage.
I'll set up the patched kernel on the 2nd computer that is affected by the issue.

Stephane Bakhos (sbakhos) wrote :

Alright, the test kernel clearly fixed the issue.

My gut tells me that there might also be a memory leak at play in the unpatched version, but I don't know the kernel code well enough to analyze that.

Stefan Bader (smb) wrote :

It turns out we (you) were not the first ones to run into this. The IBM mainframe guys (obviously) have the same need to potentially support many CPUs. The upstream discussion is here: https://lkml.org/lkml/2014/5/21/251

So, while the change I made does work in this case, it can cause issues. First, /proc/stat uses the single_open interface. That means the proc file is not seekable and has no iterator, so all output has to be produced in one go, and therefore the buffer _has_ to be big enough to hold all of it. That means the part that limits the maximum size to order-3 allocations can cause problems on systems that really do have a lot of CPUs.
The other part, changing the buffer from x * the number of possible CPUs to x * the number of online CPUs, was also considered in the upstream thread. The problem there is that between open (when the buffer gets allocated) and show (when the buffer is filled) it is possible that a bunch of CPUs are brought online, so one needs to be extra careful about what happens then.

Sounds like the most likely outcome could be to have a vmalloc fallback option. But the discussion does not seem to be finished yet.
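
For reference, the vmalloc fallback that eventually landed upstream (the second break-fix commit in the description, 058504edd026...) takes roughly this shape in fs/seq_file.c; sketched from the upstream change, so treat the details as approximate:

static void *seq_buf_alloc(unsigned long size)
{
        void *buf;

        /* Try a physically contiguous allocation first, without
         * warning on failure; a high-order kmalloc may well fail
         * under fragmentation. */
        buf = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
        if (!buf && size > PAGE_SIZE)
                /* Fall back to stitching the buffer together from
                 * individual pages; no contiguity needed. */
                buf = vmalloc(size);
        return buf;
}

Callers then release the buffer with kvfree(), which dispatches to kfree() or vfree() depending on where the pointer came from. With this in place the seq_file buffer no longer depends on finding 16 physically contiguous free pages.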

Stefan Bader (smb) on 2014-07-11
description: updated
tags: added: kernel-bug-break-fix
Andy Whitcroft (apw) on 2014-07-14
Changed in linux (Ubuntu):
status: Confirmed → Fix Released
tags: removed: kernel-bug-break-fix
Stephane Bakhos (sbakhos) wrote :

Will the patch be integrated in Saucy Salamander?

Stefan Bader (smb) wrote :

I would doubt Saucy at this point, since it is almost EOL due to the reduced support period for non-LTS releases. Trusty should get it through the normal upstream-stable path.

Stefan Bader (smb) on 2014-07-14
Changed in linux (Ubuntu Trusty):
importance: Undecided → Medium
status: New → Triaged
tags: added: kernel-bug-break-fix
Andy Whitcroft (apw) on 2014-07-14
Changed in linux (Ubuntu Trusty):
status: Triaged → Confirmed
Andy Whitcroft (apw) on 2014-08-15
Changed in linux (Ubuntu Trusty):
status: Confirmed → Fix Committed
Andy Whitcroft (apw) on 2014-08-29
Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
tags: removed: kernel-bug-break-fix