strange malloc interaction with sysctl vm.overcommit_memory=2

Bug #345601 reported by Smoot
Affects          Status   Importance   Assigned to   Milestone
glibc (Ubuntu)   New      Undecided    Unassigned

Bug Description

Binary package hint: libc6

Pertinent system information is:

smoot@smoot:~/tmp$ lsb_release -rd
Description: Ubuntu 8.10
Release: 8.10
smoot@smoot:~/tmp$ uname -a
Linux smoot 2.6.27-11-generic #1 SMP Thu Jan 29 19:28:32 UTC 2009 x86_64 GNU/Linux

It appears that malloc behaves inconsistently when vm.overcommit_memory is set to 2. According to the kernel documentation, when this flag is set, memory allocation should be restricted to the amount of swap plus an overcommit percentage (vm.overcommit_ratio) of physical RAM. I used a small C program available from another contributor to test.

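#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>     /* sbrk() */

int main (void) {
        int n = 0;

        size_t size = 0x100000; /* 1 MiB per allocation */

        while (1) {
                /* Deliberately never freed: the goal is to hit the limit. */
                if (malloc(size) == NULL) {
                        printf("malloc failure after %d MiB\n", n);
                        return 0;
                }
                /* Running total, then the current program break in hex. */
                printf ("got %d MiB %lx\n", ++n, (unsigned long) sbrk(0));
        }
}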

This allocates memory in 1 MiB chunks. Even with overcommit_memory=2 and overcommit_ratio=50, a single process was able to allocate a terabyte of virtual memory on a system with 4 GiB of RAM and 4 GiB of swap. However, after changing the chunk size to size = 0x20000000 (labeled 1 GiB in the output below, although 0x20000000 bytes is 512 MiB and 1 GiB would be 0x40000000), the program stops after allocating up to the amount of free swap plus the overcommitted portion of free memory. The output looked like:

got 1 GiB 17ea000
got 2 GiB 17ea000
got 3 GiB 17ea000
got 4 GiB 17ea000
malloc failure after 4 GiB

I also changed the label strings on the printf statement in the program for clarity.

It appears that for large chunks the overcommit_memory setting is honored, but for smaller chunks the overcommit flags are ignored. As an additional sanity test, I ran the same test using sbrk() directly to allocate more space. The program is:

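#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>     /* sbrk() */

int main (void) {
        int n = 0;

        size_t size = 0x100000; /* 1 MiB per step */
        printf("%zu\n", size);

        while (1) {
                /* sbrk() returns (void *) -1 on failure, not NULL. */
                if (sbrk(size) == (void *) -1) {
                        printf("sbrk failure after %d MiB\n", n);
                        return 0;
                }
                /* Running total, then the current program break in hex. */
                printf ("got %d %lx MiB\n", ++n, (unsigned long) sbrk(0));
        }
}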

This program properly failed to allocate more VM at the appropriate point. The tail of the output was:

got 4361 11285000 MiB
got 4362 11385000 MiB
got 4363 11485000 MiB
got 4364 11585000 MiB
got 4365 11685000 MiB
got 4366 11785000 MiB
sbrk failure after 4366 MiB

I then looked at the current implementation of malloc() and confirmed that it uses both sbrk() and mmap() for VM allocation. mmap appears to be used for large chunks of VM. I also ran strace on the program that ignored the allocation limit. The relevant system call output looked like this:

mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(0x7f4f64000000, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f4f64000000
mprotect(0x7f4f64000000, 1183744, PROT_READ|PROT_WRITE) = 0
write(1, "got 37889 MiB\n", 14) = 14

I do not understand all of the malloc() algorithm, but it appears that when an mmap allocation fails (the 1052672-byte request in the trace above), malloc() falls back on reserving a large block of VM (64 MiB in the trace) with an mmap call that sets the MAP_NORESERVE flag, and then makes part of that block usable with mprotect. This flag tells the kernel not to reserve swap space to back the block of VM. I think this is the core of the problem reported above. I recompiled malloc() with that flag removed, and the correct behavior appeared for the tested allocation chunk sizes.

I think MAP_NORESERVE needs to be removed from the mmap calls used in malloc(), so the overcommit_memory flag is respected when it is set to 2.
