To try to understand how the CMA is allocated - and what the problem might be - I booted a HiSilicon D06 w/ plenty of cma cushion (cma=128M), and looked at cma debugfs. In the upstream thread, Robin asked if this was potentially an issue with lots of 1 page allocations - something for which patches have been proposed. Here's a histogram of allocation sizes:
$ dmesg | grep "cma: cma_alloc(cma" | sed -r 's/.*count ([0-9]+)\,.*/\1/' | sort -n | uniq -c
2062 1
32 2
266 8
2 24
4 32
256 33
7 64
2 128
2 1024
So, there are 2062 1-page allocations. While that's the largest # of allocations for any size, it only accounts for about 13% of the allocated pages - just over 8M. Eliminating all of them would only get us down to ~53M. But we know total allocation size isn't the only problem - fragmentation must be playing a role, since we're only using ~61M yet still seeing errors w/ cma=64M.
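As a sanity check, those totals can be recomputed from the histogram itself; the only assumption is the 4K page size:

```shell
# Recompute totals from the histogram above: first field is the number of
# allocations, second is pages per allocation; pages assumed to be 4K.
printf '%s %s\n' 2062 1 32 2 266 8 2 24 4 32 256 33 7 64 2 128 2 1024 |
awk '{ total += $1 * $2; if ($2 == 1) single += $1 }
     END { printf "total: %d pages (~%dM), 1-page: %d pages (~%dM, %d%%)\n",
                  total, total * 4 / 1024, single, single * 4 / 1024,
                  100 * single / total }'
# -> total: 15630 pages (~61M), 1-page: 2062 pages (~8M, 13%)
```

The ~61M total matches the in-use figure, and dropping every 1-page allocation only saves ~8M.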
To understand the fragmentation impact, I looked at the debugfs cma bitmap. I converted that map to binary, coalescing repeated lines, and got this:
1111111111111111 x309
1111111111111111 \
1111111111111111  \__ x252
1                 /
0                /
1111111111111111 x48
0 x412
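For reference, the hex-to-binary conversion and line coalescing described above could be sketched roughly like this; the `bitmap.hex` input file and the 16-bits-per-word layout are illustrative assumptions, not the actual debugfs format:

```shell
# Expand each hex word of a bitmap dump into a binary row (preserving
# leading zeros), then coalesce identical consecutive rows into "<row> xN".
# bitmap.hex is a hypothetical capture of the debugfs cma bitmap.
tr -s ' \t,' '\n' < bitmap.hex |
awk 'BEGIN { split("0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111", nib, " ") }
     /^[0-9a-fA-F]+$/ {
         row = ""; hex = tolower($0)
         for (i = 1; i <= length(hex); i++)
             row = row nib[index("0123456789abcdef", substr(hex, i, 1))]
         print row
     }' |
uniq -c |
awk '$1 > 1 { print $2, "x" $1; next } { print $2 }'
```

Feeding it `ffff ffff 8000 0000` produces the same shape seen above: a coalesced all-ones run, then a `1000...` row, then zeros.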
I don't see any signs of fragmentation w/ the single page allocations. What is concerning is the 252 33-page allocations (a subset of the 256 in the histogram). They are getting placed on 64-page boundaries, which leaves 31 free pages stranded after each allocation: 256 * 31 * 4K = ~31M of potentially wasted space.
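The waste arithmetic can be checked directly; the 64-page alignment here is inferred from the requests being rounded up to a power-of-two boundary, which is my reading of the bitmap rather than something the logs state:

```shell
# Pages stranded when each 33-page allocation lands on the next
# power-of-two (64-page) boundary; counts taken from the histogram.
awk 'BEGIN {
    allocs = 256; pages = 33
    align = 1; while (align < pages) align *= 2
    stranded = allocs * (align - pages)
    printf "align=%d pages, stranded=%d pages (~%dM)\n",
           align, stranded, stranded * 4 / 1024
}'
# -> align=64 pages, stranded=7936 pages (~31M)
```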