Comment 6 for bug 1636847

Seth Forshee (sforshee) wrote :

@zyga: I'm honestly very surprised that the config change had that drastic an impact on the single-CPU system. Can you tell me what 'cat /sys/devices/system/cpu/possible' says on that system?
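For reference, that file holds a CPU list such as "0" or "0-3,8-11" rather than a count. A minimal sketch for turning it into a number follows; the function names are mine, not anything from the kernel or an existing tool, and it assumes the usual comma-separated range format.

```python
def count_possible_cpus(cpu_list):
    """Count the CPUs named in a kernel CPU-list string like '0-3,8'."""
    total = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue  # tolerate an empty string or a trailing comma
        if "-" in part:
            first, last = part.split("-")
            total += int(last) - int(first) + 1
        else:
            total += 1
    return total


def read_possible_cpus(path="/sys/devices/system/cpu/possible"):
    """Read the sysfs file and return the number of possible CPUs."""
    with open(path) as f:
        return count_possible_cpus(f.read())
```

Calling read_possible_cpus() on the affected system gives the number of per-CPU buffers the old config would have allocated for each mount.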

Can I assume that the "size-1M" implies a 1MB block size?
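If it's easier to check than to guess, the block size can be read straight from the image. This is a rough sketch that assumes a little-endian squashfs 4.x image, where the superblock starts with four 32-bit fields (magic, inode count, mkfs time, block size); the helper names are hypothetical. Running unsquashfs -s on the image should report the same value.

```python
import struct

SQUASHFS_MAGIC = 0x73717368  # the bytes "hsqs" read as a little-endian u32


def squashfs_block_size(header):
    """Return block_size from the first 16 bytes of a squashfs superblock."""
    if len(header) < 16:
        raise ValueError("need at least 16 bytes of superblock")
    magic, _inodes, _mkfs_time, block_size = struct.unpack("<IIII", header[:16])
    if magic != SQUASHFS_MAGIC:
        raise ValueError("not a little-endian squashfs image")
    return block_size


def image_block_size(path):
    """Read the block size from a squashfs image file (path is an assumption)."""
    with open(path, "rb") as f:
        return squashfs_block_size(f.read(16))
```

A result of 1048576 would confirm the 1MB assumption.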

Let's start with the simplest options for reducing RAM usage. The most obvious place for gains is the buffers squashfs uses for decompression and caching. squashfs has three caches:

- The "metadata" cache caches 8 blocks of metadata. The metadata block size is fixed at 8KB and the cache is fixed at 8 blocks, so this consumes 64KB of RAM (plus overhead). So there isn't a lot to be gained here.

- The "data" cache is the one affected by CONFIG_SQUASHFS_DECOMP_SINGLE. squashfs allocates RAM up front for each possible decompression thread, at fs_block_size bytes per thread. The previously used config option allocated one cache per possible CPU in the system (which is why I was surprised at the numbers for the single-CPU system; at a 1MB block size that implies the system supports over 100 CPUs). The only simple way to gain more here would be to reduce the block size of the filesystems.

- The "fragment" cache. This caches CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE blocks, which is currently set to 3 in our kernels. For a 1MB block size that is a fixed 3MB, so it would seem that this accounts for the bulk of the remaining RAM usage. We could reduce this to 1 to save some RAM, and reducing the block size of the squashfs images would again help here.
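To put rough numbers on the three caches, here is a back-of-the-envelope sketch of the per-mount buffer usage as described above. It ignores allocator and bookkeeping overhead, and the function and parameter names are purely illustrative, not kernel symbols.

```python
METADATA_BLOCK_SIZE = 8 * 1024  # fixed metadata block size, per the description above
METADATA_CACHE_BLOCKS = 8       # fixed number of cached metadata blocks


def squashfs_cache_bytes(block_size, decomp_buffers, fragment_blocks):
    """Estimate per-mount cache RAM in bytes, excluding overhead.

    block_size      -- filesystem block size in bytes
    decomp_buffers  -- data-cache buffers (1 with CONFIG_SQUASHFS_DECOMP_SINGLE,
                       one per possible CPU with the previous option)
    fragment_blocks -- value of CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE
    """
    metadata = METADATA_BLOCK_SIZE * METADATA_CACHE_BLOCKS
    data = block_size * decomp_buffers
    fragment = block_size * fragment_blocks
    return {
        "metadata": metadata,
        "data": data,
        "fragment": fragment,
        "total": metadata + data + fragment,
    }


MB = 1024 * 1024
current = squashfs_cache_bytes(MB, decomp_buffers=1, fragment_blocks=3)
proposed = squashfs_cache_bytes(MB, decomp_buffers=1, fragment_blocks=1)
saved_per_mount = current["total"] - proposed["total"]
```

With a 1MB block size this estimate drops from a little over 4MB to a little over 2MB per mount when the fragment cache goes from 3 blocks to 1. Shrinking the block size scales both the data and fragment terms.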

So if the images do have a 1MB block size there are two simple things that will yield the biggest gains - reducing the fragment cache to 1 block and reducing the block size of the squashfs images. Obviously any reduction in cache sizes may result in a loss of performance, depending on access patterns.

I'll go ahead and build a kernel with CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=1 for you to test. I would suggest taking a look at performance in addition to RAM usage when you're testing.

If 1MB is the block size you're using, would you be open to making this smaller?