(In reply to comment #20)
> What isn't clear yet is the root cause. No matter what,
> we both agree the --rebuilddb leads to better performance.
>
> But degradation is my primary concern. If Berkeley DB
> degrades over time, then a "fix" should be attempted.

Well, I have no idea how the DB behaves here on my ext3 filesystem. My system partition is ~8GB and usually has around 0.7GB of free space.

But as you like plain numbers and I like my simple 'wall' clock experiments, let's dig a bit deeper here :).

I have a 1GB 'play' partition for experiments (usually for lvm :)), so a completely fresh ext2/ext3/ext4 filesystem was used for the following test. `hdparm -t` gives 30MB/s for this test partition.

Using a non-fragmented 80MB file:

1. pread()     ~7.5s  0.06s
2. mmap()      ~5.8s  0.10s
3. mmap() ADV  ~2.8s  0.07s
4. `cat`       ~2.8s  0.06s

Timing was nearly the same for all extX.
(Note - `cat` is slightly faster as it is not doing memcpy - thus mmap with just reading the page has the same or better speed - I can attach the updated source file if needed)

So obviously pread() is by far the worst way to do this job.

Let's continue with the experiments. The oddly fragmented DB file /var/lib/rpm/Packages on my system drive actually reads pretty slowly, as a plain `cp` of Packages to Packages.copy reveals (`hdparm -t` of this system drive gives 43MB/s):

original - uncached `cat` of Packages      - 3.7s
copied   - uncached `cat` of Packages.copy - 2.0s - wow, roughly 45% less time.

So all this suggests some possible changes in the code:

1. pread() could be replaced with mmap() - probably a pretty easy change. I think it could be optional at first, i.e. something like `rpm --use-mmap`, and if there are no problems and users are happy, it could become the default. mmap should probably also give a significantly smaller memory footprint for the application.

2. From time to time a full copy of the Packages file should probably be made, to defragment the strange file layout in the filesystem. Note that this Packages file is just 1 day old, from a fresh rebuilddb, and only a few packages have been modified since.

3. Hmm, just wondering if using plain small ASCII files could be any worse than this Berkeley DB (i.e. the /var/lib/dpkg/info way).

4. Wait till all users switch to SSD - and apply only 1.) to save memory ;)

>
> BTW, tnx for the block trace. I can almost see some patterns
> that I can map into Berkeley DB and RPM code.
>
> FYI, you need to control for whether "rpm -qa" is run as root (or not).
> If run as root, then a dbenv is opened, and there is a memory pool
> cache that is interposed. Your programs lists blocks solely
> for Packages afaict. There should be additional I/O occurring to the
> memory pool as well.

Yep - 4 of the pread() calls in my strace seem to come from a different file descriptor, but they probably have very little impact on the total time.

Just another simple strace check of pread() calls, this time for installing a single small rpm file - there are some ~10000 pread() calls as well, which is slightly fewer than for -qa but still a pretty high number.

>
> With a sequential access like "rpm -qa", cache blow out is inevitable.

rpm -i does not seem to be doing a very different job after all...

> There is an issue of rpmdb fragmentation on ext4 reported by sandeen.

Yep - revealed by my plain simple experiment as well, and I'm running ext3.

> In empty chroot's, an rpmdb is the most fragmented
> Whether fallocate (or equivalent) could/should be used to address
> rpmdb fragmentation is not yet clear. Certainly fallocate would reduce

I think large DB files should probably be split into smaller pieces according to their usage, so that the most common tasks only need to load a small amount of data. Rarely used commands like rpm -q --changelog (I'm still wondering why changelogs are part of the DB and not stored somewhere like a /usr/share/doc/pkg/changelog file) could load their data from a separate DB.

Anyway, take these just as my ideas - nothing you should probably worry about. Maybe there is a way to improve Berkeley DB to handle all of this while still keeping one file...
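
In case anyone wants to repeat the pread() vs. mmap() comparison from this comment, here is a rough Python sketch of the idea. It is not the original test program (that source was not attached), and several things are assumptions: the 4096-byte block size, the helper names, and the guess that "mmap() ADV" means madvise() with MADV_SEQUENTIAL. Python adds its own overhead, so only the relative ordering is worth looking at, not the absolute times.

```python
import mmap
import os
import time

BLOCK = 4096  # assumed read size; the size used in the original test is not stated


def read_with_pread(path, block=BLOCK):
    """Read the whole file with sequential pread() calls; return bytes read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            chunk = os.pread(fd, block, offset)
            if not chunk:  # empty result means end of file
                return offset
            offset += len(chunk)
    finally:
        os.close(fd)


def read_with_mmap(path, advise=False, block=BLOCK):
    """Map the file read-only and touch one byte per block; return bytes mapped."""
    size = os.path.getsize(path)
    if size == 0:
        return 0  # an empty file cannot be mapped
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Assumption: MADV_SEQUENTIAL stands in for the "ADV" variant.
            # madvise() needs Python 3.8+ and a platform that provides it.
            if advise and hasattr(mm, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                mm.madvise(mmap.MADV_SEQUENTIAL)
            for offset in range(0, size, block):
                _ = mm[offset]  # touching the byte faults the page in
            return size


def compare(path):
    """Return {method name: wall-clock seconds} for one pass of each method."""
    runs = {
        "pread": lambda: read_with_pread(path),
        "mmap": lambda: read_with_mmap(path),
        "mmap+madvise": lambda: read_with_mmap(path, advise=True),
    }
    timings = {}
    for name, run in runs.items():
        start = time.perf_counter()
        run()
        timings[name] = time.perf_counter() - start
    return timings
```

Calling `compare("/var/lib/rpm/Packages")` gives one timing per method. For uncached numbers on Linux the page cache has to be flushed before every single run (as root: `sync; echo 3 > /proc/sys/vm/drop_caches`), otherwise the second and third methods just read from memory.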