(In reply to comment #5)
> Now you're talking ...
>
> 1) No one (certainly not me) said your reported results were invalid. I don't
> think you've analyzed the results correctly (see your callgrind measurements,
> dataLength is not I/O related per se). And it's premature to advertise a
> performance "fix" like running --rebuilddb periodically when the performance
> problem is poorly characterized and understood.

I have not done any --rebuilddb analysis. I only wrote that rpm -qa took
47 seconds before the rebuild and 12 seconds after it. I did not save the older
dataset for analysis because I did not expect any problems. So it is simply a
fact that rpm -qa on my machine got ~4 times faster after --rebuilddb.

> 2) I did not say your metric was "invalid", read what I wrote. I have tried
> the results myself. I run callgrind on rpm at least weekly and already know
> (and have fixed) many performance problems in RPM. I did point out that there
> are other issues than I/O, and suggested callgrind, where I/O overhead is
> _NOT_ the issue that you have measured. I did point out that you have another
> level of caching that needs to be controlled for useful I/O metrics.

Do you flush disk buffers within your tests? The time when all data is buffered
in memory is 'almost' acceptable (though there is still some room for
improvement, possibly limited by the DB format, which is probably nontrivial to
change). My report is mainly about the case when no data is in memory, where a
trivial query for an installed package takes 12 seconds.

> 3) If "sleeps and waits" is the issue (it's not afaik), that was not at all
> clear from your wall clock benchmarks. And I most definitely know both rpm
> and db4 sources, in fact I have achieved a measured (w callgrind) 14.6x
> performance increase @rpm5.org by running careful (better than wallclock)
> benchmarks. But that's not relevant here.

Is rpm 4.7 going to be replaced by rpm 5, or is that a project unrelated to
Fedora's rpm package?

> 4) Stare at the numero uno piggy in the callgrind spewage. When you start to
> realize that serialization and marshalling is the issue, then you will begin
> to understand the performance issue.

As I've said, callgrind will not show I/O stalls.

> 5) I'm not sure how SHA512 is related other than through signatures, where
> --nosignature is the disabler. In all cases, verifying digests on header
> blob's is overhead unrelated to I/O performance and must be controlled for.

Sure, it's not related to slow disk reads; it's just what callgrind shows. I
was just curious how many memory chunks need to be checksummed for every simple
rpm command. Maybe it would be effective to use a short-lived daemon to speed
up repeated invocations (if the daemon keeps a lock on the database).

> 6) yum performance depends on many factors unrelated to rpm. But run
> benchmarks on yum if you wish to understand yum performance problems. Without
> measurements, feel free to claim anything you wish about the cause of yum's
> pathetic performance, your opinion is as good or bad as anyone else's.

Yeah, sure, Python is a much bigger CPU eater in this case, but rpm is not
negligible either...
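
To put a number on the difference between points 2 and 4, here is a rough sketch of how wall-clock time and CPU time could be compared for a single command. The gap between the two is roughly the time the process spent blocked (disk reads, locks, sleeps), which is the part a CPU profiler such as callgrind does not account for. It assumes Python 3 on a Unix-like system; `time_command` is a hypothetical helper written for this illustration, not part of rpm or any of its tools.

```python
import resource
import subprocess
import time


def time_command(argv):
    """Run argv once and return (wall, cpu, blocked) in seconds.

    time_command is a hypothetical helper for illustration only.
    blocked = wall - cpu is only an approximation of time spent
    waiting (I/O, locks, sleeps); it is clamped at zero because a
    multi-threaded child can use more CPU time than wall time.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    # Output is discarded so that terminal speed does not skew the timing.
    subprocess.run(argv, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User + system CPU time consumed by the child that was just waited for.
    cpu = ((after.ru_utime - before.ru_utime)
           + (after.ru_stime - before.ru_stime))
    return wall, cpu, max(wall - cpu, 0.0)
```

Calling `time_command(["rpm", "-qa"])` twice in a row would give a cold-ish and a warm measurement. For a genuinely cold run the page cache has to be emptied first; on Linux that means running `sync` and then writing `3` to `/proc/sys/vm/drop_caches` as root. If the blocked share is large on the cold run and small on the warm one, the slowdown is I/O wait rather than CPU work.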