RPM

Comment 4 for bug 635834

Zdenek (zdenek-redhat-bugs) wrote:

Let's go into details:

1.) I mentioned --rebuilddb purely because rpm gets much, much slower over time - I have not measured the time --rebuilddb itself takes. Apparently either db4 or rpm seriously fragments its data over months of usage and gets slower. Is it documented anywhere that I should run 'rpm --rebuilddb' regularly to get decent performance?

2.) My metric is perfectly valid, and ideally you should try it yourself. I think that if there is a more than 100% speedup just from using cat, then either this should be a configurable option for rpm (so rpm would do it itself - even a simple wrapper script could handle that), or rpm's read capabilities should be enhanced. The 1st option is probably simpler to implement; the 2nd choice is the preferable long-term solution.
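To make the "simple wrapper script" idea concrete, here is a minimal sketch. It assumes the cat speedup comes from reading the database file sequentially into the page cache before rpm touches it. The function names (prewarm, timed_run, compare) are made up for illustration and are not part of rpm:

```python
import subprocess
import time


def prewarm(path, chunk_size=1 << 20):
    """Read the file sequentially, like `cat path >/dev/null`.

    Returns the number of bytes read. The sequential read lets the
    kernel's readahead pull the whole file into the page cache.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def timed_run(cmd):
    """Run a command, discard its output, return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


def compare(db_path, cmd):
    """Time the prewarm step and the command that follows it.

    Hypothetical helper: the caller chooses the database path and
    the rpm command line; nothing here is specific to rpm.
    """
    start = time.perf_counter()
    size = prewarm(db_path)
    prewarm_seconds = time.perf_counter() - start
    return {
        "bytes_read": size,
        "prewarm_seconds": prewarm_seconds,
        "command_seconds": timed_run(cmd),
    }
```

Something like compare("/var/lib/rpm/Packages", ["rpm", "-qa"]) would be the intended use; the rpm command line here is only an example. For a fair comparison against the non-prewarmed case, the page cache has to be cold before each run.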

Also I believe you are probably slightly missing what I want to point out in this bugzilla.

I have never read anything about tuning libdb to improve rpm performance, and I am not sure why I should have to - why isn't rpm's default to go as fast as possible? Why should the user have to tune libdb to get decent rpm speed?

3.) My term "overhead" here does not mean that rpm 'burns' CPU - it actually sleeps and waits nearly all the time, because of some very low-performing file access pattern. It reminds me of using mmap without readahead, but that's just a wild guess - I do not know the rpm sources (nor libdb4).

4.) I'll attach callgrind output, but I'm not really sure it helps here. I can also provide output from oprofile, perf, or any other tool you would like - I could even trace the disk accesses, though that will probably be quite a long trace. But IMHO, why don't you try it yourself? I've observed the same behavior on many RHEL and Fedora machines around me, so it's definitely not a problem of my personal laptop.

5.) Just a side note - when using rpm without the --nodigest option, it reveals a very high load from SHA512 invocations - so again my simple 'dumb' time test shows:

sha512sum /var/lib/rpm/Packages: 0.46s (when cached)
cat >/dev/null: 0.04s (when cached)

rpm with digesting: 1.1s
rpm without digest (--nodigest --nosignature): 0.3s

This gives a 0.8s difference - it looks as if sha512sum could checksum my 80MB Packages file nearly twice within this period of time - so again I may ask a simple question: what is being calculated so heavily inside the nss-softokn library when it is called from rpm?
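The arithmetic behind "nearly twice" can be written out as a small calculation. It uses only the timings quoted above; the variable names are just labels chosen for this sketch:

```python
# Timings quoted in this comment (seconds, warm page cache).
rpm_with_digest = 1.1
rpm_no_digest = 0.3
sha512sum_one_pass = 0.46
packages_size_mb = 80  # approximate size of the Packages file

# Time attributable to digest/signature checking inside rpm.
digest_overhead = rpm_with_digest - rpm_no_digest

# How many full sha512sum passes over the file fit into that time.
equivalent_passes = digest_overhead / sha512sum_one_pass

# The same figure expressed as an amount of hashed data.
equivalent_mb = equivalent_passes * packages_size_mb

print(f"digest overhead:   {digest_overhead:.2f} s")
print(f"equivalent passes: {equivalent_passes:.2f}")
print(f"equivalent data:   {equivalent_mb:.0f} MB")
```

So the digest overhead corresponds to roughly 1.7 sha512sum passes over the file, i.e. about 139 MB hashed at sha512sum's speed.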

6.) --nodigest reveals that a lot of time is spent in rpm-4.7.1/lib/header.c:dataLength - from just a plain look at it, it appears to me that rpm actually spends most of its time scanning strings inside a binary file - why not store the string size along with the string, or use some index for this?
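To illustrate the difference between the two approaches, here is a minimal sketch. It is NOT rpm's actual header format or the dataLength code, only a toy comparison of finding the extent of a run of strings by scanning for terminators versus reading a stored size. All function names and the 4-byte big-endian length prefix are assumptions made for this example:

```python
import struct


def encode_nul_terminated(strings):
    """Pack strings back to back, each ending in a NUL byte."""
    return b"".join(s.encode("utf-8") + b"\0" for s in strings)


def encode_length_prefixed(strings):
    """Pack strings back to back, each preceded by a 4-byte length."""
    out = bytearray()
    for s in strings:
        data = s.encode("utf-8")
        out += struct.pack(">I", len(data)) + data
    return bytes(out)


def span_by_scanning(buf, offset, count):
    """Bytes occupied by `count` NUL-terminated strings.

    Every byte of every string has to be examined to find
    the terminators.
    """
    pos = offset
    for _ in range(count):
        pos = buf.index(b"\0", pos) + 1
    return pos - offset


def span_by_prefix(buf, offset, count):
    """Bytes occupied by `count` length-prefixed strings.

    Only the 4-byte prefixes are read; string bodies are skipped.
    """
    pos = offset
    for _ in range(count):
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4 + length
    return pos - offset
```

The scanning variant does work proportional to the total string bytes, the prefixed variant proportional to the number of strings, at the cost of 4 extra bytes per string in this toy encoding.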

I'm not sure whether that's the reason why yum is so slow, but I think it's part of the puzzle...

If you need more details let me know.