combining ROOT files in fixed-order mode
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MadGraph5_aMC@NLO | In Progress | Undecided | Olivier Mattelaer | | |
Bug Description
Dear authors,
what I am reporting is not a bug, but rather a feature that could be improved.
I am generating a process at NLO such that the last calculation is split into about 1500 jobs, and I am filling about 50000 histograms, so that each resulting ROOT file has a size of 30 MB. At the end of the run, combining them in the standard way via combine_root.C has the following difficulties:
(1) it takes about 12 hours and uses about 100 GB of RAM, so it is far from certain to succeed
(2) it does not even work out-of-the-box because of the hardcoded limit of 1000 files in combine_root.C (of course this was easy to change).
I have also tried the standard ROOT command 'hadd'. It seems to perform much more efficiently and gives an identical combined file: with the same input it takes ~ 30 minutes and uses only 20 GB.
I am using MG5_aMC_v2_6_0, gcc 5.3 and ROOT 5.34.
Is there a reason to use the custom script (combine_root.C) for combining output ROOT files? What do you think, would it make sense to switch to the standard ROOT tool 'hadd' instead?
Best regards,
Sasha.
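For readers who want to try the hadd route described above, here is a minimal sketch in Python of how the merge commands could be assembled. It only builds the command lines and does not run them. The file names (`job_0001.root`, `combined.root`), the helper name `hadd_commands` and the optional two-stage merge through partial files are illustrative assumptions and are not part of MG5_aMC. The only hadd option used is `-f` (overwrite the target if it exists).

```python
import shlex


def hadd_commands(inputs, output, chunk_size=0):
    """Return the hadd invocations (as argument lists) that merge `inputs` into `output`.

    chunk_size <= 0 (or enough to hold all inputs): a single hadd call.
    chunk_size > 0: merge groups of `chunk_size` files into partial files first,
    then merge the partial files into `output`. The partial file names are
    hypothetical and chosen only for this sketch.
    """
    inputs = list(inputs)
    if not inputs:
        raise ValueError("no input files given")
    if chunk_size <= 0 or len(inputs) <= chunk_size:
        return [["hadd", "-f", output] + inputs]

    commands, partials = [], []
    for start in range(0, len(inputs), chunk_size):
        partial = "%s.part%d.root" % (output, len(partials))
        partials.append(partial)
        commands.append(["hadd", "-f", partial] + inputs[start:start + chunk_size])
    # Final stage: add up the partial sums.
    commands.append(["hadd", "-f", output] + partials)
    return commands


if __name__ == "__main__":
    # Hypothetical per-job output files.
    files = ["job_%04d.root" % i for i in range(1, 6)]
    for cmd in hadd_commands(files, "combined.root", chunk_size=2):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each returned list can be passed to `subprocess.check_call` once the paths are adapted to the actual run directory. Whether the staged variant helps with memory or run time has not been measured here; it is shown only as an option.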
Changed in mg5amcnlo: | |
assignee: | nobody → Stefano Frixione (stefano-frixione) |
Changed in mg5amcnlo: | |
status: | New → In Progress |
Dear Sasha,
thanks for reporting this.
Admittedly, that part of the package was never meant to be rock solid, and hasn't been
tested with different root versions, or with large files. Thus, in this case it is not
surprising that it doesn't work well.
In spite of being the original author of that piece of code, I can't remember why hadd
was not deemed a viable option. Maybe it was simply due to my poor understanding of root.
I'll investigate as soon as I have some time. In the meantime, I have nothing against
you using hadd, but I strongly suggest testing its results against those of the standard
package in a few cases where the latter is supposed to do well, and which are quick to run.
Best, Stefano.
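The cross-check suggested above (hadd output versus combine_root.C output on a small, quick run) could be sketched as follows. This is only an illustration under stated assumptions: the helper names `compare_histograms` and `read_histograms` are hypothetical, the tolerances are arbitrary defaults, and the reader function assumes that PyROOT is available and that the histograms are one-dimensional TH1 objects stored at the top level of each file. Only bin contents are compared, not bin errors.

```python
def compare_histograms(reference, candidate, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two {histogram name: [bin contents]} mappings.

    Returns a list of human-readable differences; an empty list means the
    two sets agree within the given tolerances.
    """
    problems = []
    for name in sorted(set(reference) | set(candidate)):
        if name not in candidate:
            problems.append("%s: missing in candidate" % name)
        elif name not in reference:
            problems.append("%s: missing in reference" % name)
        elif len(reference[name]) != len(candidate[name]):
            problems.append("%s: different number of bins" % name)
        else:
            for i, (a, b) in enumerate(zip(reference[name], candidate[name])):
                allowed = max(rel_tol * max(abs(a), abs(b)), abs_tol)
                if abs(a - b) > allowed:
                    # Report only the first differing bin per histogram.
                    problems.append("%s: bin %d differs (%r vs %r)" % (name, i, a, b))
                    break
    return problems


def read_histograms(path):
    """Read bin contents of top-level 1D histograms from a ROOT file.

    Assumption: PyROOT is installed; the import is kept local so that the
    comparison above can be used without it.
    """
    import ROOT

    root_file = ROOT.TFile.Open(path)
    contents = {}
    for key in root_file.GetListOfKeys():
        obj = key.ReadObj()
        if obj.InheritsFrom("TH1"):
            # Bins 0 and GetNbinsX() + 1 are the underflow and overflow bins.
            contents[obj.GetName()] = [
                obj.GetBinContent(i) for i in range(obj.GetNbinsX() + 2)
            ]
    root_file.Close()
    return contents
```

A typical use would be `compare_histograms(read_histograms("from_combine_root.root"), read_histograms("from_hadd.root"))`, with both file names being placeholders for the outputs of the two merging methods on the same small run.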