Comment 8 for bug 1882254

Mike Hance (mhance) wrote :

Hi again,

I'm still playing around with this, and maybe this helps more:

From the log file, it looks like we're crashing with this exception in banner.py:

    def modify_init_cross(self, cross):
        """modify the init information with the associate cross-section"""
        assert isinstance(cross, dict)
        # assert "all" in cross
        assert "init" in self

        cross = dict(cross)
        for key in cross.keys():
            if isinstance(key, str) and key.isdigit() and int(key) not in cross:
                cross[int(key)] = cross[key]

        all_lines = self["init"].split('\n')
        new_data = []
        new_data.append(all_lines[0])
        for i in range(1, len(all_lines)):
            line = all_lines[i]
            split = line.split()
            if len(split) == 4:
                xsec, xerr, xmax, pid = split
            else:
                new_data += all_lines[i:]
                break
            if int(pid) not in cross:
                raise Exception

I think this is complaining that we can't find a process ID in the init block from the LHE files that are being merged together. I see two partial LHE files in Events/run_01/:

atlas02 | test_guess_008_26 [51]: ls PROC_MSSM_SLHA2_0/Events/run_01/
total 4.1M
-rw-r--r--. 1 nobody users 44K Aug 5 21:51 run_01_tag_1_banner.txt
-rw-r--r--. 1 nobody users 1.1M Aug 5 21:52 partials0.lhe.gz
-rw-r--r--. 1 nobody users 3.0M Aug 5 21:52 partials1.lhe.gz

Looking at the init blocks for those two files:

<init>
2212 2212 6.500000e+03 6.500000e+03 0 0 260000 260000 -4 12
   +7.3565871e-03 +6.7896839e-05 +1.1766687e+01 11
   +4.7601446e-03 +8.1563638e-05 +6.9288533e+00 10
   +1.8776607e+00 +4.9773670e-03 +7.8777596e+00 13
   +2.1637021e-03 +1.3261012e-05 +6.0894384e+00 12
   +5.8831157e-01 +3.3798544e-03 +7.8188791e+00 15
   +1.1177685e+00 +5.1275003e-03 +7.6873225e+00 14
   +1.2539300e-04 +1.6456028e-06 +7.8213219e+00 17
   +1.9666514e-04 +3.5053501e-06 +7.7764568e+00 16
   +6.5569949e-05 +8.3775434e-07 +7.8094409e+00 18
   +2.3384995e+00 +6.0486786e-03 +7.8168534e+00 1
   +4.3344898e-01 +4.0951418e-03 +7.8023543e+00 3
   +1.4318232e+00 +8.0008571e-03 +7.8106911e+00 2
<generator name='MadGraph5_aMC@NLO' version='2.6.7'>please cite 1405.0301 </generator>
</init>

<init>
2212 2212 6.500000e+03 6.500000e+03 0 0 260000 260000 -4 6
   +4.8839995e-03 +4.5076354e-05 +3.0997368e+00 11
   +5.3667600e-03 +9.1957815e-05 +3.1045436e+00 10
   +1.8619488e+00 +4.9357178e-03 +3.1033280e+00 13
   +2.7757069e-03 +1.7011904e-05 +3.1044497e+00 12
   +9.1119429e-02 +8.2545211e-04 +3.0978857e+00 15
   +1.1358744e+00 +5.2105551e-03 +3.1064749e+00 14
<generator name='MadGraph5_aMC@NLO' version='2.6.7'>please cite 1405.0301 </generator>
</init>

so it looks like the process IDs are not unique to one file, and not all process IDs are listed. (E.g. 10 appears in both files, and 4 isn't listed in either.) I guess the exception arises when the unweighting routine is trying to find a line for a process ID that isn't in the init block.
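
Reading the quoted snippet literally, the bare `raise Exception` fires when a line of the init block carries a process ID that has no entry in the `cross` dictionary, so it may be worth checking which side is the incomplete one. Below is a minimal stand-alone sketch of that membership test. It is not the MadGraph function itself: `find_unmatched_pids` is a name made up for illustration, and the `cross` values are hypothetical because the real dictionary is not shown in the log.

```python
def find_unmatched_pids(init_text, cross):
    """Return the process IDs listed in an init block that have no entry in cross.

    Stand-alone illustration of the check in the quoted snippet;
    this is not the MadGraph code.
    """
    cross = dict(cross)
    # Same normalisation as the snippet: digit-string keys also become int keys.
    # list() avoids changing the dict while iterating over it.
    for key in list(cross.keys()):
        if isinstance(key, str) and key.isdigit() and int(key) not in cross:
            cross[int(key)] = cross[key]

    unmatched = []
    lines = init_text.split('\n')
    for line in lines[1:]:          # lines[0] is the beam line
        split = line.split()
        if len(split) != 4:         # first non-process line ends the scan
            break
        pid = int(split[3])
        if pid not in cross:        # the quoted snippet raises Exception here
            unmatched.append(pid)
    return unmatched


# Beam line and first three process lines of the second init block above.
init_text = """2212 2212 6.500000e+03 6.500000e+03 0 0 260000 260000 -4 6
   +4.8839995e-03 +4.5076354e-05 +3.0997368e+00 11
   +5.3667600e-03 +9.1957815e-05 +3.1045436e+00 10
   +1.8619488e+00 +4.9357178e-03 +3.1033280e+00 13"""

# Hypothetical cross dictionary with no entry for process 10.
cross = {"11": 4.88e-03, "13": 1.86}
print(find_unmatched_pids(init_text, cross))
```

With this made-up dictionary the function reports process 10, which is the situation where the original code would raise.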

In my case, I'm trying to run with:

generate p p > n2 x1+ / susystrong @1
add process p p > n2 x1+ j / susystrong @2
add process p p > n2 x1+ j j / susystrong @3
add process p p > n2 x1- / susystrong @4
add process p p > n2 x1- j / susystrong @5
add process p p > n2 x1- j j / susystrong @6
add process p p > n1 x1+ / susystrong @7
add process p p > n1 x1- / susystrong @8
add process p p > n1 x1+ j / susystrong @9
add process p p > n1 x1- j / susystrong @10
add process p p > n1 x1+ j j / susystrong @11
add process p p > n1 x1- j j / susystrong @12
add process p p > x1+ x1- / susystrong @13
add process p p > x1+ x1- j / susystrong @14
add process p p > x1+ x1- j j / susystrong @15
add process p p > n2 n1 / susystrong @16
add process p p > n2 n1 j / susystrong @17
add process p p > n2 n1 j j / susystrong @18
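
Comparing the IDs in the two init blocks with the @N tags above can be sketched as follows. The ID lists are transcribed by hand from the last column of the init blocks, and the variable names are only for illustration.

```python
# Process IDs transcribed from the last column of the two init blocks above.
first_block = [11, 10, 13, 12, 15, 14, 17, 16, 18, 1, 3, 2]
second_block = [11, 10, 13, 12, 15, 14]

# IDs requested through the @N tags of the process definitions (@1 to @18).
expected = set(range(1, 19))

listed = set(first_block) | set(second_block)
missing = sorted(expected - listed)
shared = sorted(set(first_block) & set(second_block))

print("missing from both blocks:", missing)   # [4, 5, 6, 7, 8, 9]
print("present in both blocks:", shared)      # [10, 11, 12, 13, 14, 15]
```

So processes @4 through @9 do not appear in either init block, and @10 through @15 appear in both.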

Looking even more closely at the partial LHE files, I see that partials1.lhe.gz contains only x1+x1- events, and partials0.lhe.gz contains only x1+n2 and x1+x1- events. So most of the processes are completely missing. (This isn't because of cross sections -- x1+n1 and x1+n2 should have comparable cross sections, but I don't see the former process at all.)
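
A quick way to tally which processes actually show up in a partial file is to count events by process ID. The sketch below relies on the Les Houches event format, where the first line after `<event>` holds NUP, IDPRUP, XWGTUP and so on, so the second field is the process ID. The function names are made up for illustration, and the script has not been run on the files from this bug.

```python
import gzip
from collections import Counter


def count_events_by_pid(lines):
    """Count events per process ID from an iterable of LHE text lines."""
    counts = Counter()
    expect_header = False
    for line in lines:
        stripped = line.strip()
        if stripped == '<event>' or stripped.startswith('<event '):
            expect_header = True
        elif expect_header and stripped:
            # Event header line: NUP IDPRUP XWGTUP SCALUP AQEDUP AQCDUP
            counts[int(stripped.split()[1])] += 1
            expect_header = False
    return counts


def count_events_in_file(path):
    """Same count for a gzipped LHE file."""
    with gzip.open(path, 'rt') as handle:
        return count_events_by_pid(handle)
```

For example, `count_events_in_file('PROC_MSSM_SLHA2_0/Events/run_01/partials0.lhe.gz')` would return a Counter mapping each process ID to the number of events found for it.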

So it seems that something is going wrong with partitioning the subprocesses into these "partial" files. I couldn't quite figure out where this is happening. My best guess is that it is in madevent_interface.py, around line 3591, but I don't understand what that block is supposed to be doing.

In case it helps, I put my whole working area (zipped and tarred) on cernbox: https://cernbox.cern.ch/index.php/s/eaGhKfsWCxZ0gP6

It's big, obviously, but actually running all 18 processes with a few thousand events doesn't take so long, just a few minutes. It would be super helpful if we could track down this problem.

Thanks!

-Mike