Comment 5 for bug 1803202

Valentin Hirschi (valentin-hirschi) wrote :

The issue comes from the parallelisation of the Pythia8 shower, which unfortunately has to be managed by MadGraph since Pythia8 doesn't offer this functionality.

Note that I turn on the multicore parallel Pythia8 shower only when a sufficiently large number of events needs to be showered, which is probably why you don't see any problem with 100 events. You can test this by running the `shower pythia8 run_XX` command right *after* having specified `set nb_core 1`, which disables the PY8 parallelisation and should in effect "solve" your issue.

The way I implemented this parallelisation is by splitting the original .lhe file into several smaller ones, showering them on separate cores, and then recombining the HEPMC files generated (which is a delicate task because these files are large).
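
As a rough illustration of the splitting step only (this is not the actual MadGraph code, and the function name `split_lhe` is made up for the example), a minimal Python sketch could look like the one below. It assumes the usual LHE layout, where everything before the first `<event>` block is a header shared by all pieces, and it does not update any event count stored in that header.

```python
import re


def split_lhe(text, n_parts):
    """Split the text of an LHE file into at most n_parts smaller LHE documents.

    Hypothetical helper for illustration; not MadGraph's implementation.
    Each piece gets a copy of the original header and a closing tag.
    """
    first_event = re.search(r"<event\b", text)
    if first_event is None:
        return []  # no events to distribute

    # Everything before the first event (opening tag, <header>, <init>) is shared.
    header = text[: first_event.start()]
    events = re.findall(r"<event\b.*?</event>", text, flags=re.DOTALL)
    footer = "</LesHouchesEvents>\n"

    # Ceiling division so that no more than n_parts pieces are produced.
    chunk_size = -(-len(events) // n_parts)
    return [
        header + "\n".join(events[i : i + chunk_size]) + "\n" + footer
        for i in range(0, len(events), chunk_size)
    ]
```

Each returned string could then be written to its own file and showered separately; recombining the resulting HEPMC output is the harder part and is not covered by this sketch.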

Therefore, to help us diagnose the problem, it would be helpful if you could investigate why the Pythia8 shower crashed on these individual smaller LHE files whereas it didn't on the larger file.
You can do so by monitoring these runs more closely in the 'Events/run_XX' directory. There you can find the './run_shower.sh' script, which lets you launch the PY8 shower by hand, exactly like MadGraph does. When the parallelisation is active, you will also find multiple folders named 'PY8_parallelisation' in that 'Events/run_XX' directory; each contains a script for running the shower as well as the *PY8 log files* of the runs on the smaller split LHE files that failed.
Looking into these "split logs" to understand why the PY8 shower failed in this "parallelised case", and possibly sharing here the interesting part of these logs, could help us fix this issue of PY8 parallelisation.
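
If it helps, the end of every split log can be gathered in one go with a small Python sketch like the one below. It is only a convenience under stated assumptions: the helper name `collect_log_tails` is made up, the folders are assumed to have names starting with `PY8_parallelisation`, and the logs are assumed to match `*.log` (adjust the pattern to whatever the files are actually called).

```python
from pathlib import Path


def collect_log_tails(run_dir, n_lines=30, pattern="*.log"):
    """Return {log path: last n_lines lines} for the split PY8 runs.

    Hypothetical helper: the folder prefix and the '*.log' pattern
    are assumptions, not names confirmed by MadGraph.
    """
    tails = {}
    for folder in sorted(Path(run_dir).glob("PY8_parallelisation*")):
        if not folder.is_dir():
            continue
        # Search recursively in case the logs sit in sub-folders.
        for log in sorted(folder.rglob(pattern)):
            lines = log.read_text(errors="replace").splitlines()
            tails[str(log)] = lines[-n_lines:]
    return tails


if __name__ == "__main__":
    # 'Events/run_XX' is the placeholder from the comment above; use your run name.
    for path, tail in collect_log_tails("Events/run_XX").items():
        print("==== " + path)
        print("\n".join(tail))
```

The last few dozen lines of a failed run are usually where the error message sits, so that is the part worth pasting here.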

Thank you for your help with this and sorry for the late reply.