MadSpin/Pythia8 fails in some processes on lxplus7

Bug #1788615 reported by James Howarth
This bug affects 1 person
Affects: MadGraph5_aMC@NLO
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

When generating some NLO processes on lxplus7 machines (running CentOS) with MadGraph5_aMC@NLO version v2_6_3_2, the run appears to fail at the end of the MadSpin step, before Pythia8 starts.

The setup is p p > t t~ [QCD] with MadSpin in full mode and Pythia8 out of the box. The only changes are the top mass (173 -> 172.5), the random seed, and forcing leptonic W decays in the MadSpin syntax. The jobs seem to work fine for 1000 events, but with more events (e.g. 40000 or 100000) they crash with the cryptic error:

Command "launch auto " interrupted with error:
OSError : [Errno 39] Directory not empty: '/afs/cern.ch/work/j/jhowarth/private/13TeV/MadGraph/MG5_aMC_v2_6_3_2/TTBAR_NLO_SPIN/full_me/SubProcesses/P2_gu_ttxu_t_bwp_wp_epve_tx_bxwm_wm_emvex'

Interestingly, an identical setup but with the process "p p > t t~ z [QCD]" seems to work, whereas ttgamma also fails (presumably it's not actually process related but something to do with memory requirements or something system related).

I also tried running Pythia8 multi-core or single core and it made no difference (I think it fails before I get to this point).

I have attached all the relevant cards as well as the log files to this ticket. Any help would be greatly appreciated.
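
For reference, errno 39 on Linux is ENOTEMPTY: it is what the operating system returns when a directory is removed while it still contains entries. The minimal sketch below reproduces the same error class with the Python standard library only. The directory and file names are hypothetical, and it says nothing about why entries remained in the MadSpin directory.

```python
import errno
import os
import tempfile


def rmdir_error_code(path):
    """Try os.rmdir(path); return the errno on failure, or None on success."""
    try:
        os.rmdir(path)
    except OSError as exc:
        return exc.errno
    return None


def demo_not_empty():
    """Create a directory holding one file and try to rmdir it."""
    with tempfile.TemporaryDirectory() as parent:
        target = os.path.join(parent, "P2_example")  # hypothetical name
        os.mkdir(target)
        with open(os.path.join(target, "leftover.txt"), "w") as handle:
            handle.write("still here\n")
        # rmdir only removes empty directories, so this fails.
        return rmdir_error_code(target)


if __name__ == "__main__":
    code = demo_not_empty()
    print(code == errno.ENOTEMPTY, errno.errorcode.get(code))
```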

James Howarth (jhowarth) wrote :
Olivier Mattelaer (olivier-mattelaer) wrote :

Hi,

The problem seems to be that MadSpin cannot clean up the temporary directory that it creates.
This should be a filesystem issue (or maybe an issue specific to a Python version), since we basically do an
rm -rf (the Python equivalent, to be precise).

Not sure what we can do here...
One solution is to prevent MadSpin from removing such directories.
You can switch to gridpack mode for that (with the option ms_dir).

Cheers,

Olivier
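
As an illustration of the "Python version of rm -rf" mentioned above, here is a minimal sketch of a retrying removal. It assumes the removal is done with shutil.rmtree (the thread does not name the call), and the function name, retry count and delay are hypothetical. Retrying only helps if the leftover entries are transient, for example placeholder files that a network filesystem keeps briefly for files deleted while still open; the thread does not confirm that this is the cause. The sketch is written for Python 3 and is not part of MadSpin.

```python
import errno
import os
import shutil
import tempfile
import time


def rmtree_with_retry(path, attempts=5, delay=1.0):
    """Remove a directory tree, retrying when it is reported as not empty.

    Returns the number of attempts used. Re-raises the OSError if the
    failure is of another kind or the last attempt also fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            shutil.rmtree(path)
            return attempt
        except FileNotFoundError:
            # The tree is already gone, e.g. after a partial earlier attempt.
            return attempt
        except OSError as exc:
            if exc.errno != errno.ENOTEMPTY or attempt == attempts:
                raise
            # Give the filesystem time to drop transient entries.
            time.sleep(delay)


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()
    os.makedirs(os.path.join(scratch, "sub", "deeper"))
    with open(os.path.join(scratch, "sub", "file.txt"), "w") as handle:
        handle.write("x\n")
    print(rmtree_with_retry(scratch, attempts=3, delay=0.1))
    print(os.path.exists(scratch))
```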

James Howarth (jhowarth) wrote :

Hi Oliver,

I've seen that this issue has come up before with MadSpin. Perhaps it would be better either not to remove these directories by default, or to use a more version/filesystem-safe way of removing directories? Is there no way to make a quick patch for this? (Or could you point me to where this happens in MadSpin, so I can attempt to hack it myself?)

As for passing MS events in gridpack mode, could you be a bit more specific on how to do this? Which options need to be specified in the runcard/madspin card/pythia cards or do I need to run each step of the generation by hand?

Cheers,

Jay

Olivier Mattelaer (olivier-mattelaer) wrote : Re: [Bug 1788615] Re: MadSpin/Pythia8 fails in some processes on lxplus7

Hi,

> would be better to either not remove these directories by default

This is a no go.

> or to
> use a more version/filesystem safe way of removing directories?

We are using the official version of Python for that...
I do not see how we can improve on that.

> Is there
> no way to make a quick patch for this? (or could you point me to where
> this happens in madspin and I can attempt to hack it myself?)

You have the debug file; it should point to the line where the directory is removed.
You can bypass that removal if you want (in that case, do not run MadSpin multiple times, since that can cause trouble).

> As for passing MS events in gridpack mode, could you be a bit more
> specific on how to do this? Which options need to be specified in the
> runcard/madspin card/pythia cards or do I need to run each step of the
> generation by hand?

This is set only in the madspin_card.
You have to add the line
set ms_dir PATH
See here for more details: https://cp3.irmp.ucl.ac.be/projects/madgraph/wiki/MadSpin

Cheers,

Olivier
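
To make the ms_dir suggestion concrete, here is a minimal sketch of a helper that adds the line to the text of a madspin_card. The helper name, the example card and the example path are hypothetical, and placing the line before the first decay or launch command is an assumption rather than something stated in the thread; check the wiki page linked above for the authoritative usage.

```python
def add_ms_dir(card_text, ms_dir):
    """Return card_text with a `set ms_dir` line added or replaced."""
    new_line = "set ms_dir " + ms_dir
    lines = card_text.splitlines()
    # Replace an existing setting rather than adding a second one.
    for index, line in enumerate(lines):
        if line.split()[:2] == ["set", "ms_dir"]:
            lines[index] = new_line
            return "\n".join(lines) + "\n"
    # Otherwise insert before the first decay/launch command (assumed
    # placement); commented-out lines starting with '#' are skipped.
    for index, line in enumerate(lines):
        words = line.split()
        if words and words[0] in ("decay", "launch"):
            lines.insert(index, new_line)
            break
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical minimal card, for illustration only.
EXAMPLE_CARD = """\
# example madspin_card
decay t > w+ b, w+ > all all
decay t~ > w- b~, w- > all all
launch
"""

if __name__ == "__main__":
    print(add_ms_dir(EXAMPLE_CARD, "/path/to/ms_grid"), end="")
```

In practice the same line can simply be typed into the card by hand; the helper only shows where it would go under the stated assumption.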
> On 24 Aug 2018, at 12:33, James Howarth <email address hidden> wrote:
>
> Hi Oliver,
>
> I've seen that this issue has come up before with MadSpin, perhaps it
> would be better to either not remove these directories by default or to
> use a more version/filesystem safe way of removing directories? Is there
> no way to make a quick patch for this? (or could you point me to where
> this happens in madspin and I can attempt to hack it myself?)
>
> As for passing MS events in gridpack mode, could you be a bit more
> specific on how to do this? Which options need to be specified in the
> runcard/madspin card/pythia cards or do I need to run each step of the
> generation by hand?
>
> Cheers,
>
> Jay

Changed in mg5amcnlo:
status: New → Invalid