MadGraph5_aMC@NLO

MadSpin/Pythia8 fails in some processes on lxplus7

Bug #1788615 reported by James Howarth on 2018-08-23

This bug affects 1 person

Affects              Status     Importance   Assigned to   Milestone
MadGraph5_aMC@NLO    Invalid    Undecided    Unassigned

Bug Description

When generating some NLO processes on lxplus7 machines (running CentOS) with MadGraph5_aMC@NLO version v2_6_3_2, the run appears to fail at the end of the MadSpin step, before Pythia8 starts.

The setup is p p > t t~ [QCD] with madspin mode full and Pythia8 out of the box. The only changes are the top mass (173 -> 172.5), the random seed, and forcing leptonic W decays in the madspin syntax. The jobs seem to work fine for 1000 events, but with anything higher (e.g. 40000 or 100000) they crash with the cryptic error:

Command "launch auto " interrupted with error:
OSError : [Errno 39] Directory not empty: '/afs/cern.ch/work/j/jhowarth/private/13TeV/MadGraph/MG5_aMC_v2_6_3_2/TTBAR_NLO_SPIN/full_me/SubProcesses/P2_gu_ttxu_t_bwp_wp_epve_tx_bxwm_wm_emvex'

Interestingly, an identical setup but with process "p p > t t~ z [QCD]" seems to work, whereas ttgamma also fails (presumably it is not actually process related, but has something to do with memory requirements or something system related).

I also tried running Pythia8 multi-core or single core and it made no difference (I think it fails before I get to this point).

I have attached all the relevant cards as well as the log files to this ticket. Any help would be greatly appreciated.
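
For reference, Errno 39 on Linux is ENOTEMPTY: the final rmdir of a directory was attempted while an entry was still present. A minimal Python 3 sketch, using a throwaway temporary directory rather than the real SubProcesses path, reproduces the same class of error. The helper name try_rmdir and the file names are illustrative only and do not come from MadGraph5_aMC@NLO.

```python
import errno
import os
import tempfile


def try_rmdir(path):
    """Return None if os.rmdir succeeds, else the symbolic errno name."""
    try:
        os.rmdir(path)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


with tempfile.TemporaryDirectory() as scratch:
    # Stand-in for a MadSpin subprocess directory (hypothetical name).
    subproc = os.path.join(scratch, "P2_example")
    os.mkdir(subproc)
    leftover = os.path.join(subproc, "leftover.tmp")
    open(leftover, "w").close()

    # The directory still contains a file, so rmdir is refused.
    result_nonempty = try_rmdir(subproc)

    # Once the leftover file is gone, the same call succeeds.
    os.remove(leftover)
    result_empty = try_rmdir(subproc)
```

On a network filesystem such as AFS, one plausible cause (an assumption, not confirmed in this report) is that a file still held open by another process remains visible as a placeholder entry while the directory is being removed.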

James Howarth (jhowarth) wrote on 2018-08-23:

magraph_output.tar.gz (36.1 KiB, application/x-tar)

Olivier Mattelaer (olivier-mattelaer) wrote on 2018-08-23:

Hi,

The problem seems to be that MadSpin cannot clean up the temporary directory that it creates.
This should be a filesystem issue (or possibly an issue specific to the Python version), since we basically do an
rm -rf (the Python equivalent, to be precise).

Not sure what we can do here...
One solution is to prevent MadSpin from removing such directories.
You can switch to gridpack mode for that (with the option ms_dir).

Cheers,

Olivier
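
The cleanup step described above amounts to a recursive directory removal. As a minimal sketch, assuming the removal is done with shutil.rmtree (the thread does not show the actual MadSpin code) and assuming Python 3, a wrapper that retries when the filesystem reports leftover entries could look like this. The function name rmtree_with_retry and its parameters are hypothetical and are not part of MadGraph5_aMC@NLO.

```python
import errno
import os
import shutil
import tempfile
import time


def rmtree_with_retry(path, attempts=5, delay=0.5):
    """Remove a directory tree, retrying on ENOTEMPTY or EBUSY.

    Returns True if the tree is gone, False if it still exists
    after all attempts. Other OSErrors are re-raised unchanged.
    """
    for _ in range(attempts):
        try:
            shutil.rmtree(path)
            return True
        except FileNotFoundError:
            # Already removed, possibly by an earlier partial attempt.
            return True
        except OSError as exc:
            if exc.errno not in (errno.ENOTEMPTY, errno.EBUSY):
                raise
            # Give the filesystem time to release lingering entries.
            time.sleep(delay)
    return not os.path.exists(path)


# Usage on a throwaway tree (not a real MadSpin directory).
scratch = tempfile.mkdtemp()
os.makedirs(os.path.join(scratch, "sub", "dir"))
open(os.path.join(scratch, "sub", "file.txt"), "w").close()
removed = rmtree_with_retry(scratch, attempts=3, delay=0.1)
```

Whether a retry actually helps on AFS is untested here; it only addresses the case where the leftover entries disappear shortly after the first attempt.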

James Howarth (jhowarth) wrote on 2018-08-24:

Hi Olivier,

I've seen that this issue has come up before with MadSpin. Perhaps it would be better either not to remove these directories by default, or to use a more version- and filesystem-safe way of removing directories? Is there no way to make a quick patch for this? (Or could you point me to where this happens in MadSpin so I can attempt to hack it myself?)

As for passing MS events in gridpack mode, could you be a bit more specific on how to do this? Which options need to be specified in the run card, MadSpin card, or Pythia cards, or do I need to run each step of the generation by hand?

Cheers,

Jay

Olivier Mattelaer (olivier-mattelaer) wrote on 2018-08-24:

Hi,

> would be better to either not remove these directories by default

This is a no go.

> or to
> use a more version/filesystem safe way of removing directories?

We are using the official version of Python for that...
I do not see how we can improve on that.

> Is there
> no way to make a quick patch for this? (or could you point me to where
> this happens in madspin and I can attempt to hack it myself?)

You have the debug file; it should point to the line where the directory is removed.
You can bypass that removal if you want (do not run MadSpin multiple times in that case, since that can create trouble).

> As for passing MS events in gridpack mode, could you be a bit more
> specific on how to do this? Which options need to be specified in the
> runcard/madspin card/pythia cards or do I need to run each step of the
> generation by hand?

This is set only in the madspin_card.
You have to add the line
set ms_dir PATH
See here for more details: https://cp3.irmp.ucl.ac.be/projects/madgraph/wiki/MadSpin

Cheers,

Olivier
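
As a rough illustration of the card change described above, the following Python 3 sketch adds a "set ms_dir" line to the text of a madspin_card. It assumes that settings have to appear before the first decay or launch line, which the thread does not state explicitly. The helper name add_ms_dir, the example card content, and the path ./madspin_grid are hypothetical.

```python
def add_ms_dir(card_text, ms_dir):
    """Return card_text with one 'set ms_dir' line before the first decay/launch."""
    new_line = "set ms_dir {}".format(ms_dir)

    def words(line):
        # Tokens of the line with any trailing '#' comment stripped.
        return line.split("#")[0].split()

    # Drop any earlier ms_dir setting so the card carries only one.
    lines = [ln for ln in card_text.splitlines()
             if words(ln)[:2] != ["set", "ms_dir"]]

    for index, line in enumerate(lines):
        tokens = words(line)
        if tokens and tokens[0] in ("decay", "launch"):
            lines.insert(index, new_line)
            break
    else:
        # No decay/launch line found: append at the end.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical card content, for illustration only.
example_card = """\
# illustrative madspin_card content
set seed 1
decay t > w+ b, w+ > l+ vl
decay t~ > w- b~, w- > l- vl~
launch
"""
patched = add_ms_dir(example_card, "./madspin_grid")
```

Calling the helper a second time with the same path leaves the card unchanged, so it is safe to apply in a job-submission script that may run more than once.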

Olivier Mattelaer (olivier-mattelaer) on 2018-09-28

Changed in mg5amcnlo:
status:	New → Invalid
