Scan terminates after a few successful points

Bug #1879641 reported by Jan Hajer
This bug affects 1 person
Affects: MadGraph5_aMC@NLO
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Hi Olivier,

I am still trying to complete my scan for right-handed neutrinos using Ingrid.
I have found settings which work reasonably fast.
However, when I attempt to run a scan with O(100) points,
it fails after 14 seemingly successful points with:

    INFO: Pythia8 shower finished after 1m12s.
      === Results Summary for run: run_14_decayed_1 tag: tag_1 ===

         Cross-section : 4.81 +- 0.01612 pb
         Nb of events : 10000

    INFO: storing files of previous run
    INFO: Storing Pythia8 files of previous run
    INFO: Done
    quit
    INFO:
    more information in /nfs/scratch/fynu/hajer/results/RUN/index.html
    quit

In every point I get the red line:

    CRITICAL: Branching ratio larger than one for 9900012

I have not yet investigated how bad this is.
In the 13th run I get the error:

    Command "generate_events run_13" interrupted with error:
    UnboundLocalError : local variable 'i' referenced before assignment

The full traceback is attached.

Do you know what is happening?

Cheers,
Jan

Revision history for this message
Jan Hajer (jan.hajer) wrote :

Now with log file

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) wrote :

Hi,

Looks like a weird border effect.

Fixing the issue seems simple here:
=== modified file 'madgraph/various/lhe_parser.py'
--- madgraph/various/lhe_parser.py 2020-04-23 19:05:44 +0000
+++ madgraph/various/lhe_parser.py 2020-05-20 15:11:01 +0000
@@ -600,6 +600,7 @@
         """split the file in multiple file. Do not change the weight!"""

         nb_file = -1
+        i =0
         for i, event in enumerate(self):
             if (not (partition is None) and i==sum(partition[:nb_file+1])) or \
                                    (partition is None and i % nb_event == 0):

But I would bet that this is only a symptom of something else...
I will try to reproduce the run to see whether I hit the issue or not (could you try with the patch at the same time? ;-)
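
For context, here is a minimal Python sketch of how this kind of error can arise: a `for` loop over an empty iterable never binds its loop variable, so using `i` after the loop fails unless it was initialised beforehand. The function names are hypothetical and this is not the actual `lhe_parser.py` code; that the event file was empty in the failing run is an assumption.

```python
def last_index_unpatched(events):
    """Return the index of the last event; fails when there are no events."""
    for i, event in enumerate(events):
        pass
    return i  # raises UnboundLocalError if the loop body never ran


def last_index_patched(events):
    """Same loop, with the loop variable initialised as in the patch."""
    i = 0
    for i, event in enumerate(events):
        pass
    return i


try:
    last_index_unpatched([])
except UnboundLocalError as error:
    print("unpatched:", error)

print("patched:", last_index_patched([]))
```

With the initialisation in place the function returns 0 for an empty input, which removes the crash but does not explain why the input was empty in the first place.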

Cheers,

Olivier

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) wrote :

I did not face any issue when running with the above patch for that particular benchmark.
At the same time, I did not face any issue without the patch either...

Cheers,

Olivier

Revision history for this message
Jan Hajer (jan.hajer) wrote :

Thanks,

I ran 9 more points without the patch and am in the process of running the first one with the patch.
I'll keep you updated if the problem reappears.

Cheers,
Jan

Revision history for this message
Jan Hajer (jan.hajer) wrote :

Hi,

I ran into the same problem twice more,
always with the same point.
Interestingly, it is not a point at the border but in the bulk of my scan.
One of the tracebacks looks almost identical to the last one;
the other one is new.
(I implemented your patch after starting the first of the runs.)
The lhe file in the run_13 folder looks innocent enough (same size as the other points).
The lhe file in the run_13_decayed_1 folder is almost empty.

Cheers,
Jan

Revision history for this message
Jan Hajer (jan.hajer) wrote :

Other log

Revision history for this message
Jan Hajer (jan.hajer) wrote :

banner

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) wrote : Re: [Bug 1879641] Scan terminates after view succesful points

I can reproduce this issue on my laptop (but at step #22 of the run),
so I will investigate (but it takes a while to reproduce the bug and therefore to investigate).

Cheers,

Olivier

> On 21 May 2020, at 11:03, Jan Hajer <email address hidden> wrote:
>
> banner
>
> ** Attachment added: "run_13_decayed_1_tag_1_banner.txt"
> https://bugs.launchpad.net/mg5amcnlo/+bug/1879641/+attachment/5375060/+files/run_13_decayed_1_tag_1_banner.txt

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) wrote :

Hi,

If I run your script with two executables,
    ./bin/mg5
for the generation of the directory
and
    ./bin/madevent
for the scan,

then I do not face that issue (at least so far). That said, I can see that the number of threads in use keeps growing (as does the number of files left open), so this can hit a limit that depends on the OS.
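
One way to watch for this kind of growth from inside a Python process is sketched below. It is a generic check, not part of MadGraph: the function name `resource_snapshot` is hypothetical, the descriptor count relies on the Linux-only `/proc/self/fd` directory, and `threading.active_count()` only sees Python-level threads of the current process.

```python
import os
import resource
import threading


def resource_snapshot():
    """Return open-descriptor count, thread count and the soft descriptor limit."""
    fd_dir = "/proc/self/fd"  # Linux-specific; not available on every OS
    # The listing itself briefly opens one descriptor, so the count is approximate.
    open_files = len(os.listdir(fd_dir)) if os.path.isdir(fd_dir) else None
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {
        "open_files": open_files,
        "threads": threading.active_count(),
        "soft_limit": soft_limit,
    }


print(resource_snapshot())
```

Printing such a snapshot after each scan point would show whether the open-file count creeps towards the soft limit as the scan proceeds.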

Cheers,

Olivier

Revision history for this message
Jan Hajer (jan.hajer) wrote :

Hi,

So it's some kind of resource management bug, where MadGraph forgets to free unused resources?
If you think it is worthwhile for me to try your two-binary workaround, could you tell me which line I have to adjust in order to run with madevent?

Cheers,
Jan

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) wrote :

The first script should be:

    import model SM_HeavyN_NLO
    define lep = e+ e- mu+ mu- ta+ ta-
    define ferm = lep ve ve~ vm vm~ vt vt~ u u~ d d~ s s~ c c~ b b~
    generate p p > ferm n1
    output MAPP_6_3

and the second:

    launch # only this line is modified
    shower = PY8
    madspin = none
    decay n1 > ferm ferm ferm
    set ebeam1 7000
    set ebeam2 7000
    set nevents 10000
    set ptl 0.1
    set eta_min_pdg {9900012: -3.44}
    set eta_max_pdg {9900012: -1.49}
    set mpi False
    set use_syst False
    set time_of_flight 0.1
    set mN2 1000000
    set mN3 1000000
    set VeN1 0
    set VtaN1 0
    set VeN2 0
    set VmuN2 0
    set VtaN2 0
    set VeN3 0
    set VmuN3 0
    set VtaN3 0
    set use_syst False
    # U2 0.010000000000000002
    # M 1.
    set mN1 1.
    set VmuN1 0.1
    compute_widths --body_decay=3.0025 n1
    launch
    ...
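
The per-point part of such a script (the lines from `# U2` to `compute_widths`) could be generated rather than written by hand. The sketch below is only an illustration: the helper `point_block` is hypothetical, it emits only commands that appear in the script above, and how the blocks are stitched together with `launch` lines is left to the scan driver.

```python
def point_block(mass, coupling):
    """Build the per-point lines of the scan script for one (mN1, VmuN1) pair."""
    return "\n".join([
        f"# U2 {coupling ** 2}",   # squared mixing, kept as a comment for bookkeeping
        f"# M {mass}",
        f"set mN1 {mass}",
        f"set VmuN1 {coupling}",
        "compute_widths --body_decay=3.0025 n1",
    ])


# Hypothetical scan grid: one mass, two couplings.
for coupling in (0.1, 0.01):
    print(point_block(1.0, coupling))
    print()
```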

Cheers,

Olivier

Revision history for this message
Jan Hajer (jan.hajer) wrote :

Hi,

Using this technique I can generate 100 points.
But the decayed folder does not contain any Pythia events, only the unweighted events.
Do you see the same?

Cheers,
Jan

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) wrote :

No, I had the Pythia files created normally.
It is likely that the configured Pythia path is not an absolute path, or something like that.

You can check in Events/me5_configuration.txt whether a path is set or not.
If not, you can also check the MG5 installation, where the path can be set in the associated
input/mg5_configuration.txt

Cheers,

Olivier

Revision history for this message
Jan Hajer (jan.hajer) wrote :

Hi,

When I change the Pythia path into an absolute path,
Pythia runs, but the madevent scan aborts with:

  File "/nfs/scratch/fynu/hajer/mg5amcnlo/MAPP_8/bin/internal/madevent_interface.py", line 3854, in do_shower
    postcmd=False, printcmd=False)
  File "/nfs/scratch/fynu/hajer/mg5amcnlo/MAPP_8/bin/internal/extended_cmd.py", line 1544, in exec_cmd
    stop = Cmd.onecmd_orig(current_interface, line, **opt)
  File "/nfs/scratch/fynu/hajer/mg5amcnlo/MAPP_8/bin/internal/extended_cmd.py", line 1464, in onecmd_orig
    return func(arg, **opt)
  File "/nfs/scratch/fynu/hajer/mg5amcnlo/MAPP_8/bin/internal/madevent_interface.py", line 4469, in do_pythia8
    %(len(partition),n_splits))
MadGraph5Error: Error during lhe file splitting. Expected 1 files but obtained 0.
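
Since the error reports that no file came out of the splitting, one quick sanity check is to count the events in the input LHE file before the shower step. The sketch below is not MadGraph code: `count_lhe_events` is a hypothetical helper that relies only on the fact that each event in the Les Houches format starts with an `<event>` tag, and it assumes gzipped files carry a `.gz` suffix.

```python
import gzip


def count_lhe_events(path):
    """Count lines that open an <event> block in a plain or gzipped LHE file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    count = 0
    with opener(path, "rt") as handle:
        for line in handle:
            stripped = line.lstrip()
            # Match "<event>" and "<event attr=...>", but not e.g. "<eventgroup>".
            if stripped.startswith("<event>") or stripped.startswith("<event "):
                count += 1
    return count
```

A count of zero for the decayed run's event file would point at the decay step rather than at the splitting itself.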

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) wrote :

At the first step of the scan?

Olivier

> On 25 May 2020, at 10:14, Jan Hajer <email address hidden> wrote:
>
> Error during lhe file splitting.

Revision history for this message
Jan Hajer (jan.hajer) wrote :

At the second; the first looks good.

Revision history for this message
Jan Hajer (jan.hajer) wrote :

Hi,

I encountered another problem.
My lines

set eta_min_pdg {9900012: -3.44}
set eta_max_pdg {9900012: -1.49}

result in

  {'9900012': -3.44} = eta_min_pdg ! rap cut for other particles (use pdg code). Applied on particle and anti-particle
  {'9900012': -1.49} = eta_max_pdg ! rap cut for other particles (syntax e.g. {6: 2.5, 23: 5})

When I plot the decay position of the heavy neutrino (9900012) I expected to see a cylinder in position space.
However, I cannot see any trace of this cut, except for a preference for the +/- z direction, which I suspect comes from the PDFs of the initial protons.
Am I using this feature incorrectly?

Cheers,
Jan

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) wrote :

Hi,

I was indeed surprised by your numbers for those cuts.
I have checked the code:

    notgood=(etamax(i).ge.0.and.abs(rap(p(0,i))) .gt. etamax(i)).or.
         & (abs(rap(p(0,i))) .lt. etamin(i))

and indeed negative numbers for those quantities are interpreted as "no cut".
The point is that we put a cut on the absolute value (which is therefore always positive).
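
To make the consequence explicit, here is a direct Python transcription of the quoted Fortran condition, applied to the values from the run card. The function name `not_good` and the sample rapidities are illustrative only.

```python
def not_good(rap, etamin, etamax):
    """Python transcription of the quoted Fortran cut condition."""
    above_max = etamax >= 0 and abs(rap) > etamax  # skipped when etamax < 0
    below_min = abs(rap) < etamin                  # never true when etamin < 0
    return above_max or below_min


samples = (-4.0, -2.5, 0.0, 2.5)

# Values from the thread: both bounds negative, so no event is rejected.
print([not_good(rap, etamin=-3.44, etamax=-1.49) for rap in samples])

# Positive bounds keep 1.49 <= |rap| <= 3.44 on both sides of the detector.
print([not_good(rap, etamin=1.49, etamax=3.44) for rap in samples])
```

With negative bounds every sample passes, whereas the positive bounds reject 0.0 and -4.0 and keep both -2.5 and 2.5, which is why the cut cannot select one side only.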

Cheers,

Olivier

Revision history for this message
Jan Hajer (jan.hajer) wrote : Re: [Bug 1879641] Re: Scan terminates after view succesful points

Hi,

I will just move my detector to the positive direction.
But how do you handle both CMS and LHCb with this code?

Cheers,
Jan

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) wrote : Re: [Bug 1879641] Scan terminates after view succesful points

I guess that they generate with a symmetric cut and afterwards either symmetrize the data or filter out events (or both).
This is actually the first time that someone has mentioned this.
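
A post-generation filter of that kind could look like the sketch below. It is an assumption-laden illustration, not an existing MadGraph feature: the helper names and the sample momenta are hypothetical, and it uses pseudorapidity, whereas the quoted code calls a function `rap` whose exact definition is not stated in this thread.

```python
import math


def pseudorapidity(px, py, pz):
    """Signed pseudorapidity of a three-momentum (requires non-zero pT)."""
    return math.asinh(pz / math.hypot(px, py))


def in_acceptance(momentum, eta_low=-3.44, eta_high=-1.49):
    """True if the particle points into the one-sided window [eta_low, eta_high]."""
    return eta_low <= pseudorapidity(*momentum) <= eta_high


# Hypothetical (px, py, pz) values standing in for heavy-neutrino momenta in GeV.
momenta = [(1.0, 0.0, 5.0), (1.0, 0.0, -5.0), (1.0, 0.0, -0.5)]
selected = [p for p in momenta if in_acceptance(p)]
print(selected)
```

Generating with the symmetric cut (positive `eta_min_pdg` and `eta_max_pdg`) and then applying such a filter keeps only the negative-z side; mirroring pz instead would retain all events for a detector on one side.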

Cheers,

Olivier

Revision history for this message
Jan Hajer (jan.hajer) wrote :

I will simply duplicate my detector until it covers the whole region allowed by symmetry.

I still have not managed to finish any of the scans.
At the moment I generate the process folder within my MadGraph installation.
Then I start a second script using madevent from within this folder (after changing relative paths into absolute paths in the config file).
But it still crashes with
    MadGraph5Error : Error during lhe file splitting. Expected 1 files but obtained 0.
after 13 points.
I was able to get ~30 points with an older installation of MadGraph.
Were you able to narrow the problem down?

Cheers,
Jan

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) wrote :

Version 2.9.6 should include additional fixes to limit such issues, so I will close this issue now.

Thanks,

Olivier

Changed in mg5amcnlo:
status: New → Fix Released