Comment 2 for bug 1769271

Olivier Mattelaer (olivier-mattelaer) wrote : Re: [Bug 1769271] [NEW] Reproducibility problem in 2.6.2 (in combine?)

Hi,

This has actually been the case for a couple of versions.
If you want to ensure 100% reproducibility, you also have to set the Python seed on top of the Fortran seed that is set within the run_card.
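
To illustrate why a second seed matters, here is a minimal sketch, assuming the reordering comes from a Python-side shuffle when the per-channel events are merged. The `combine` function and the event labels are hypothetical stand-ins, not MG5_aMC code, and how the Python seed is actually set in MG5_aMC is not shown here.

```python
import random

def combine(channels, seed=None):
    """Merge per-channel event lists and shuffle them into one sample.

    Hypothetical stand-in for a Python-side combine step; not MG5_aMC code.
    """
    rng = random.Random(seed)  # seed=None seeds from OS entropy or the clock
    merged = [event for channel in channels for event in channel]
    rng.shuffle(merged)
    return merged

# Toy event labels, invented for illustration.
channels = [["evt_a1", "evt_a2"], ["evt_b1", "evt_b2", "evt_b3"], ["evt_c1"]]

# Same seed: identical event order on every call.
assert combine(channels, seed=1234) == combine(channels, seed=1234)

# Whatever the seed, the set of events is unchanged; only the order can differ.
assert sorted(combine(channels)) == sorted(combine(channels, seed=1234))
```

With `seed=None` the order is free to change from run to run even though the events themselves are the same, which matches the symptom described in the report.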

Cheers,

Olivier

> On 4 May 2018, at 23:23, Zachary Marshall <email address hidden> wrote:
>
> Public bug reported:
>
> Dear authors,
>
> I'm running tests with MG5_aMC 2.6.2 - in case you have a CERN account, it's this one:
> /afs/cern.ch/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.2.atlas/x86_64-slc6-gcc47-opt/
>
> We see some non-reproducibility, even when setting a random number seed.
> I attach a tarball of the cards directories so that you have the run,
> param, and proc cards. For the actual runs, iseed was set to 1234 (of
> course, the code resets it to 0 after the run).
>
> Looking at the LHE files, they seem to contain the same events, but in
> different order. Just diffing the two files, you can see an example
> right away:
>
> 1487a1488,1496
>> 6 1 +8.9242237e-05 3.75379100e+02 7.95049700e-03 1.05265500e-01
>> 21 -1 0 0 501 502 +0.0000000000e+00 +0.0000000000e+00 +2.1196628570e+01 2.1196628570e+01 0.0000000000e+00 0.0000e+00 -1.0000e+00
>> 2 -1 0 0 502 0 -0.0000000000e+00 -0.0000000000e+00 -1.6619322556e+03 1.6619322556e+03 0.0000000000e+00 0.0000e+00 -1.0000e+00
>> 24 2 1 2 0 0 +8.9075749711e+01 +3.5726986208e+01 -1.0898807218e+03 1.1239759236e+03 2.5743150990e+02 0.0000e+00 0.0000e+00
>> 1000023 1 3 3 0 0 +8.0116545705e+01 +1.1392922116e+02 -6.6081705506e+02 6.8029530983e+02 8.2000000000e+01 0.0000e+00 -1.0000e+00
>> 1000024 1 3 3 0 0 +8.9592040060e+00 -7.8202234949e+01 -4.2906366674e+02 4.4368061374e+02 8.1000000000e+01 0.0000e+00 1.0000e+00
>> 1 1 1 2 501 0 -8.9075749711e+01 -3.5726986208e+01 -5.5085490525e+02 5.5915296062e+02 0.0000000000e+00 0.0000e+00 -1.0000e+00
>> </event>
>> <event>
> 1506,1532c1515,1542
> < 6 1 +8.9242237e-05 3.75379100e+02 7.95049700e-03 1.05265500e-01
> < 21 -1 0 0 501 502 +0.0000000000e+00 +0.0000000000e+00 +2.1196628570e+01 2.1196628570e+01 0.0000000000e+00 0.0000e+00 -1.0000e+00
> < 2 -1 0 0 502 0 -0.0000000000e+00 -0.0000000000e+00 -1.6619322556e+03 1.6619322556e+03 0.0000000000e+00 0.0000e+00 -1.0000e+00
> < 24 2 1 2 0 0 +8.9075749711e+01 +3.5726986208e+01 -1.0898807218e+03 1.1239759236e+03 2.5743150990e+02 0.0000e+00 0.0000e+00
> < 1000023 1 3 3 0 0 +8.0116545705e+01 +1.1392922116e+02 -6.6081705506e+02 6.8029530983e+02 8.2000000000e+01 0.0000e+00 -1.0000e+00
> < 1000024 1 3 3 0 0 +8.9592040060e+00 -7.8202234949e+01 -4.2906366674e+02 4.4368061374e+02 8.1000000000e+01 0.0000e+00 1.0000e+00
> < 1 1 1 2 501 0 -8.9075749711e+01 -3.5726986208e+01 -5.5085490525e+02 5.5915296062e+02 0.0000000000e+00 0.0000e+00 -1.0000e+00
> < </event>
>
>
> I believe that's the same event, but note that the line numbers are ~20 apart -- this event is first in one file and fourth in the other. I believe this indicates some non-reproducibility or a race condition in combine somewhere.
>
> Because these events then enter programs like Pythia8 (or MadSpin),
> which process events in order, the final events appear random:
> different seeds are applied to the same particles. That means the
> events are effectively not reproducible once processed downstream, so
> this is an issue for our bug hunting and for various other reasons.
>
> If you could help us out with this it would be very much appreciated!
>
> Thanks,
> Zach
>
> ** Affects: mg5amcnlo
> Importance: Undecided
> Status: New
>
> ** Attachment added: "Cards directory for the jobs in question"
> https://bugs.launchpad.net/bugs/1769271/+attachment/5134110/+files/Cards_problem.tgz
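
The report's claim that the two LHE files hold the same events in a different order can be checked without reading a line-based diff. The following is a minimal sketch, assuming each event sits in a plain `<event>...</event>` block; the function names and the toy event data are made up for illustration.

```python
import re
from collections import Counter

# Matches <event> ... </event> blocks; \b keeps tags such as <eventgroup> out.
EVENT_RE = re.compile(r"<event\b[^>]*>(.*?)</event>", re.DOTALL)

def event_multiset(lhe_text):
    """Count each <event> block, ignoring the order in which blocks appear."""
    events = Counter()
    for body in EVENT_RE.findall(lhe_text):
        # Normalise whitespace so indentation differences do not matter.
        lines = tuple(" ".join(line.split()) for line in body.strip().splitlines())
        events[lines] += 1
    return events

def same_events(text_a, text_b):
    """True if both LHE texts hold the same events, in any order."""
    return event_multiset(text_a) == event_multiset(text_b)

# Toy data (not taken from the attached runs): two events in swapped order.
run_a = "<event>\n 2 1 0.5\n 21 -1 0 0\n</event>\n<event>\n 2 1 0.7\n 2 -1 0 0\n</event>\n"
run_b = "<event>\n 2 1 0.7\n 2 -1 0 0\n</event>\n<event>\n 2 1 0.5\n 21 -1 0 0\n</event>\n"

assert run_a != run_b             # a plain diff reports differences
assert same_events(run_a, run_b)  # but the event content is identical
```

For real output, the two file contents would be read with `open(path).read()` and passed to `same_events`. A `True` result alongside a non-empty diff would point to an ordering difference rather than different events.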