When generating an NLO process
(high-mass Drell-Yan + jets in this case; the cards are here:
https://github.com/cms-sw/genproductions/tree/master/bin/MadGraph5_aMCatNLO/cards/production/13TeV/dyellell012j_5f_NLO_FXFX_M3000
)
one of the jobs for the first step, "Setting up grid", fails with the errors below.
Looking at the directory for the specific job indicated, I notice something strange: log_MINT0.txt (attached) has a lot of binary junk at the beginning. I'm not sure whether this is related to the failure.
CRITICAL: Fail to run correctly job 655476238.
with option: {'log': None, 'stdout': None, 'argument': ['2', 'F', '0'], 'nb_submit': 5, 'stderr': None, 'prog': 'ajob7', 'output_files': ['GF7'], 'time_check': 1431752459.19803, 'cwd': '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg', 'required_output': ['GF7/log_MINT0.txt', 'GF7/results.dat'], 'input_files': ['/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/MGMEVersion.txt', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/randinit', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg/symfact.dat', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg/iproc.dat', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg/initial_states_map.dat', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg/configs_and_props_info.dat', 
'/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg/leshouche_info.dat', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg/param_card.dat', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg/FKS_params.dat', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg/MadLoop5_resources.tar.gz', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg/madevent_mintMC', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/madinMMC_F.2', '/afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/lib/PDFsets']}
file missing: /afs/cern.ch/work/k/kplee/private/GridpackProduction/genproductions/bin/MadGraph5_aMCatNLO/dyellell012j_5f_NLO_FXFX_M3000/dyellell012j_5f_NLO_FXFX_M3000_gridpack/work/processtmp/SubProcesses/P2_dxg_epemdxg/GF7/log_MINT0.txt
Fails 5 times
No resubmition.
Hi Josh,
Yes, this is very likely to be the problem. I've seen binary crap being written to log files on AFS systems before. That said, the code seems to have tried to run this job 5 times, and it seems unlikely that it failed in the same way all 5 times for this channel.
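To see whether other channels are affected in the same way, one could scan the job directories for log files that start with non-text bytes. The following is only a rough sketch, not part of MadGraph5_aMCatNLO: the function names, the 4096-byte window and the 5% threshold are arbitrary assumptions, and the only thing taken from the error message above is the file name log_MINT0.txt.

```python
from pathlib import Path

# Bytes treated as ordinary text: printable ASCII plus tab, newline, carriage return.
TEXT_BYTES = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def nontext_fraction(data: bytes) -> float:
    """Return the fraction of bytes in `data` that are not plain ASCII text."""
    if not data:
        return 0.0
    bad = sum(1 for b in data if b not in TEXT_BYTES)
    return bad / len(data)


def find_suspicious_logs(top_dir, name="log_MINT0.txt", head=4096, threshold=0.05):
    """List log files under `top_dir` whose first `head` bytes look binary.

    `head` and `threshold` are hypothetical defaults chosen for illustration.
    """
    suspicious = []
    for path in sorted(Path(top_dir).rglob(name)):
        with open(path, "rb") as handle:
            chunk = handle.read(head)
        if nontext_fraction(chunk) > threshold:
            suspicious.append(path)
    return suspicious


if __name__ == "__main__":
    import sys

    # Usage: python check_logs.py <path to the SubProcesses directory>
    for log_path in find_suspicious_logs(sys.argv[1]):
        print(log_path)
```

If only the one GF7 log shows up, that points to a problem with that single job; if many do, the file system is the more likely suspect.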
However, there is something else that worries me as well. Looking at the results in the log file:
accumulated results ABS integral = 0.5385E-10 +/- 0.7298E-13 ( 0.136 %)
accumulated results Integral = 0.2442E-11 +/- 0.6376E-13 ( 2.611 %)
accumulated results Virtual = 0.1518E-11 +/- 0.1351E-12 ( 8.901 %)
accumulated results Virtual ratio = -.4504E+02 +/- 0.1745E-01 ( 0.039 %)
accumulated results ABS virtual = 0.1082E-09 +/- 0.1338E-12 ( 0.124 %)
accumulated results Born*ao2pi = 0.5401E-13 +/- 0.1457E-15 ( 0.270 %)
it seems to me that the contribution from the virtual corrections is large and that there are very large cancellations within that virtual contribution. (The integral of the absolute value of the virtual corrections, 0.1082E-09, is twice as large as the integral of the absolute value of the total integrand, 0.5385E-10.) This really suggests that there is either a serious problem with the renormalisation scale used for this process (which is possible: it is set automatically in the FxFx scheme, and that has not really been tested with such an extreme requirement on the invariant mass of the leptons), or a real problem with the stability of the virtual corrections. The latter might be a problem, given that you have some phase-space points that were still marked as unstable even in quadruple precision.
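The comparison above can be reproduced directly from the quoted log lines. The script below is a minimal sketch: the regular expression and the function names are assumptions made for illustration, not anything provided by MadGraph5_aMCatNLO, and the ratios are only the simple quotients of the printed numbers.

```python
import re

# The "accumulated results" lines exactly as quoted from log_MINT0.txt above.
LOG_EXCERPT = """\
accumulated results ABS integral = 0.5385E-10 +/- 0.7298E-13 ( 0.136 %)
accumulated results Integral = 0.2442E-11 +/- 0.6376E-13 ( 2.611 %)
accumulated results Virtual = 0.1518E-11 +/- 0.1351E-12 ( 8.901 %)
accumulated results Virtual ratio = -.4504E+02 +/- 0.1745E-01 ( 0.039 %)
accumulated results ABS virtual = 0.1082E-09 +/- 0.1338E-12 ( 0.124 %)
accumulated results Born*ao2pi = 0.5401E-13 +/- 0.1457E-15 ( 0.270 %)
"""

# Label, central value and uncertainty; Fortran-style numbers such as
# "-.4504E+02" are accepted by Python's float().
LINE_RE = re.compile(r"accumulated results\s+(.+?)\s*=\s*(\S+)\s*\+/-\s*(\S+)")


def parse_accumulated(text):
    """Map each label to a (value, uncertainty) tuple."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            label, value, error = match.groups()
            results[label] = (float(value), float(error))
    return results


def summarize(results):
    """Compute the ratios discussed in the text from the parsed values."""
    abs_virtual = results["ABS virtual"][0]
    abs_integral = results["ABS integral"][0]
    virtual = results["Virtual"][0]
    integral = results["Integral"][0]
    return {
        # |virtual| integral relative to |total integrand| integral
        "abs_virtual_over_abs_integral": abs_virtual / abs_integral,
        # how much of the |virtual| integral survives after cancellations
        "virtual_over_abs_virtual": virtual / abs_virtual,
        # share of the signed total that comes from the virtual
        "virtual_over_integral": virtual / integral,
    }


if __name__ == "__main__":
    for name, ratio in summarize(parse_accumulated(LOG_EXCERPT)).items():
        print(f"{name}: {ratio:.4f}")
```

With these numbers the first ratio comes out at about 2.0, the factor of two mentioned above, while the signed virtual is only about 1.4% of the integral of its absolute value, which is the cancellation in question.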
I'm not sure which of the above is the problem. But I think that both will go away if you use a much larger merging scale, which also lets you use a much harder generation cut on the light jet. Using a large merging scale is not strange: compared to a 3 TeV invariant mass cut, a jet of several hundred GeV can still be considered soft. Unfortunately, this means that you'll have to rely a lot on the shower to describe those soft-ish jets, and the shower, as you know, lacks possible correlations between jets that are present in the matrix elements.
Really understanding this problem would require quite a bit of investigation and might therefore take a couple of months.
best,
Rikkert