How-to : Sample Request

How to request official samples that are produced centrally.

Sample Request Categorisation

1. GEN : Sample production with generator python3 fragments (Pythia, Sherpa, ...).
2. wmLHEGEN : Sample production with gridpacks (MadGraph, Powheg, ...; wmLHE) and hadronizer python3 fragments (Pythia, Herwig, ...; GEN).
3. pLHEGEN : Sample production with private LHE files (MadGraph, QBH, ...; pLHE) and hadronizer python3 fragments (Pythia, Herwig, ...; GEN).

For pLHEGEN requests, please provide us with your private LHE files, uploaded to lxplus. We also need a good reason for this type of request, e.g. there is no gridpack generation workflow for QBH, or you would like to use the newest release of MadGraph for a special functionality.

4. SIM : The step right after GEN for detector simulation.

5. GS : In Run3, GEN and SIM steps are combined into one "GS" step.

For more details, take a look at https://cms-pdmv.gitbook.io/project/mccontact.

How to Request a Sample

Changes in Sample Request Procedures

Until 2020, EXO-MC&I accepted sample requests without local validation test results. The EXO MC contacts ran the local validation test after the prepid was created in McM. The test results were then injected into McM and the McM validation was triggered.
From 2021, we ask requestors to run the local validation test before making the sample request and reaching out to us. This speeds up the sample request process by avoiding unnecessary discussions about obvious mistakes on the requestor's side, such as missing information in the dataset name, an incorrectly created generator fragment file, corrupted gridpacks, etc.

RunII-prelegacy : RunIISummer15, RunIIFall17, RunIIFall18 campaigns.

RunII-legacy : RunIISummer20UL16APV, RunIISummer20UL16, RunIISummer20UL17, RunIISummer20UL18 campaigns. There are 2 campaigns for the 2016 dataset (pre- and post-VFP fix).

Run3 : Run3Summer22, Run3Summer22EE, Run3Summer23, Run3Summer23BPix, RunIII2024Summer24

Until 2020, what is now called RunII-prelegacy was called "RunII-legacy", and what is now called RunII-legacy was called "Ultra Legacy". Keep these terminology changes in mind.

For sample requests that need special treatment of the cmsDriver commands, please send the EXO-MC&I convenors an email in advance.

For sample requests that are not RunII-legacy (= Ultra Legacy) requests, you don't need to run the local validation test. A pull request with a correctly formatted CSV file and the generator fragments, but without local validation test results, is sufficient (details below).

Instructions

1. Prepare a CSV file named request.csv containing the following information.

Dataset name : The physics process, the tune used for showering/hadronisation, and the generator used should be included. e.g. DYJetsToLL_M-50_TuneCP5_13TeV_madgraphMLM-pythia8. Instructions on how to define the dataset names are detailed here: Run 2, 2022 and 2023, and 2024.
Total events : Number of events to be produced after matching/filtering. e.g. if the matching efficiency is 0.4 and you want 10000 events to be produced, write 10000; do not write 10000 x 1/0.4 = 25000.
Generator fragment name : Generator fragment used for showering/hadronisation. e.g. genFragments/Hadronizer/13TeV/Hadronizer_TuneCP5_13TeV_MLM_5f_max4j_qCut19_LHE_pythia8_cff.py
Generator : Generator used for ME level calculations. e.g. madgraph
Gridpack location : Path to the gridpack in an lxplus public area or in CVMFS, if a gridpack is used. e.g. /cvmfs/cms.cern.ch/phys_generator/gridpacks/UL/13TeV/madgraph/V5_2.6.5/dyellell01234j_5f_LO_MLM_v2/DYJets_HT-incl_slc6_amd64_gcc630_CMSSW_9_3_16_tarball.tar.xz

If you are using gridpacks that are not currently stored in CVMFS, include their current paths in the CSV file. When you submit the merge request, we will take care of copying the new gridpacks to CVMFS and let you know the new paths so that the CSV can be updated.
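The "Total events" rule above is easy to get backwards, so here is a minimal Python sketch of the arithmetic. It reuses the numbers from the example (10000 requested events, matching efficiency 0.4). The function name events_before_filtering is hypothetical and not part of the EXO-MCsampleRequests tools.

```python
import math


def events_before_filtering(requested_events: int, efficiency: float) -> int:
    """Events needed before matching/filtering so that `requested_events` survive.

    Hypothetical helper for illustration only; this number must NOT be
    written in the "Total events" column.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return math.ceil(requested_events / efficiency)


requested = 10000   # this is the value to write in the "Total events" column
efficiency = 0.4    # matching efficiency from the example above

print("Total events to write in request.csv:", requested)
print("Events before matching/filtering (do not write this):",
      events_before_filtering(requested, efficiency))
```

The value to enter in request.csv is the first number; the second is shown only to make clear what the 25000 in the example refers to.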

An example of a well-written CSV file : link.

2. Run the local validation test.

Fork the EXO-MCsampleRequests GitLab repository (link).
Clone your fork of the EXO-MCsampleRequests repository.

git clone ssh://git@gitlab.cern.ch:7999/<YourAccount>/EXO-MCsampleRequests.git
cd EXO-MCsampleRequests

If you have already forked the repository in the past, please check out the official master branch before adding your changes.

git remote add official https://gitlab.cern.ch/cms-exo-mci/EXO-MCsampleRequests.git
git fetch official
git checkout official/master

Move your request.csv to the git cloned area.

mv <PathToCSVFile> ./request.csv

Run the local validation test to measure the size/time per event and the matching/filtering efficiencies. It first checks whether your CSV file follows the McM naming convention rules, has all the needed information, and is formatted correctly. The options are:

1) TestDir : Name of the directory in which to run the local validation test.
2) nEvents : Number of events to run in the local validation test. If you have filters or jet matching/merging in your generator fragments, set it to around 5000. If not, 1000 events will be enough in most cases.
3) Run : The era used for the local test: "run2_ul", "run3" (2022 and 2023), or "2024". "run2_ul" is the default when the "-r" option is not provided, so for Run3 requests "-r run3" or "-r 2024" is necessary.

If everything is fine, it builds the cmsDriver commands and condor configuration files and gives you submit_condor_<TestDir>.sh to submit the condor jobs.

python3 manageRequests.py --localtest -D <TestDir> -n <nEvents> (-r <Run>) request.csv
source submit_condor_<TestDir>.sh

# an example for Run-3
python3 manageRequests.py --localtest -D TestDir -n 1000 -r run3 request.csv
source submit_condor_TestDir.sh
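If you prepare several requests, it can help to assemble the local test command programmatically. The sketch below only strings together the options documented above (--localtest, -D, -n, -r) and does not run anything. The helper names suggested_n_events and build_localtest_command are hypothetical and not part of manageRequests.py.

```python
import shlex
from typing import List, Optional

# Era values documented above for the "-r" option.
KNOWN_RUNS = ("run2_ul", "run3", "2024")


def suggested_n_events(has_filter_or_matching: bool) -> int:
    """Rule of thumb from the text: about 5000 with filters/matching, else 1000."""
    return 5000 if has_filter_or_matching else 1000


def build_localtest_command(test_dir: str, n_events: int,
                            run: Optional[str] = None,
                            csv_path: str = "request.csv") -> List[str]:
    """Build the argument list for the local validation test (hypothetical helper)."""
    cmd = ["python3", "manageRequests.py", "--localtest",
           "-D", test_dir, "-n", str(n_events)]
    if run is not None:
        if run not in KNOWN_RUNS:
            raise ValueError(f"unknown run era: {run!r}")
        cmd += ["-r", run]  # omitted means the default, run2_ul
    cmd.append(csv_path)
    return cmd


command = build_localtest_command("TestDir", suggested_n_events(False), run="run3")
print(shlex.join(command))
```

Printing the command instead of executing it keeps the sketch safe to run anywhere; paste the output into your shell in the cloned repository.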

After the submitted jobs are finished, fetch the local validation test results.

python3 manageRequests.py --localtest -f <TestDir>/input.csv

If the jobs ran fine, this gives you test_results.csv, which should be used to make the request tickets.

python3 makeTicket.py -f <TestDir>/test_results.csv

If your sample request is for RunII-prelegacy, you don't need to run the local validation. Make the ticket from input.csv instead.

python3 makeTicket.py -f <TestDir>/input.csv

A new directory named YYYYMMDD_CERNID_SampleInfo will be created in the requestTickets directory.

1) YYYYMMDD : Date when the sample request is made, in year-month-day order.
2) CERNID : CERN account.
3) SampleInfo : Brief information on the sample. e.g. 20210127_shjeon_DYJetsToLL

The questions below will be asked (some might not be, depending on the campaign).

1) Which EXO subgroup?
2) Which campaign?
3) Which year?
4) [RunII-legacy] nEvents for 2016 and 2016APV
5) [RunII-prelegacy, RunII-legacy] Need for higher priority
6) [RunII-prelegacy] Need for AODSIM files
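makeTicket.py creates the ticket directory for you, so the following Python sketch is only meant to illustrate the YYYYMMDD_CERNID_SampleInfo naming pattern. The function ticket_dir_name is hypothetical and not part of the repository.

```python
from datetime import date


def ticket_dir_name(request_date: date, cern_id: str, sample_info: str) -> str:
    """Compose a name of the form YYYYMMDD_CERNID_SampleInfo (illustration only)."""
    return f"{request_date.strftime('%Y%m%d')}_{cern_id}_{sample_info}"


# Reproduces the example given in the text.
print(ticket_dir_name(date(2021, 1, 27), "shjeon", "DYJetsToLL"))
```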

3. Make a pull request to the central GitLab repository (link) with the following items committed.

Your directory in requestTickets, YYYYMMDD_CERNID_SampleInfo, which contains the CSV file "request.csv" and the adoc file "request.adoc".
Your generator fragment, if it is not already in the link.
The lines that are printed out after executing "makeTicket.py".

### Lines like below will be printed out for you.
### Do not forget to attach this when you make the pull request in gitlab.

Sample request information from shjeon
Tag : 20210713_shjeon_TEST
Subgroup : MET+X
Campaign : RunII-legacy
Year : 2016,2016APV,2017,2018
Events (per year/campaign) : 210000
[FLAG] Do not divide 2016 events into 2
[FLAG] High priority is needed

TEST,MET+X,20210713_shjeon_TEST,RunII-legacy,shjeon,-,20210713,-

git checkout -b <YourBranchForRequest> # e.g. DYJetsToLL_mass_binned
git add <PathToYourTicketIn 'requestTickets directory'>
git add <PathToYourGenFragmentsIn 'genFragments directory'>
git commit -m "<BriefSampleInfo>" # e.g. DYJetsToLL mass binned
git push origin <YourBranchForRequest>

### You can now make a pull request with your branch!

Do not include your <TestDir>, which was used for the local validation, in the pull request.

[RunII-legacy] Unless noted otherwise ("N"), the number of events for 2016 and for 2016APV will each be half of the number for 2017 and 2018. If you want the number of events to be equal for all 4 campaigns ("Y"), please send the EXO-MC&I convenors an email for confirmation.

 Y : nEvents(2016) = nEvents(2016APV) = nEvents(2017) = nEvents(2018).
 N : nEvents(2016) = nEvents(2016APV) = nEvents(2017)/2 = nEvents(2018)/2. (recommended)
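The two options can be written out as a small Python sketch. The function events_per_year is hypothetical, and the floor division used for odd event counts is an assumption; the text does not say how an odd number would be split.

```python
from typing import Dict


def events_per_year(n_events_2017_2018: int, equal_for_all: bool = False) -> Dict[str, int]:
    """Events per campaign year for a RunII-legacy request (illustration only).

    equal_for_all=True  corresponds to "Y": the same number for all four campaigns.
    equal_for_all=False corresponds to "N": 2016 and 2016APV each get half.
    """
    # Assumption: an odd count is rounded down when halved.
    n_2016 = n_events_2017_2018 if equal_for_all else n_events_2017_2018 // 2
    return {
        "2016": n_2016,
        "2016APV": n_2016,
        "2017": n_events_2017_2018,
        "2018": n_events_2017_2018,
    }


print(events_per_year(210000))                      # option "N" (recommended)
print(events_per_year(210000, equal_for_all=True))  # option "Y"
```

With 210000 events for 2017 and 2018, option "N" gives 105000 each for 2016 and 2016APV, while option "Y" gives 210000 for every campaign.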

[RunII-prelegacy, RunII-legacy] High priority should be discussed with the relevant convenors in advance. Please send EXO-MC&I convenors an email for confirmation.

[RunII-prelegacy] AODSIM files are not stored unless requested. Please send the EXO-MC&I convenors an email for confirmation.

[RunII-legacy] AODSIM files are kept by default.

If anything is unclear, please contact us through email before making a merge request.

