How-to : Sample Request

How to request centrally produced official samples.

Sample Request Categorisation

  1. GEN : Sample production with generator python3 fragments (Pythia, Sherpa, ...).

  2. wmLHEGEN : Sample production with gridpacks (MadGraph, Powheg, ...; wmLHE) and hadronizer python3 fragments (Pythia, Herwig, ...; GEN).

  3. pLHEGEN : Sample production with private LHE files (MadGraph, QBH, ...; pLHE) and hadronizer python3 fragments (Pythia, Herwig, ...; GEN).

  4. SIM : The step right after GEN for detector simulation.

  5. GS : In Run3, GEN and SIM steps are combined into one "GS" step.

For more details, take a look at https://cms-pdmv.gitbook.io/project/mccontact.

General flow-chart of MC request submission (courtesy of D. Sheffield)

How to Request a Sample

Changes in Sample Request Procedures

EXO sample request workflow changes from 2021
  • Until 2020, EXO-MC&I accepted sample requests without local validation test results. The EXO MC contacts ran the local validation test after the prepid was created in McM, injected the test results into McM, and then triggered the McM validation.

  • From 2021, we ask requestors to run the local validation test before making the sample request and reaching out to us. This speeds up the sample request process by avoiding unnecessary discussion of obvious mistakes on the requestor's side, such as missing information in the dataset name, an incorrectly created generator fragment file, corrupted gridpacks, etc.

RunII-prelegacy : RunIISummer15, RunIIFall17, RunIIFall18 campaigns.

RunII-legacy : RunIISummer20UL16APV, RunIISummer20UL16, RunIISummer20UL17, RunIISummer20UL18 campaigns. There are 2 campaigns for the 2016 dataset (pre- and post-VFP fixes).

Run3 : Run3Summer22, Run3Summer22EE, Run3Summer23, Run3Summer23BPix, RunIII2024Summer24

Keep the terminology change in mind: what is now called RunII-prelegacy was called "RunII-legacy" until 2020, and what is now called RunII-legacy was called "Ultra Legacy" until 2020.
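As a quick reference, the campaign groups and the renaming can be written as a small lookup. This is only an illustrative sketch built from the lists above; the helper names `group_of` and `translate_pre2021_name` are hypothetical and are not part of any EXO or McM tool.

```python
# Campaign groups exactly as listed above (current terminology).
CAMPAIGNS = {
    "RunII-prelegacy": ["RunIISummer15", "RunIIFall17", "RunIIFall18"],
    "RunII-legacy": [
        "RunIISummer20UL16APV",
        "RunIISummer20UL16",
        "RunIISummer20UL17",
        "RunIISummer20UL18",
    ],
    "Run3": [
        "Run3Summer22",
        "Run3Summer22EE",
        "Run3Summer23",
        "Run3Summer23BPix",
        "RunIII2024Summer24",
    ],
}

# Group names used until 2020 -> group names used now.
PRE2021_TO_CURRENT = {
    "RunII-legacy": "RunII-prelegacy",
    "Ultra Legacy": "RunII-legacy",
}


def group_of(campaign):
    """Return the current group name for a campaign (hypothetical helper)."""
    for group, names in CAMPAIGNS.items():
        if campaign in names:
            return group
    raise KeyError(f"unknown campaign: {campaign}")


def translate_pre2021_name(old_group):
    """Translate a group name written before 2021 into current terminology."""
    return PRE2021_TO_CURRENT[old_group]


print(group_of("RunIISummer20UL16APV"))
print(translate_pre2021_name("Ultra Legacy"))
```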

Instructions

1. Prepare a CSV file named request.csv containing the following information.

  1. Dataset name : Should include the physics process, the tune used for showering/hadronisation, and the generator used. e.g. DYJetsToLL_M-50_TuneCP5_13TeV_madgraphMLM-pythia8. Instructions on how to define the dataset names are detailed here: Run 2, 2022 and 2023, and 2024.

  2. Total events : Number of events to be produced after matching/filtering. e.g. If the matching efficiency is 0.4 and you want 10000 events to be produced, write 10000; do not write 10000 × 1/0.4 = 25000.

  3. Generator fragment name : Generator fragment used for showering/hadronisation. e.g. genFragments/Hadronizer/13TeV/Hadronizer_TuneCP5_13TeV_MLM_5f_max4j_qCut19_LHE_pythia8_cff.py

  4. Generator : Generator used for ME level calculations. e.g. madgraph

  5. Gridpack location : Path to the gridpack in an lxplus public or CVMFS area, if a gridpack is used. e.g. /cvmfs/cms.cern.ch/phys_generator/gridpacks/UL/13TeV/madgraph/V5_2.6.5/dyellell01234j_5f_LO_MLM_v2/DYJets_HT-incl_slc6_amd64_gcc630_CMSSW_9_3_16_tarball.tar.xz

If you are using gridpacks that are not currently stored in CVMFS, include their current paths in the CSV file. When you submit the merge request, we will copy the new gridpacks to CVMFS and let you know the new paths so that the CSV can be updated.
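To see at a glance which gridpack paths will need to be copied, a path can be checked for the /cvmfs prefix. The snippet below is a minimal sketch; `is_on_cvmfs` is a hypothetical helper and the second path is a made-up placeholder, neither is taken from the request tools.

```python
from pathlib import PurePosixPath


def is_on_cvmfs(gridpack_path):
    """Return True if the path lives under /cvmfs (hypothetical helper)."""
    parts = PurePosixPath(gridpack_path).parts
    return len(parts) > 1 and parts[0] == "/" and parts[1] == "cvmfs"


paths = [
    "/cvmfs/cms.cern.ch/phys_generator/gridpacks/UL/13TeV/madgraph/V5_2.6.5/"
    "dyellell01234j_5f_LO_MLM_v2/"
    "DYJets_HT-incl_slc6_amd64_gcc630_CMSSW_9_3_16_tarball.tar.xz",
    "/some/public/area/my_gridpack.tar.xz",  # placeholder path, for illustration only
]
for path in paths:
    status = "already on CVMFS" if is_on_cvmfs(path) else "needs to be copied to CVMFS"
    print(f"{path}: {status}")
```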

An example of a well-written CSV file : link.
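The "Total events" rule can be illustrated with the numbers from the example above. This is only a sketch of the arithmetic: the value written in the CSV is the number of events wanted after matching/filtering, and the larger pre-matching number is shown for comparison only. The function name is hypothetical.

```python
import math


def events_before_matching(total_events, efficiency):
    """Events that must be generated so that `total_events` survive
    matching/filtering with the given efficiency (illustration only)."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return math.ceil(total_events / efficiency)


total_events = 10000  # this is the number to write in the "Total events" field
efficiency = 0.4      # matching efficiency from the example above

print("Write in CSV:", total_events)
print("Do NOT write:", events_before_matching(total_events, efficiency))
```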

2. Run local validation test.

  • Fork the EXO-MCsampleRequests gitlab repository link.

  • Git clone your forked EXO-MCsampleRequests gitlab repository.

git clone ssh://git@gitlab.cern.ch:7999/<YourAccount>/EXO-MCsampleRequests.git
cd EXO-MCsampleRequests
  • Move your request.csv to the git cloned area.

mv <PathToCSVFile> ./request.csv
  • Run the local validation test to measure size/time per event and matching/filtering efficiencies. This first checks whether your CSV file follows the McM naming convention rules, has all the needed information, and is formatted correctly. The options are:

    1) TestDir : Name of the directory in which to run the local validation test.

    2) nEvents : Number of events to run in the local validation test. If you have filters or jet matching/merging in your generator fragments, set it to around 5000. If not, 1000 events will be enough in most cases.

    3) Run : The era used for the local test: "run2_ul", "run3" (2022 and 2023), or "2024". "run2_ul" is the default if the "-r" option is not provided, so Run3 requests need "-r run3" or "-r 2024".

  • If everything is fine, it will build the cmsDriver commands and condor configuration files and give you submit_condor_<TestDir>.sh to submit the condor jobs.

python3 manageRequests.py --localtest -D <TestDir> -n <nEvents> (-r <Run>) request.csv
source submit_condor_<TestDir>.sh

# an example for Run-3
python3 manageRequests.py --localtest -D TestDir -n 1000 -r run3 request.csv
source submit_condor_TestDir.sh
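The option choices described above can be summarised in a small helper that assembles the command line. This is a hedged sketch: it only builds and prints the command using the flags shown above, and the function name and its `has_filter_or_matching` argument are hypothetical.

```python
import shlex

# Era values named above; "run2_ul" is the default when -r is omitted.
VALID_RUNS = ("run2_ul", "run3", "2024")


def localtest_command(test_dir, has_filter_or_matching, run="run2_ul",
                      csv_file="request.csv"):
    """Build the local validation test command as a list of arguments."""
    if run not in VALID_RUNS:
        raise ValueError(f"run must be one of {VALID_RUNS}")
    # Around 5000 events with filters or jet matching/merging, 1000 otherwise.
    n_events = 5000 if has_filter_or_matching else 1000
    cmd = ["python3", "manageRequests.py", "--localtest",
           "-D", test_dir, "-n", str(n_events)]
    if run != "run2_ul":
        cmd += ["-r", run]  # needed for Run3 and 2024 requests
    cmd.append(csv_file)
    return cmd


print(shlex.join(localtest_command("TestDir", False, run="run3")))
```

Printing the command rather than executing it keeps the sketch safe to run outside the repository; copy the printed line into your shell in the cloned area.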
  • After the submitted jobs are finished, fetch the local validation test results.

python3 manageRequests.py --localtest -f <TestDir>/input.csv
  • If jobs ran fine, it will give you test_results.csv which should be used to make the request tickets.

python3 makeTicket.py -f <TestDir>/test_results.csv
  • A new directory named YYYYMMDD_CERNID_SampleInfo will be created in the requestTickets directory. e.g. 20210127_shjeon_DYJetsToLL

    1) YYYYMMDD : Date when the sample request is made, in year-month-day order.

    2) CERNID : CERN account.

    3) SampleInfo : Brief information about the sample.

  • The questions below will be asked (some may be skipped, depending on the campaign).

    1) Which EXO subgroup?

    2) Which campaign?

    3) Which year?

    4) [RunII-legacy] nEvents for 2016 and 2016APV

    5) [RunII-prelegacy, RunII-legacy] Need for higher priority

    6) [RunII-prelegacy] Need for AODSIM files
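The YYYYMMDD_CERNID_SampleInfo naming pattern of the ticket directory can be reproduced as follows. makeTicket.py creates the directory itself, so this sketch is only meant to show how the name is composed; `ticket_tag` is a hypothetical helper.

```python
from datetime import date


def ticket_tag(cern_id, sample_info, when=None):
    """Compose a tag of the form YYYYMMDD_CERNID_SampleInfo."""
    when = when or date.today()
    return f"{when.strftime('%Y%m%d')}_{cern_id}_{sample_info}"


# Reproduces the example directory name given above.
print(ticket_tag("shjeon", "DYJetsToLL", date(2021, 1, 27)))
```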

3. Make a merge request that includes the following.

  • Your directory YYYYMMDD_CERNID_SampleInfo in requestTickets, which contains the CSV file "request.csv" and the adoc file "request.adoc".

  • Your generator fragment if it is not already in the link.

  • The lines that are printed out after executing "makeTicket.py".

### Lines like below will be printed out for you.
### Do not forget to attach this when you make the pull request in gitlab.

Sample request information from shjeon
Tag : 20210713_shjeon_TEST
Subgroup : MET+X
Campaign : RunII-legacy
Year : 2016,2016APV,2017,2018
Events (per year/campaign) : 210000
[FLAG] Do not divide 2016 events into 2
[FLAG] High priority is needed

TEST,MET+X,20210713_shjeon_TEST,RunII-legacy,shjeon,-,20210713,-

git checkout -b <YourBranchForRequest> # e.g. DYJetsToLL_mass_binned
git add <PathToYourTicketIn 'requestTickets directory'>
git add <PathToYourGenFragmentsIn 'genFragments directory'>
git commit -m "<BriefSampleInfo>" # e.g. DYJetsToLL mass binned
git push origin <YourBranchForRequest>

### You can now make a pull request with your branch!

If anything is unclear, please contact us through email before making a merge request.
