How-to: Sample Production

How to produce your private samples

RunII-Legacy (Ultra Legacy) Campaign Steps

  1. (wmLHE)GEN : Produces generator-level (matrix element, parton shower) events with the MC generators.

    • GEN : Sample production with generator python fragments (Pythia, Sherpa, ...).

    • wmLHEGEN : Sample production with gridpacks (MadGraph, Powheg, ...) and hadronizer python fragments (Pythia, Herwig, ...).

  2. SIM : Simulates the energy deposited by the particles in the detectors they traverse. The beamspot and detector geometry are taken into account as parameters.

  3. DIGIPremix : The simulated detector signals are digitised and the prepared pileup events are overlaid/premixed onto the samples.

  4. HLT : The HLT algorithms perform regional reconstruction and make trigger decisions using HLT-level objects.

  5. RECO : Physics objects (e.g. muons, jets) are reconstructed. The output of this step is generally in the AODSIM format.

  6. MiniAOD : The RECO-level events are skimmed and reduced in size by running the MiniAOD module, which keeps the physics-object information that can be used directly in analyses.

  7. NanoAOD : The MiniAOD-level events are further skimmed and reduced in size by running the NanoAOD module. The idea of the NanoAOD format is to have a plain ROOT file that can be analysed outside any CMSSW environment.

How to Produce Samples

Please send us an email (cms-exo-mci@cern.ch) if you happen to use this; we would like to know how helpful it is, and hopefully make updates for RunIII if it does indeed help.

Automated scripts that help analysers produce their RunII-Legacy (Ultra Legacy) samples privately using CRAB job submissions.

Instructions

1. Git clone the EXO-MCsampleProductions GitLab repository.

2. Prepare a CSV file with the following comma-separated inputs, in this order, to run the (wmLHE)GEN step:

  1. DATASETNAME : Name of the dataset that will be used for DAS publication.

  2. GENFRAGMENT : Generator fragment python file that will be used as the generator or hadronizer. It should be stored in the skeleton/genfragments directory.

  3. NEVENTS : Total number of events to be produced. Jet-matching or filter efficiencies should be taken into account: e.g. if the matching efficiency is 0.4 and you want 10000 events to be produced, request 10000 × 1/0.4 = 25000.

  4. NSPLITJOBS : Number of CRAB jobs the production will be split into. e.g. NEVENTS=25000 with NSPLITJOBS=25 will run 1000 events per CRAB job.

  5. GRIDPACK : Path to the gridpack, if one is used. It should be located in the CVMFS area.
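As a sketch of the inputs above, the snippet below scales NEVENTS by the matching efficiency from the example in the text and writes one CSV row. The dataset name, fragment file, and gridpack path are hypothetical placeholders, not values from the repository:

```shell
# Scale the target statistics by the matching/filter efficiency
# (example numbers from the text: 10000 target events, efficiency 0.4).
TARGET=10000
EFF=0.4
NEVENTS=$(awk -v t="$TARGET" -v e="$EFF" 'BEGIN { printf "%.0f", t / e }')
NSPLITJOBS=25
PERJOB=$((NEVENTS / NSPLITJOBS))   # events per CRAB job

# Hypothetical dataset name, fragment, and gridpack path -- placeholders only.
cat > wmLHEGEN_inputs.csv <<EOF
MyModel_M1000,fragment_MyModel.py,${NEVENTS},${NSPLITJOBS},/cvmfs/cms.cern.ch/phys_generator/gridpacks/my_gridpack.tar.xz
EOF
echo "Requesting ${NEVENTS} events in ${NSPLITJOBS} jobs (${PERJOB} events/job)"
```

One CSV row per dataset; add further rows to produce several samples in one go.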

3. Execute the setup.py file to build the CMSSW releases and choose the Tier2/3 sites where the samples produced through your CRAB jobs will be stored.

You should check that you have write permission at the Tier2/3 sites you chose by running CRAB's checkwrite command.
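A minimal sketch of the permission check, assuming a CRAB3 environment and a valid grid proxy; the site name below is a placeholder, so substitute the site you actually selected:

```shell
# Requires a sourced CRAB3 environment and a valid grid proxy.
# T2_CH_CERN is a placeholder; use your chosen Tier2/3 storage site.
crab checkwrite --site=T2_CH_CERN
```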

4. Move to the <Simulation>/<Campaign>/(wmLHE)GEN__<CMSSW> directory and execute the config_(wmLHE)GEN.py file to build the cmsDriver commands and CRAB configuration files.

5. When everything goes well, a submit_crab_<CSV name>.sh script will be generated for submitting the CRAB jobs.

6. Once the CRAB jobs for the (wmLHE)GEN step are finished, prepare a CSV file with the following comma-separated inputs, in this order:

  1. DATASETNAME : Name of the dataset that will be used for DAS publication.

  2. OUTPUTDATASET : Published DAS dataset path from the previous step. You can get this by executing the crab status command.
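As an illustration, a SIM-step CSV row could look like the following. Both the dataset name and the published DAS path are hypothetical placeholders; the real OUTPUTDATASET (a /primary/username-publishname-hash/USER path) comes from crab status once the (wmLHE)GEN output has been published:

```shell
# Hypothetical values -- replace with your dataset name and the DAS path
# reported by `crab status` for the finished (wmLHE)GEN jobs.
cat > SIM_inputs.csv <<EOF
MyModel_M1000,/MyModel_M1000/myusername-wmLHEGEN-00000000000000000000000000000000/USER
EOF
cat SIM_inputs.csv
```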

7. The remaining steps, from SIM to NanoAOD, work the same way: a CSV file with the dataset name and the path to the published DAS dataset from the previous step should be given.

If anything is unclear, please contact Sihyun Jeon (shjeon@cern.ch, Skype : sihyun_jeon) before starting the production.
