# How-to : Sample Request

## Sample Request Categorisation

1. **GEN** : Sample production with generator python3 fragments (Pythia, Sherpa, ...).
2. **wmLHEGEN** : Sample production with gridpacks (MadGraph, Powheg, ...; **wmLHE**) and hadronizer python3 fragments (Pythia, Herwig, ...; **GEN**).
3. **pLHEGEN** : Sample production with private LHE files (MadGraph, QBH, ...; **pLHE**) and hadronizer python3 fragments (Pythia, Herwig, ..., **GEN**).

{% hint style="warning" %}
For pLHEGEN requests, please provide us with your private LHE files, uploaded to lxplus. We also need a good reason for this type of request,\
e.g. there is no gridpack generation workflow for QBH, or you would like to use the newest release of MadGraph with special functionality.
{% endhint %}

4\. **SIM** : The step right after GEN for detector simulation.

5\. **GS** : In Run3, **GEN** and **SIM** steps are combined into one "**GS**" step.

For more details, take a look at <https://cms-pdmv.gitbook.io/project/mccontact>.

![General flow-chart of MC request submission (courtesy of D. Sheffield)](/files/Qs6D6cUJe1pQtotxXbit)

## How to Request for Sample

### **Changes in Sample Request Procedures**

![EXO sample request workflow changes from 2021](/files/TfzqNMn1955UVlL7X7v2)

* Until 2020, EXO-MC\&I accepted sample requests without local validation test results. The EXO MC contacts ran the local validation test "after" the prepid was created in McM. Once the test was done, the results were injected into McM and the McM validation was triggered.
* **From 2021, we ask the requestors to run the local validation test before making the sample request and reaching out to us.** This speeds up the sample request process by avoiding unnecessary discussions of obvious mistakes on the requestor's side, such as missing information in the dataset name, an incorrectly created generator fragment file, corrupted gridpacks, etc.

{% hint style="info" %}
**RunII-prelegacy** : RunIISummer15, RunIIFall17, RunIIFall18 campaigns.\
**RunII-legacy** : RunIISummer20UL16APV, RunIISummer20UL16, RunIISummer20UL17, RunIISummer20UL18 campaigns. There are two campaigns for the 2016 dataset (pre- and post-VFP fix).

**Run3** : **Run3Summer22, Run3Summer22EE, Run3Summer23, Run3Summer23BPix, RunIII2024Summer24**
{% endhint %}

{% hint style="info" %}
Until 2020, what we now call RunII-prelegacy was called "RunII-legacy", and what we now call RunII-legacy was called "Ultra Legacy". Keep these terminology changes in mind.
{% endhint %}

{% hint style="danger" %}
For sample requests that need special treatment of the cmsDriver commands, please send the EXO-MC\&I convenors an email in advance.
{% endhint %}

{% hint style="danger" %}
For sample requests that are not RunII-legacy (= Ultra Legacy) requests, you don't need to run the local validation test. A pull request with a correctly formatted CSV file and the generator fragments, but without local validation test results, is fine (details below).
{% endhint %}

### Instructions

#### 1. Prepare a CSV file containing the following information with name **request.csv**.

1. **Dataset name** : The physics process, the tune used for showering/hadronisation, and the generator used should be included.\
   *e.g.* DYJetsToLL\_M-50\_TuneCP5\_13TeV\_madgraphMLM-pythia8. Instructions on how to define the dataset names are detailed here: [Run 2](https://cms-pdmv.gitbook.io/project/mccontact/rules-for-dataset-names), [2022 and 2023](https://cms-pdmv.gitbook.io/project/mccontact/rules-for-run3-dataset-names), and [2024](https://cms-pdmv.gitbook.io/project/mccontact/rules-for-run3-2024-dataset-names).
2. **Total events** : Number of events to be produced after matching/filtering. *e.g.* If the matching efficiency is 0.4 and you want 10000 events to be produced, write 10000; do not write 10000 × 1/0.4 = 25000.
3. **Generator fragment name** : Generator fragment used for showering/hadronisation.\
   *e.g.* genFragments/Hadronizer/13TeV/Hadronizer\_TuneCP5\_13TeV\_MLM\_5f\_max4j\_qCut19\_LHE\_pythia8\_cff.py
4. **Generator** : Generator used for ME level calculations.\
   *e.g.* madgraph
5. **Gridpack location** : Path to the gridpack in an lxplus public area or in CVMFS, if a gridpack is used.\
   *e.g.* /cvmfs/cms.cern.ch/phys\_generator/gridpacks/UL/13TeV/madgraph/V5\_2.6.5/dyellell01234j\_5f\_LO\_MLM\_v2/DYJets\_HT-incl\_slc6\_amd64\_gcc630\_CMSSW\_9\_3\_16\_tarball.tar.xz

{% hint style="info" %}
If you are using gridpacks that are not currently stored in CVMFS, include their current paths in the CSV file. Once you submit the merge request, we will copy the new gridpacks to CVMFS and let you know the new paths so that the CSV can be updated.
{% endhint %}
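
The rule for the **Total events** field in the list above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the helper name `events_before_matching` is hypothetical, and it assumes that the matching/filtering efficiency is applied later in the production chain, which is why only the post-matching number belongs in the CSV.

```python
import math


def events_before_matching(requested_events: int, efficiency: float) -> int:
    """Events that must be generated so that `requested_events` survive matching/filtering."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return math.ceil(requested_events / efficiency)


requested = 10000   # the number to write in request.csv
efficiency = 0.4    # matching efficiency from the example above

print(f"Write in request.csv : {requested}")
print(f"Do NOT write         : {events_before_matching(requested, efficiency)}")
```

The second number is shown only to make the distinction explicit; it should never appear in the request.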

An example of a well-written CSV file : [link](https://gitlab.cern.ch/cms-exo-mci/EXO-MCsampleRequests/-/blob/master/requestTickets/RunII-legacy/MET+X/20210128_dperezad_bbHToZa/request.csv).
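
If you prefer to generate the CSV with a script, the sketch below assembles one row from the five fields listed above. The column order and the absence of a header row are assumptions taken from the order of that list, and the helper names (`make_request_row`, `rows_to_csv`) are hypothetical; compare the output with the linked example before using it.

```python
import csv
import io

# Assumption: columns follow the order of the list above (check the linked example).
FIELDS = ["dataset_name", "total_events", "fragment", "generator", "gridpack"]


def make_request_row(dataset_name, total_events, fragment, generator, gridpack=""):
    """Return one CSV row as a list of strings, rejecting empty mandatory fields."""
    row = [dataset_name, str(int(total_events)), fragment, generator, gridpack]
    # The gridpack path is only needed when a gridpack is used, so it may stay empty.
    for name, value in zip(FIELDS[:4], row[:4]):
        if not value.strip():
            raise ValueError(f"missing mandatory field: {name}")
    return row


def rows_to_csv(rows):
    """Serialise the rows to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


row = make_request_row(
    "DYJetsToLL_M-50_TuneCP5_13TeV_madgraphMLM-pythia8",
    10000,
    "genFragments/Hadronizer/13TeV/Hadronizer_TuneCP5_13TeV_MLM_5f_max4j_qCut19_LHE_pythia8_cff.py",
    "madgraph",
    "/cvmfs/cms.cern.ch/phys_generator/gridpacks/UL/13TeV/madgraph/V5_2.6.5/"
    "dyellell01234j_5f_LO_MLM_v2/DYJets_HT-incl_slc6_amd64_gcc630_CMSSW_9_3_16_tarball.tar.xz",
)
print(rows_to_csv([row]), end="")
```

Writing the text returned by `rows_to_csv` to a file named `request.csv` gives a starting point; the local validation test remains the authoritative format check.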

#### 2. Run local validation test.

* Fork the **EXO-MCsampleRequests** gitlab repository [link](https://gitlab.cern.ch/cms-exo-mci/EXO-MCsampleRequests).
* Git clone your forked **EXO-MCsampleRequests** gitlab repository.

```bash
git clone ssh://git@gitlab.cern.ch:7999/<YourAccount>/EXO-MCsampleRequests.git
cd EXO-MCsampleRequests
```

{% hint style="danger" %}
If you have already forked the repository in the past, please check out the official master branch before adding your changes.

```bash
git remote add official https://gitlab.cern.ch/cms-exo-mci/EXO-MCsampleRequests.git
git fetch official
git checkout official/master
```

{% endhint %}

* Move your **request.csv** to the git cloned area.

```bash
mv <PathToCSVFile> ./request.csv
```

* Run the local validation test to measure size/time per event and matching/filtering efficiencies. It first checks whether your CSV file follows the McM naming convention rules, has all the needed information, and is formatted correctly.\
  1\) TestDir : Name of the directory in which to run the local validation test.\
  2\) nEvents : Number of events to run in the local validation test. If you have filters or jet matching/merging in your generator fragments, set it to around 5000. If not, 1000 events are enough in most cases.\
  3\) Run : The era used for the local test: "**run2\_ul**", "**run3**" (2022 and 2023), or "**2024**". "**run2\_ul**" is the default when the "-r" option is not provided, so for Run3 requests "**-r run3**" or "**-r 2024**" is *necessary*.
* If everything is fine, it will build cmsDriver commands and condor configuration files giving you **submit\_condor\_\<TestDir>.sh** to submit the condor jobs.

```bash
# The -r option is optional; run2_ul is used when it is omitted.
python3 manageRequests.py --localtest -D <TestDir> -n <nEvents> [-r <Run>] request.csv
source submit_condor_<TestDir>.sh

# an example for Run-3
python3 manageRequests.py --localtest -D TestDir -n 1000 -r run3 request.csv
source submit_condor_TestDir.sh
```
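
The choice of `-n` and `-r` described above can be captured in a small helper that prints the command to run. This is a sketch only: `localtest_command` is a hypothetical name, it uses only the options mentioned on this page (`--localtest`, `-D`, `-n`, `-r`), and the 5000/1000 event counts are the rule-of-thumb values from the text rather than hard requirements.

```python
import shlex

VALID_ERAS = ("run2_ul", "run3", "2024")  # era values listed in the text above


def localtest_command(test_dir, has_filter_or_matching, era="run2_ul", csv_path="request.csv"):
    """Build the manageRequests.py local-test command as a list of arguments."""
    if era not in VALID_ERAS:
        raise ValueError(f"unknown era {era!r}; expected one of {VALID_ERAS}")
    # Around 5000 events with filters or jet matching/merging, 1000 otherwise.
    n_events = 5000 if has_filter_or_matching else 1000
    command = ["python3", "manageRequests.py", "--localtest",
               "-D", test_dir, "-n", str(n_events)]
    # run2_ul is the default, so -r is only added for the other eras.
    if era != "run2_ul":
        command += ["-r", era]
    command.append(csv_path)
    return command


print(shlex.join(localtest_command("TestDir", has_filter_or_matching=False, era="run3")))
```

For the Run-3 case this prints the same command as the example above; after running it, source the generated `submit_condor_<TestDir>.sh` as usual.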

* After the submitted jobs are finished, fetch the local validation test results.

```bash
python3 manageRequests.py --localtest -f <TestDir>/input.csv
```

* If jobs ran fine, it will give you **test\_results.csv** which should be used to make the request tickets.

```bash
python3 makeTicket.py -f <TestDir>/test_results.csv
```

{% hint style="danger" %}
**If your sample request is for RunII-prelegacy, you don't need to run the local validation test.**

```bash
python3 makeTicket.py -f <TestDir>/input.csv
```

{% endhint %}

* A new directory named **YYYYMMDD\_CERNID\_SampleInfo** will be created in the **requestTickets** directory.\
  1\) YYYYMMDD : Date when the sample request is made, in year-month-day order.\
  2\) CERNID : CERN account.\
  3\) SampleInfo : Brief information about the sample.\
  *e.g. 20210127\_shjeon\_DYJetsToLL*
* The questions below will be asked (some might not be, depending on the campaign).\
  1\) Which EXO subgroup?\
  2\) Which campaign?\
  3\) Which year?\
  4\) \[RunII-legacy] nEvents for 2016 and 2016APV\
  5\) \[RunII-prelegacy, RunII-legacy] Need for higher priority\
  6\) \[RunII-prelegacy] Need for AODSIM files
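
The **YYYYMMDD\_CERNID\_SampleInfo** naming convention described above can be reproduced as follows, for example to predict or check the directory name. `makeTicket.py` creates the directory itself; the function names here are hypothetical, and the restriction to non-empty parts without whitespace is an assumption, not a documented rule.

```python
from datetime import date, datetime


def ticket_directory_name(request_date, cern_id, sample_info):
    """Build YYYYMMDD_CERNID_SampleInfo from its three parts."""
    for label, value in (("CERNID", cern_id), ("SampleInfo", sample_info)):
        # Assumption: each part is non-empty and contains no whitespace.
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"{label} must be non-empty and contain no whitespace")
    return f"{request_date.strftime('%Y%m%d')}_{cern_id}_{sample_info}"


def split_ticket_directory_name(name):
    """Split a directory name back into (date, CERNID, SampleInfo)."""
    stamp, cern_id, sample_info = name.split("_", 2)
    return datetime.strptime(stamp, "%Y%m%d").date(), cern_id, sample_info


name = ticket_directory_name(date(2021, 1, 27), "shjeon", "DYJetsToLL")
print(name)
print(split_ticket_directory_name(name))
```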

#### 3. Make a pull request to central gitlab [link](https://gitlab.cern.ch/cms-exo-mci/EXO-MCsampleRequests) with the following items committed.

* Your directory **YYYYMMDD\_CERNID\_SampleInfo** in requestTickets, which contains the CSV file "request.csv" and the adoc file "request.adoc".
* Your generator fragment, if it is not already in the genFragments directory ([link](https://gitlab.cern.ch/cms-exo-mci/EXO-MCsampleRequests/-/tree/master/genFragments)).
* The lines that are printed out after executing "makeTicket.py".

```bash
### Lines like below will be printed out for you.
### Do not forget to attach this when you make the pull request in gitlab.

Sample request information from shjeon
Tag : 20210713_shjeon_TEST
Subgroup : MET+X
Campaign : RunII-legacy
Year : 2016,2016APV,2017,2018
Events (per year/campaign) : 210000
[FLAG] Do not divide 2016 events into 2
[FLAG] High priority is needed

TEST,MET+X,20210713_shjeon_TEST,RunII-legacy,shjeon,-,20210713,-
```

```bash
git checkout -b <YourBranchForRequest> # e.g. DYJetsToLL_mass_binned
git add <PathToYourTicketIn 'requestTickets directory'>
git add <PathToYourGenFragmentsIn 'genFragments directory'>
git commit -m "<BriefSampleInfo>" # e.g. DYJetsToLL mass binned
git push origin <YourBranchForRequest>

### You can now make a pull request with your branch!
```

{% hint style="danger" %}
Do not include your \<TestDir>, which was used for the local validation test, in the pull request.
{% endhint %}

{% hint style="warning" %}
\[RunII-legacy] Unless noted otherwise ("N"), the number of events for 2016 and for 2016APV will each be half of that for 2017 and 2018. If you want the number of events to be equal for all 4 campaigns ("Y"), please send the EXO-MC\&I convenors an email for confirmation.

```
 Y : nEvents(2016) = nEvents(2016APV) = nEvents(2017) = nEvents(2018).
 N : nEvents(2016) = nEvents(2016APV) = nEvents(2017)/2 = nEvents(2018)/2. (recommended)
```

{% endhint %}
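
The Y/N rule for RunII-legacy requests can be written out as a short calculation. This is a sketch under stated assumptions: `events_per_campaign` is a hypothetical helper, the input is taken to be the number of events for 2017 and 2018, and rounding an odd number down when halving is an arbitrary choice made here.

```python
def events_per_campaign(n_events, equal_for_all=False):
    """Events per RunII-legacy campaign, given the number for 2017 and 2018.

    equal_for_all=True  corresponds to "Y" (same number everywhere).
    equal_for_all=False corresponds to "N" (2016 and 2016APV get half each).
    """
    # Assumption: odd numbers are rounded down when halved.
    n_2016 = n_events if equal_for_all else n_events // 2
    return {"2016": n_2016, "2016APV": n_2016, "2017": n_events, "2018": n_events}


print(events_per_campaign(210000))                      # "N", the recommended default
print(events_per_campaign(210000, equal_for_all=True))  # "Y", needs convenor confirmation
```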

{% hint style="warning" %}
\[RunII-prelegacy, RunII-legacy] High priority should be discussed with the relevant convenors in advance. Please send EXO-MC\&I convenors an email for confirmation.
{% endhint %}

{% hint style="warning" %}
\[RunII-prelegacy] AODSIM files are not stored unless requested. Please send the EXO-MC\&I convenors an email for confirmation.\
\[RunII-legacy] AODSIM files are kept by default.
{% endhint %}

**If anything is unclear, please contact us through email before making a merge request.**

