EXO MC&Interpretation

How-to : HEPData Preparation


Last updated 2 months ago


Reinterpretation materials are required of each analysis in EXO. These can be classified into:

  • HepData Sandbox - Signal generation fragments, cutflows, yields in paper plots, etc.

  • Full Statistical Models - Datacards, ROOT files, instructions to set up and run Combine, etc.

Although these materials are submitted for review on similar timelines, their preparation and approval are slightly different.

HepData

From the HEPData webpage: The Durham High-Energy Physics Database (HEPData) has been built up over the past four decades as a unique open-access repository for scattering data from experimental particle physics. It currently comprises the data points from plots and tables related to several thousand publications including those from the Large Hadron Collider (LHC). HEPData is funded by a grant from the UK STFC and is based at the IPPP at Durham University.

The HepData Team in EXO

The HepData team in EXO is mainly the EXO HepData contact: Gianfranco de Castro (gdecastr@bu.edu). E-group address (including the MC&I conveners): cms-phys-exo-hepdata@cern.ch.

Submitting a HepData Entry

  1. Analyzers reach out to the EXO HepData contact to create an entry around the time of their Approval talk and CWR.

    • Please do this well in advance to avoid any delays.

  2. The EXO HepData contact will create a HepData entry for analyzers to submit the materials for review.

    • Please submit materials to the entry created by the HepData team and not your own personal entry.

  3. Upon review, one of two things happens:

    • ✅ The HepData entry gets the green light if all of the materials below are included.

    • ❌ Feedback will be given to analyzers to include missing materials or improve the entry.

  4. After the green light is given, the record is expected to come out around the time of the arXiv submission / INSPIRE entry creation.

    • The HepData link should be added to the paper draft before submission. It is made public after the paper is public.

Required HepData Materials

Please include all of these materials in the entry.

  1. MC Generator inputs

    • Signal Gridpacks

    • Corresponding cross sections and k-factors

  2. Cutflow tables

    For each signal model used in the analysis, please include the relative and absolute selection efficiencies after each region's event-level cuts. Please provide either the statistical uncertainty associated with each efficiency or the gen-level yields for each signal sample.

    This is a minimal requirement for reinterpretation; other inputs lose their value if this is not shared.

    Scientists external to CMS reinterpreting our results usually do not have much computing power. As full detector simulation with GEANT4 is computationally expensive, they often use fast reconstruction (such as Delphes) and public analysis frameworks (such as MadAnalysis) to reinterpret our results. As a result, cutflow tables are necessary in order to validate their processing of signal samples (generation, reconstruction, and selection).

    Note on NNs: The process of making neural network architectures/weights public is still being discussed internally by CMS higher-ups. In the meantime, the efficiency of a NN selection on the signal should still be included in these cutflow tables.

    Example: Take a look at the "Signal Efficiency CSC Category" and "Signal Efficiency DT Category" tables.

  3. Information from Relevant Figures

    Numbers shown in plots from the paper should be included in the HepData entry if they contain:

    • Upper limits on the analysis' figure of merit

    • Background, signal and data yields in regions used to derive these ULs

    • Other relevant figures used for reinterpretation

    Additional information from other figures can be included at the author's discretion but is not necessary.
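The relative and absolute efficiencies requested for the cutflow tables can be computed along the following lines. This is only a sketch: the function name, the cut names, and the yields are hypothetical, and it assumes unweighted signal events, so the statistical uncertainty on the absolute efficiency is taken as a simple binomial uncertainty. Analyses using weighted events would need a different uncertainty treatment.

```python
import math


def cutflow_efficiencies(n_generated, yields_after_cuts):
    """Build cutflow rows for one signal sample.

    n_generated: number of generated (gen-level) signal events.
    yields_after_cuts: list of (cut_name, events_passing), in the order
        the cuts are applied.
    """
    rows = []
    previous = n_generated
    for cut_name, passed in yields_after_cuts:
        # Absolute efficiency: with respect to all generated events.
        absolute = passed / n_generated
        # Relative efficiency: with respect to the previous cut.
        relative = passed / previous if previous > 0 else 0.0
        # Binomial statistical uncertainty on the absolute efficiency
        # (assumption: unweighted events).
        abs_unc = math.sqrt(absolute * (1.0 - absolute) / n_generated)
        rows.append({
            "cut": cut_name,
            "absolute_eff": absolute,
            "absolute_eff_unc": abs_unc,
            "relative_eff": relative,
        })
        previous = passed
    return rows


# Hypothetical yields for a single signal point.
example_rows = cutflow_efficiencies(
    10000, [("trigger", 5000), ("jet pT", 2000), ("MET", 500)]
)
for row in example_rows:
    print(row["cut"], row["absolute_eff"], row["absolute_eff_unc"],
          row["relative_eff"])
```

Quoting the gen-level yield alongside these numbers lets external users recompute the uncertainties themselves.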

Optional HepData Materials

  1. Covariance matrices

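    Covariance matrices capture how the uncertainties on the event yields are correlated across bins and regions. Now that Combine is public, we are moving away from encouraging these materials. However, in cases where the full likelihood cannot be provided, they are essential for building the simplified likelihood. See the CMS Note on simplified likelihoods and how they use the covariance matrix in this process, and the Combine Tutorial for generating covariance matrices in your analysis. Example: Take a look at "Background Covariance Matrix (e channel)".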
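To illustrate why the covariance matrix matters, here is a minimal sketch of a simplified likelihood. It assumes the commonly used form in which each bin contributes a Poisson term with expectation mu * s_i + b_i + theta_i, and the background nuisances theta are constrained by a multivariate Gaussian with the published covariance. The function names and toy numbers are illustrative only and are not part of Combine or any CMS tool; the CMS Note remains the reference for the exact prescription.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def simplified_nll(mu, theta, observed, signal, background, covariance):
    """Negative log of the simplified likelihood, up to constant terms."""
    nll = 0.0
    for n_obs, s, b, t in zip(observed, signal, background, theta):
        expected = mu * s + b + t
        if expected <= 0:
            return float("inf")
        # Poisson term: -log Pois(n_obs | expected), dropping log(n_obs!).
        nll += expected - n_obs * math.log(expected)
    # Gaussian constraint: 0.5 * theta^T V^-1 theta.
    v_inv_theta = solve_linear(covariance, theta)
    nll += 0.5 * sum(t * w for t, w in zip(theta, v_inv_theta))
    return nll


# Toy two-bin example with correlated background uncertainties.
toy_observed = [12, 25]
toy_signal = [3.0, 4.0]
toy_background = [10.0, 20.0]
toy_covariance = [[4.0, 1.5], [1.5, 9.0]]
for mu in (0.0, 1.0):
    print(mu, simplified_nll(mu, [0.0, 0.0], toy_observed, toy_signal,
                             toy_background, toy_covariance))
```

In practice the nuisances theta would be profiled (minimized over) for each value of mu; the sketch only evaluates the function at fixed theta.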

Full Statistical Models

There is a central CMS effort to publish full statistical models now that Combine is public. This includes publishing the datacards and ROOT files required to produce the main results of the analysis, along with a short tutorial on how to run the necessary fits with Combine.

The Full Statistical Model Team in EXO

The team for publishing the full statistical models in EXO includes the EXO HepData and Combine contacts. HepData: Gianfranco de Castro (gdecastr@bu.edu). Combine: Cesare Tiziano Cazzaniga (cesare.cazzaniga@cern.ch).

Submitting a Full Statistical Model - Setting up Datacards

  1. Reach out to the Combine contact to make an entry for your analysis using its CADI around the time of your Approval talk and CWR.

  2. Follow the dedicated tutorial to set up your analysis' repository. Any datacards and ROOT files required to exactly reproduce results in the paper must be included.

    • Renaming systematics to the common style

    • Running checks on the datacards/ROOT files using the built-in CI

  3. Reach out to the Combine contact for review. One of two things happens:

    • ✅ The repository gets the green light if everything looks good.

    • ❌ Feedback will be given to analyzers if necessary.

Submitting a Full Statistical Model - Setting up Tutorials and Additional Materials

  1. Populate the ReadMe in the repository approved by the Combine contact with instructions on:

    • Setting up and running Combine software, including commands (or helper scripts) needed to reproduce results quoted in the paper.

    • Optional: Tutorials on setting up and running any additional reinterpretation materials. This includes software for sample generation (e.g. Pythia), public reconstruction software (e.g. Delphes), and public analyzer software (e.g. MadAnalysis).

  2. Reach out to the HepData contact for review. One of two things happens:

    • ✅ The materials get the green light if everything looks good.

    • ❌ Feedback will be given to analyzers if necessary.

Submitting a Full Statistical Model - Publishing

Reach out to CAT saying your repository is ready to publish. A new page on repository.cern.ch will be created for your analysis.

    • The datacards, ROOT files, and any helper scripts from the CADI's entry will be compressed into a tarball included in your analysis' page.

    • The ReadMe from the CADI's entry will be displayed in Markdown formatting.

Before the analysis is public, CAT can show you a preview of what the public analysis page will look like. Once your paper is made public, let them know and they can make your analysis' page public.

Relevant Talks

  • Talk on publishing full statistical models at the EXO General Meeting in March 2024

  • Talk on HEPData at the EXO Welcoming Meeting in October 2024

  • Talk on HEPData preparation at the EXO General Meeting in June 2022

  • Talk on HEPData preparation at the CMS Week in April 2022

  • Talk on reinterpretation at the CMS EXO Workshop 2021

Context on Additional Reinterpretation Materials

  1. Sample Generation

    Providing the Pythia fragments to generate samples used in the analysis is important for scientists external to our collaboration to be able to reinterpret our results. After samples are generated, passed through fast reconstruction, and have analysis-level selections applied, the entire process can be validated by cross-checking with the cutflow efficiencies provided in HepData.

  2. Reconstruction

    It's often difficult for reconstruction tools (such as Delphes) to fully reproduce the shape and normalization of analysis variables. Reconstruction software can try to mimic the behavior of the detector by parametrizing its effects in terms of variables such as kinematics (pT, eta, etc.).

    An example of this can be seen in the nominal Delphes card. Although optimized for Run I, the parametrization is a good rough estimate of the effects of different parts of the detector on the reconstruction of objects such as leptons and jets. Analyzers are encouraged to optimize/modify these Delphes cards to meet the needs of their analysis, so that scientists external to CMS can accurately reinterpret our results.

    Central CMS efforts are underway to optimize these cards for detector conditions in Run II, Run III, and Phase II.

  3. Analysis Code

    Analyzers also have to apply event-level and object selections on the reconstructed events. There exist public frameworks, such as MadAnalysis, which give analyzers the tools to apply these selections. After doing so, analyzers can validate that their workflow correctly replicates results from the paper in question by cross-checking the cutflow efficiencies reported in HepData with those derived from the generation, reconstruction, and analysis selection workflow.

    An example of an analysis doing this is the MadAnalysis materials created for EXO-20-004 by A. Albert. No Delphes card tuning was needed for this analysis, but they produced MadAnalysis code for scientists external to CMS to be able to replicate and reinterpret paper results with their own signal models.
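The cutflow cross-check described under Analysis Code (comparing efficiencies from a generation, fast reconstruction, and selection workflow against those reported in HepData) could be sketched as follows. The function name, the cut names, the efficiencies, and the 20% tolerance are all hypothetical choices for illustration; an appropriate tolerance depends on the analysis and on how well the fast simulation is expected to perform.

```python
def compare_cutflows(published, recast, tolerance=0.2):
    """Compare recast efficiencies against published ones, cut by cut.

    published, recast: dicts mapping cut name -> absolute efficiency.
    tolerance: maximum accepted relative difference (assumed value).
    """
    report = []
    for cut, eff_published in published.items():
        eff_recast = recast[cut]
        # Relative difference of the recast with respect to the published value.
        rel_diff = (eff_recast - eff_published) / eff_published
        report.append({
            "cut": cut,
            "published": eff_published,
            "recast": eff_recast,
            "rel_diff": rel_diff,
            "within_tolerance": abs(rel_diff) <= tolerance,
        })
    return report


# Hypothetical efficiencies: values as read from a HepData cutflow table
# versus values obtained from a fast-simulation workflow.
published_eff = {"trigger": 0.50, "jet pT": 0.20, "MET": 0.05}
recast_eff = {"trigger": 0.48, "jet pT": 0.21, "MET": 0.07}
for entry in compare_cutflows(published_eff, recast_eff):
    status = "ok" if entry["within_tolerance"] else "CHECK"
    print(entry["cut"], round(entry["rel_diff"], 3), status)
```

A cut flagged for checking usually points to the step (object reconstruction or a specific selection) where the fast simulation needs tuning.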