How-to : HEPData Preparation
Last updated
"The Durham High-Energy Physics Database (HEPData) has been built up over the past four decades as a unique open-access repository for scattering data from experimental particle physics. It currently comprises the data points from plots and tables related to several thousand publications including those from the Large Hadron Collider (LHC). HEPData is funded by a grant from the UK STFC and is based at the IPPP at Durham University." (From the HEPData webpage)
The HEPData team in EXO consists mainly of the EXO HEPData contact, Gianfranco De Castro (Gianfranco.De.Castro@cern.ch). The e-group address, which also includes the MC&I conveners, is cms-phys-exo-hepdata@cern.ch.
Analysts contact the EXO HEPData team to request an entry (suggested timing: between Approval and CWR).
The EXO HEPData team creates a HEPData entry to which the analysts submit their materials for review.
Once the materials are reviewed, the HEPData team gives the entry its blessing.
The record is then expected to become public around the time of the arXiv submission and the INSPIRE entry creation.
Talk on reinterpretations at the CMS EXO Workshop 2021 link
Talk on HEPData preparations at the CMS Week in April 2022 link
Talk on HEPData preparations at the EXO General Meeting in June 2022 link
Talk on HEPData at the EXO Welcoming Meeting in October 2024 link
Cutflow tables
The selection efficiency of the signal process after each event selection cut of a signal region. This is the minimal requirement for reinterpretation: if it is not shared, the other inputs are of little use. Reinterpreters usually do not have much computing power, and full GEANT4 simulation is computationally expensive, so they often use fast simulation and recasting tools (DELPHES, RIVET, MADANALYSIS, etc.) to recast the actual data analyses. Cutflow tables are therefore what they need in order to validate the signal samples used for recasting.
example1 : Take a look at "Cut-flow table mN=0.2TeV, boosted, e channel, 2016" and others
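As an illustration of what a cutflow table contains, the sketch below turns a list of event counts into cumulative and step-by-step efficiencies. The cut names and numbers are hypothetical, and the function `cutflow_efficiencies` is not part of any HEPData tool; it only shows the kind of quantities a reinterpreter would look for.

```python
def cutflow_efficiencies(cutflow):
    """Compute efficiencies for an ordered cutflow.

    cutflow: list of (cut name, events passing), in the order the cuts
    are applied. The first entry is taken as the baseline.
    Returns a list of (name, passed, cumulative eff., relative eff.).
    """
    if not cutflow:
        raise ValueError("cutflow is empty")
    total = cutflow[0][1]
    previous = total
    rows = []
    for name, passed in cutflow:
        # Cumulative: with respect to the baseline (first row).
        cumulative = passed / total if total else 0.0
        # Relative: with respect to the previous cut.
        relative = passed / previous if previous else 0.0
        rows.append((name, passed, cumulative, relative))
        previous = passed
    return rows


# Hypothetical signal yields, for illustration only.
example = [
    ("All events", 10000),
    ("Trigger", 8200),
    ("Lepton selection", 6150),
    ("Signal region", 2460),
]

rows = cutflow_efficiencies(example)
for name, passed, cumulative, relative in rows:
    print(f"{name:<20} {passed:>8} {cumulative:8.3f} {relative:8.3f}")
```

Providing both the cumulative and the relative efficiency makes it easier to spot which individual cut a recast fails to reproduce.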
Covariance matrices and event yield tables
Covariance matrices describe how the uncertainties on the predicted yields are correlated across bins and signal regions, and are helpful when full likelihoods cannot be provided. Together with the event yield tables, and under the assumption of Gaussian-distributed uncertainties, they can be used to construct simplified likelihoods.
example1 : CMS notes on simplified likelihoods
example2 : Take a look at "Background Covariance Matrix (e channel)" and others
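To show how a covariance matrix and yield tables can be combined, here is a minimal sketch of a Gaussian test statistic built from observed yields, background yields, signal yields and a background covariance matrix. It is a further simplification and not the likelihood defined in the CMS note: the Poisson fluctuation of each bin is approximated by adding the expected yield to the diagonal of the covariance. All numbers are hypothetical, and the small `solve` helper stands in for a linear algebra library.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def gaussian_chi2(observed, background, signal, covariance, mu=1.0):
    """Return r^T V_tot^-1 r with r = observed - (background + mu * signal).

    V_tot is the background covariance plus the expected yield on the
    diagonal (Gaussian approximation of the Poisson variance).
    """
    n = len(observed)
    expected = [b + mu * s for b, s in zip(background, signal)]
    residual = [o - e for o, e in zip(observed, expected)]
    total = [
        [covariance[i][j] + (expected[i] if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    weighted = solve(total, residual)
    return sum(r * w for r, w in zip(residual, weighted))


# Hypothetical two-bin example, for illustration only.
observed = [105, 48]
background = [100.0, 50.0]
signal = [10.0, 5.0]
covariance = [[25.0, 10.0], [10.0, 16.0]]

for mu in (0.0, 1.0):
    value = gaussian_chi2(observed, background, signal, covariance, mu)
    print(f"mu = {mu}: chi2 = {value:.3f}")
```

Scanning `mu` and comparing the resulting values is one simple way to see how strongly a given signal hypothesis is disfavoured under these assumptions.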
Additional materials
Analysis-specific variables that can be parameterized. It is often difficult for fast simulation tools to fully reproduce the variables used in an analysis, such as ML- or DNN-based taggers for AK8 jets, the lepton subjet fraction (LSF), etc. Fast simulation can at least mimic the behavior of such variables if they are parameterized as a function of pT, eta, etc.
example1 : MADANALYSIS SFS card for CMS detector
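One possible form of such a parameterization is a binned efficiency map in pT and |eta| that a fast simulation can look up per object. The sketch below assumes this form; the bin edges, the efficiency values and the function names are all hypothetical and do not correspond to any real CMS tagger.

```python
from bisect import bisect_right

# Hypothetical binning and efficiencies, for illustration only.
PT_EDGES = [200.0, 400.0, 600.0, 1000.0]   # GeV
ABS_ETA_EDGES = [0.0, 1.2, 2.4]
EFFICIENCY = [  # rows: pT bins, columns: |eta| bins
    [0.30, 0.25],
    [0.45, 0.38],
    [0.55, 0.47],
]


def find_bin(edges, value):
    """Return the index of the bin containing value, or None if out of range."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def tagging_efficiency(pt, eta):
    """Look up the parameterized efficiency for a jet with given pT and eta.

    Objects outside the parameterized range get an efficiency of 0.
    """
    i = find_bin(PT_EDGES, pt)
    j = find_bin(ABS_ETA_EDGES, abs(eta))
    if i is None or j is None:
        return 0.0
    return EFFICIENCY[i][j]


print(tagging_efficiency(450.0, -1.5))
```

A recasting tool would typically accept the object with this probability, for example by comparing the efficiency to a uniform random number.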
Recasting codes
Reinterpreters first need to recast the existing analysis and validate their recasting code against the HEPData material that we provide. Once the recast is validated, reinterpretations in many different BSM models become possible. The recasting workflow can be summarized as follows:
1) Read our paper
2) Write the recasting code following the paper's description
3) Generate signal event samples
4) Run the recasting code over the generated signal samples
5) Compare the cutflow provided on HEPData with the one produced by the recasting code
It is clearly easier for us, the analysts, to write the code and validate the cutflow ourselves. We can consider providing recasting code, as it makes the analysis much more attractive for reinterpretation.
example1 : MADANALYSIS recasting of EXO-20-004 by A. Albert
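The final comparison step of the workflow can be sketched as follows. The function `compare_cutflows`, the 20% tolerance and all efficiencies are assumptions made for illustration; the acceptable level of agreement depends on the analysis and on the recasting tool.

```python
def compare_cutflows(published, recast, tolerance=0.2):
    """Compare recast cumulative efficiencies against published ones.

    published, recast: dicts mapping cut name -> cumulative efficiency.
    tolerance: maximum accepted relative difference (assumed value).
    Returns a list of (cut, published, recast, relative diff., within tol.).
    """
    report = []
    for cut, reference in published.items():
        if cut not in recast:
            raise KeyError(f"cut '{cut}' missing from the recast cutflow")
        value = recast[cut]
        if reference == 0:
            rel_diff = 0.0 if value == 0 else float("inf")
        else:
            rel_diff = (value - reference) / reference
        report.append((cut, reference, value, rel_diff, abs(rel_diff) <= tolerance))
    return report


# Hypothetical efficiencies, for illustration only.
published_example = {"Trigger": 0.82, "Lepton selection": 0.615, "Signal region": 0.246}
recast_example = {"Trigger": 0.80, "Lepton selection": 0.60, "Signal region": 0.31}

report = compare_cutflows(published_example, recast_example)
for cut, reference, value, rel_diff, ok in report:
    status = "ok" if ok else "CHECK"
    print(f"{cut:<20} {reference:6.3f} {value:6.3f} {rel_diff:+7.1%} {status}")
```

A cut flagged by such a check points to the part of the selection where the recast and the published analysis diverge.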