In this tutorial you will learn how to use the common key4hep
tools for fast, parametrized detector simulation with Delphes
. We will use the FCC-hh baseline detector scenario and run on Higgs pair production events. We assume that the actual event generation step is already done, and so we will start from existing Les Houches Event (LHE) files. We will then use Pythia
for the hadronization & particle decays, and Delphes
for the detector simulation, resulting in EDM4HEP
“reco”-level events.
You will get an overview of how the parametrized detector response simulation works, as well as of the EDM4HEP
event data model.
In the second part of the tutorial, we will make some simple plots from the EDM4HEP
we produced.
Fast simulation with k4SimDelphes starting from existing LHE files
Check if you have setup the software stack and the DelphesPythia8_EDM4HEP
executable is available, by running which DelphesPythia8_EDM4HEP
. If this doesn’t return a path like /cvmfs/sw.hsf.org/key4hep/<somewhere>/DelphesPythia8_EDM4HEP
please follow the instructions here (again).
For the usecase of running Delphes fast simulation to produce EDM4HEP
files, we only need this executable. But you can check which other Delphes utilities are available in your installation by just using the autocomplete functionality, i.e. typing Delphes
We can check how to actually run the exectuable in question with the help option, for example:
DelphesPythia8_EDM4HEP -h
returns
Usage: DelphesPythia8config_file output_config_file pythia_card output_file
config_file - configuration file in Tcl format,
output_config_file - configuration file steering the content of the edm4hep output in Tcl format,
pythia_card - Pythia8 configuration file,
output_file - output file in ROOT format.
telling us which input arguments it expects. Let’s go through them:
-
The
config_file
: This is our Delphes card, which contains the parametrisations of the resolutions and efficiencies for a specific detector concept. We will use the baseline FCC-hh Delphes card. It comes pre-installed with thekey4hep
software stack, and you can find the main card under:$DELPHES_DIR/cards/FCC/scenarios/FCChh_II.tcl
There are many other cards, for different (future) colliders and detectors in$DELPHES_DIR/cards/
, which you can also view in your browser on on github. -
The
output_config_file
file: This file defines which collections we have in our output EDM4HEP output file and what their names are. We will use the standard version which comes with the software stack installation as$K4SIMDELPHES/edm4hep_output_config.tcl
. -
The
pythia_card
: As the name says, this is the configuration card forPythia
. Here we use the one provided asPythiaCard/tester_pwp8_pp_hh_5f_hhbbyy.cmd
. It tellsPythia
to run over the di-Higgs LHE file provided (LHEInput/lhe_tester_ggHH.lhe
). -
The
output_file
: This is simply the name of the output file that will be produced, you can pick it freely but remember to explicitly include the.root
file format ending.
Task: Run the event simulation using the cards and inputs as described above.
Solution
`DelphesPythia8_EDM4HEP $DELPHES_DIR/cards/FCC/scenarios/FCChh_II.tcl $K4SIMDELPHES/edm4hep_output_config.tcl ./PythiaCard/tester_pwp8_pp_hh_5f_hhbbyy.cmd tutorial_output.root`You should see Pythia
starting up and summarizing its settings, for example
(...)
*------- PYTHIA Process Initialization --------------------------*
| |
| We collide p+ with p+ at a CM energy of 1.000e+05 GeV |
| |
|------------------------------------------------------------------|
(...)
and a list of modules that it initialised, such as importantly Delphes
and all its submodules required by the input card:
(...)
** INFO: initializing module Delphes
** INFO: initializing module ParticlePropagator
** INFO: initializing module ChargedHadronTrackingEfficiency
** INFO: initializing module ElectronTrackingEfficiency
** INFO: initializing module MuonTrackingEfficiency
** INFO: initializing module ChargedHadronMomentumSmearing
** INFO: initializing module ElectronMomentumSmearing
** INFO: initializing module MuonMomentumSmearing
(...)
Then it will process 10 events, which should be quick. Let’s first take a step back to understand what we processed here exactly.
Take a look at the Pythia
card, and find the answers to the following questions:
- Can you find the line where we limit the run to 10 events only?
- In block 4 of the
Pythia
card we use theResonanceDecayUserHook
plug-in to filter decays for specific final state particles. Which is our desired final state, i.e. which Higgs decays do we allow? - Where is the LHE file that we load in
Pythia
? Open it and take a look here also.- With which generator were our input events generated?
- How many events are in the LHE file?
- From which collision, at which energy, and in which production mode are the Higgs boson pairs produced?
Answers
- l2 of the Pythia card - We are filtering for bbyy decays, so one Higgs boson (PDG ID 25) decays to a pair of b-quarks (PDG ID 5) and the other to two photons (PDG ID 22). - The LHE file is found in LHEInput\lhe_tester_ggHH.lhe. - POWHEG-BOX-V2, see l3 of the LHE file. - It contains 10 000 events, see l27 of the LHE file. - The production mode is gluon-gluon fusion, in a proton proton collison at 100 TeV, see l5, l28f and l30f.Next, lets look at the Delphes
card to see how the fast simulation works. For the FCC-hh scenario, we are modelling a detector layout of an inner tracker, followed by electromagnetic and hadronic calorimenters, as well as a separate muon system, which provides better precision and efficiency on muon measurements than the tracker. Parametrizations in bins of the pseudorapidity η and the transverse momentum pT are used to model the response across the different regions of the detector. Roughly, the fast simulation proceeds in the following main steps:
- We start from all stable particles, as in particles that are written out by
Pythia
as outgoing particles, that do not further decay, at generator level. These are the input to theParticlePropagator
module, which propagates them through the magnetic field of the inner trackers. Neutral particles are propagated in a straight line, while charged particles are deflected on a heliocoidal trajectory - in each case the trajectory is modelled upto the point where the particle enters the calorimeter. Here, the magnetic field strength and coverage of the field (= radius of the inner tracker) are user-defined properties, that depend on the detector scenario we want to study. - Next, tracks are created for all charged particles, by assuming a given probability for the track to be reconstructed - i.e. tracking efficiency -, as well as a certain momentum resolution of the track. The parametrizations of the track efficiency and resolution depend on the particle type, e.g. you can find the muon tracking efficiency in the module
MuonTrackingEfficiency
and the corresponding momentum resolution function in the separate filemuonMomentumResolution_II.tcl
. Especially the latter can look quite complicated, so here is an example of how the parametrizations for muons look if you plot them. - For simulating the calorimeter response, a segmentation in η and φ is given in the
Calorimeter
module, and it is assumed that each particle deposits its energy into one of such segments (then called a tower). This module also specifies which fraction of a particle’s energy is deposited in the electromagnetic and hadronic calorimeter. The energy deposits in the electromagnetic and hadronic calorimeters are then smeared independently, following parametrizations in energy and pseudo-rapidity, as defined in the card. - Particle-flow objects are then built from the tracks and calorimeter towers, forming our physics objects.
- Jets are then reconstructed from these particle flow objects using the standard algorithms, here we will use the
FastJetFinder04
module which uses the anti-kT algorithm with R=0.4. - Flavour tagging efficiencies are applied to the reconstructed jets, again as a function of η and pT, for several working points, such as the medium b-tagging working point in module
BTaggingM
. - Identification efficiency parametrizations are defined for objects of interest such as photons, muons and electrons, for example in the
PhotonEfficiency
module. These efficiencies are applied on top of the tracking efficiencies. Furthermore, for these objects an isolation variable (namedPTRatioMax
in the card) is defined in e.g.ElectronIsolation
. - Last, an overlap removal between the reconstructed objects is performed in the
UniqueObjectFinder
. As you can see here we are using isolated photons, electrons and muons and the jets reconstruced byFastJetFinder04
. - The last block of the card, the
TreeWriter
module, shows you which objects are propagated to Delphes output level. Note that this doesn’t mean we will have them in our.root
file that we created above, as there is still one step in our simulation chain, which is the conversion fromDelphes
output toEDM4HEP
events.
Questions:
- In which η range are we reconstructing muon tracks?
- Which fraction of their energy do electrons deposit in the ECal? What about Kaons?
- What is the minimum pT of a reconstructed jet?
-
What is the flavour tagging efficiency for a b-jet within η < 4.0 and a pT of 250 GeV? - What is the maximum photon identification efficiency?
Answers
- Muons tracks are reconstructed up to |eta| < 6, l255 of the Delphes card. - Electrons deposit 100% of their energy in the ECal, for Kaons it's 30%, l379 and l393f. - Jets with pT > 25 GeV are reconstructed, l709. - The b-tagging efficiency in this bin is 83%, l1225. - The maximum photon identification efficiency is in the bin |eta| < 4.0, pT > 10 GeV and it's around 90% (0.95^2), l1009 and l1013.Finally, let’s take a look at the edm4hep_output_config.tcl
file. This tells us which Delphes
objects we want to store in our EDM4HEP
output and under what name. Can you match the objects here to those in the Delphes card?
Task: Add a new EDM4HEP
output collection to store the photons before the isolation requirement and the overlap removal and rerun the DelphesPythia8
step.
Optional task: If you want to create your own file with higher statistics to use for the next part of the tutorial, you can edit the Pythia
card to use the full LHE statistics. As this takes some time to run, please submit a job for running this to the batch. The job will take at least 40 min, so this step is fully optional, only if you have time. If you are not familiar with submitting to the batch, there are some pointers here.
To wrap up this part of the tutorial, let’s take a quick look at the .root
output file we produced. You can open and inspect as usual. The tree in it is called events
. It follows the EDM4HEP
data model. This is a common event data model for future collider experiments, or in other words, a common format convention on what information of each event is stored in the ROOT
file and how exactly. In detail the format looks like this:
If you are doing analysis, you will mostly be interested in the ReconstructedParticle
collection, which contains all objects reconstructed from the detector responses such as tracks and calorimeter clusters, with the ParticleID
collection telling us the (likely) type of particle. You can take a look at some of the raw distributions here already, to convince yourself that you actually wrote out some events, but in the next part we will use podio
to plot in a more convenient way.
Note: You have produced an EDM4HEP
file in the previous tutorial on event generation already. How is it different from the new one we produced here?
Simple plots with podio
In this part of the tutorial, we will use the podio
toolkit from the key4hep
stack to make some simple plots from the EDM4HEP
file. In a nutshell, podio
generates the code (i.e. functions to access the values) from reading the event data model definition. If you are interested in the technical details, you can read more about it here.
The event data model of EDM4HEP
is given in this .yaml
file. If you look at the block starting at l455, you can see what information we have available on our ReconstructedParticle
s.
With podio
, we can now read the file, access the tree and do a regular event loop. A template of the code needed for this is given in plotting/plot_edm4hep_template.py
.
You can use your own file from the previous step, if you produced it, or you can copy it from /user/bstapf/Topical_Lectures_March2024/future_particle_colliders_tutorial/Tutorial2_DELPHES/tutorial_output_noIso.root
or download it from here.
Task 1 Plot the number of photons and the invariant mass of the first photon pair in the event, myy. You can use utils.p4(<ReconstructedParticle>)
to retrieve the four momentum of a ReconstructedParticle
.
Task 2 Add also the distribution of number of photons before the isolation requirement, using the extra container you added in the previous part of the tutorial. How does it compare to the number of isolated photons?
Additionally, you can also plot the IsolationVar
itself, that a cut is made on during the Delphes
fast simulation. Look into the EDM4HEP
for the name of this collection, and how you can read the value of this for one photon in the event. Plot it for the isolated, as well as the non-isolated photons. What value did we cut on to define an isolated photon, can you find this back in the Delphes
card? Hint: Plotting with logarithmic y-axis is helpful here.