In this tutorial you will learn how to use the common key4hep tools for fast, parametrized detector simulation with Delphes. We will use the FCC-hh baseline detector scenario and run on Higgs pair production events. We assume that the actual event generation step is already done, and so we will start from existing Les Houches Event (LHE) files. We will then use Pythia for the hadronization & particle decays, and Delphes for the detector simulation, resulting in EDM4HEP “reco”-level events.

You will get an overview of how the parametrized detector response simulation works, as well as of the EDM4HEP event data model.

In the second part of the tutorial, we will make some simple plots from the EDM4HEP we produced.

Fast simulation with k4SimDelphes starting from existing LHE files

Check if you have setup the software stack and the DelphesPythia8_EDM4HEP executable is available, by running which DelphesPythia8_EDM4HEP. If this doesn’t return a path like /cvmfs/sw.hsf.org/key4hep/<somewhere>/DelphesPythia8_EDM4HEP please follow the instructions here (again).

For the usecase of running Delphes fast simulation to produce EDM4HEP files, we only need this executable. But you can check which other Delphes utilities are available in your installation by just using the autocomplete functionality, i.e. typing Delphes - this will show you the whole list. They all interface different modules and run with different input/output formats - the names should give you a clue which ones exactly. As mentioned already `DelphesPythia8_EDM4HEP` interfaces `Delphes` and `Pythia8` and outputs `EDM4HEP` files, rather than using the `Delphes` output file format. Another example is `DelphesSTDHEP_EDM4HEP` which uses `STDHEP` as input file format. Here you can really see the strength of having a common software ecosystem like `key4hep` - the same utility is offered for many specific usecases.

We can check how to actually run the exectuable in question with the help option, for example:

DelphesPythia8_EDM4HEP -h

returns

Usage: DelphesPythia8config_file output_config_file pythia_card output_file
config_file - configuration file in Tcl format,
output_config_file - configuration file steering the content of the edm4hep output in Tcl format,
pythia_card - Pythia8 configuration file,
output_file - output file in ROOT format.

telling us which input arguments it expects. Let’s go through them:

The config_file: This is our Delphes card, which contains the parametrisations of the resolutions and efficiencies for a specific detector concept. We will use the baseline FCC-hh Delphes card. It comes pre-installed with the key4hep software stack, and you can find the main card under: $DELPHES_DIR/cards/FCC/scenarios/FCChh_II.tcl There are many other cards, for different (future) colliders and detectors in $DELPHES_DIR/cards/, which you can also view in your browser on on github.
The output_config_file file: This file defines which collections we have in our output EDM4HEP output file and what their names are. We will use the standard version which comes with the software stack installation as $K4SIMDELPHES/edm4hep_output_config.tcl.
The pythia_card: As the name says, this is the configuration card for Pythia. Here we use the one provided as PythiaCard/tester_pwp8_pp_hh_5f_hhbbyy.cmd. It tells Pythia to run over the di-Higgs LHE file provided (LHEInput/lhe_tester_ggHH.lhe).
The output_file: This is simply the name of the output file that will be produced, you can pick it freely but remember to explicitly include the .root file format ending.

Task: Run the event simulation using the cards and inputs as described above.

Solution

`DelphesPythia8_EDM4HEP $DELPHES_DIR/cards/FCC/scenarios/FCChh_II.tcl $K4SIMDELPHES/edm4hep_output_config.tcl ./PythiaCard/tester_pwp8_pp_hh_5f_hhbbyy.cmd tutorial_output.root`

You should see Pythia starting up and summarizing its settings, for example

 (...)

*-------  PYTHIA Process Initialization  --------------------------*
 |                                                                  |
 | We collide p+ with p+ at a CM energy of 1.000e+05 GeV            |
 |                                                                  |
 |------------------------------------------------------------------|

 (...)

and a list of modules that it initialised, such as importantly Delphes and all its submodules required by the input card:

 (...)

** INFO: initializing module  Delphes                  
** INFO: initializing module  ParticlePropagator       
** INFO: initializing module  ChargedHadronTrackingEfficiency
** INFO: initializing module  ElectronTrackingEfficiency
** INFO: initializing module  MuonTrackingEfficiency   
** INFO: initializing module  ChargedHadronMomentumSmearing
** INFO: initializing module  ElectronMomentumSmearing 
** INFO: initializing module  MuonMomentumSmearing

 (...)     

Then it will process 10 events, which should be quick. Let’s first take a step back to understand what we processed here exactly.

Take a look at the Pythia card, and find the answers to the following questions:

Can you find the line where we limit the run to 10 events only?
In block 4 of the Pythia card we use the ResonanceDecayUserHook plug-in to filter decays for specific final state particles. Which is our desired final state, i.e. which Higgs decays do we allow?
Where is the LHE file that we load in Pythia? Open it and take a look here also.
- With which generator were our input events generated?
- How many events are in the LHE file?
- From which collision, at which energy, and in which production mode are the Higgs boson pairs produced?

Answers

- l2 of the Pythia card - We are filtering for bbyy decays, so one Higgs boson (PDG ID 25) decays to a pair of b-quarks (PDG ID 5) and the other to two photons (PDG ID 22). - The LHE file is found in LHEInput\lhe_tester_ggHH.lhe. - POWHEG-BOX-V2, see l3 of the LHE file. - It contains 10 000 events, see l27 of the LHE file. - The production mode is gluon-gluon fusion, in a proton proton collison at 100 TeV, see l5, l28f and l30f.

Next, lets look at the Delphes card to see how the fast simulation works. For the FCC-hh scenario, we are modelling a detector layout of an inner tracker, followed by electromagnetic and hadronic calorimenters, as well as a separate muon system, which provides better precision and efficiency on muon measurements than the tracker. Parametrizations in bins of the pseudorapidity η and the transverse momentum pT are used to model the response across the different regions of the detector. Roughly, the fast simulation proceeds in the following main steps:

We start from all stable particles, as in particles that are written out by Pythia as outgoing particles, that do not further decay, at generator level. These are the input to the ParticlePropagator module, which propagates them through the magnetic field of the inner trackers. Neutral particles are propagated in a straight line, while charged particles are deflected on a heliocoidal trajectory - in each case the trajectory is modelled upto the point where the particle enters the calorimeter. Here, the magnetic field strength and coverage of the field (= radius of the inner tracker) are user-defined properties, that depend on the detector scenario we want to study.
Next, tracks are created for all charged particles, by assuming a given probability for the track to be reconstructed - i.e. tracking efficiency -, as well as a certain momentum resolution of the track. The parametrizations of the track efficiency and resolution depend on the particle type, e.g. you can find the muon tracking efficiency in the module MuonTrackingEfficiency and the corresponding momentum resolution function in the separate file muonMomentumResolution_II.tcl. Especially the latter can look quite complicated, so here is an example of how the parametrizations for muons look if you plot them.
For simulating the calorimeter response, a segmentation in η and φ is given in the Calorimeter module, and it is assumed that each particle deposits its energy into one of such segments (then called a tower). This module also specifies which fraction of a particle’s energy is deposited in the electromagnetic and hadronic calorimeter. The energy deposits in the electromagnetic and hadronic calorimeters are then smeared independently, following parametrizations in energy and pseudo-rapidity, as defined in the card.
Particle-flow objects are then built from the tracks and calorimeter towers, forming our physics objects.
Jets are then reconstructed from these particle flow objects using the standard algorithms, here we will use the FastJetFinder04 module which uses the anti-kT algorithm with R=0.4.
Flavour tagging efficiencies are applied to the reconstructed jets, again as a function of η and pT, for several working points, such as the medium b-tagging working point in module BTaggingM.
Identification efficiency parametrizations are defined for objects of interest such as photons, muons and electrons, for example in the PhotonEfficiency module. These efficiencies are applied on top of the tracking efficiencies. Furthermore, for these objects an isolation variable (named PTRatioMax in the card) is defined in e.g. ElectronIsolation.
Last, an overlap removal between the reconstructed objects is performed in the UniqueObjectFinder. As you can see here we are using isolated photons, electrons and muons and the jets reconstruced by FastJetFinder04.
The last block of the card, the TreeWriter module, shows you which objects are propagated to Delphes output level. Note that this doesn’t mean we will have them in our .root file that we created above, as there is still one step in our simulation chain, which is the conversion from Delphes output to EDM4HEP events.

Questions:

In which η range are we reconstructing muon tracks?
Which fraction of their energy do electrons deposit in the ECal? What about Kaons?
What is the minimum pT of a reconstructed jet?
What is the flavour tagging efficiency for a b-jet within η < 4.0 and a pT of 250 GeV?
What is the maximum photon identification efficiency?

Answers

- Muons tracks are reconstructed up to |eta| < 6, l255 of the Delphes card. - Electrons deposit 100% of their energy in the ECal, for Kaons it's 30%, l379 and l393f. - Jets with pT > 25 GeV are reconstructed, l709. - The b-tagging efficiency in this bin is 83%, l1225. - The maximum photon identification efficiency is in the bin |eta| < 4.0, pT > 10 GeV and it's around 90% (0.95^2), l1009 and l1013.

Finally, let’s take a look at the edm4hep_output_config.tcl file. This tells us which Delphes objects we want to store in our EDM4HEP output and under what name. Can you match the objects here to those in the Delphes card?

Task: Add a new EDM4HEP output collection to store the photons before the isolation requirement and the overlap removal and rerun the DelphesPythia8 step.

Optional task: If you want to create your own file with higher statistics to use for the next part of the tutorial, you can edit the Pythia card to use the full LHE statistics. As this takes some time to run, please submit a job for running this to the batch. The job will take at least 40 min, so this step is fully optional, only if you have time. If you are not familiar with submitting to the batch, there are some pointers here.

To wrap up this part of the tutorial, let’s take a quick look at the .root output file we produced. You can open and inspect as usual. The tree in it is called events. It follows the EDM4HEP data model. This is a common event data model for future collider experiments, or in other words, a common format convention on what information of each event is stored in the ROOT file and how exactly. In detail the format looks like this:

edm4hep data model

If you are doing analysis, you will mostly be interested in the ReconstructedParticle collection, which contains all objects reconstructed from the detector responses such as tracks and calorimeter clusters, with the ParticleID collection telling us the (likely) type of particle. You can take a look at some of the raw distributions here already, to convince yourself that you actually wrote out some events, but in the next part we will use podio to plot in a more convenient way.

Note: You have produced an EDM4HEP file in the previous tutorial on event generation already. How is it different from the new one we produced here?

Simple plots with podio

In this part of the tutorial, we will use the podio toolkit from the key4hep stack to make some simple plots from the EDM4HEP file. In a nutshell, podio generates the code (i.e. functions to access the values) from reading the event data model definition. If you are interested in the technical details, you can read more about it here.

The event data model of EDM4HEP is given in this .yaml file. If you look at the block starting at l455, you can see what information we have available on our ReconstructedParticles.

With podio, we can now read the file, access the tree and do a regular event loop. A template of the code needed for this is given in plotting/plot_edm4hep_template.py.

You can use your own file from the previous step, if you produced it, or you can copy it from /user/bstapf/Topical_Lectures_March2024/future_particle_colliders_tutorial/Tutorial2_DELPHES/tutorial_output_noIso.root or download it from here.

Task 1 Plot the number of photons and the invariant mass of the first photon pair in the event, myy. You can use utils.p4(<ReconstructedParticle>) to retrieve the four momentum of a ReconstructedParticle.

Task 2 Add also the distribution of number of photons before the isolation requirement, using the extra container you added in the previous part of the tutorial. How does it compare to the number of isolated photons?

Additionally, you can also plot the IsolationVar itself, that a cut is made on during the Delphes fast simulation. Look into the EDM4HEP for the name of this collection, and how you can read the value of this for one photon in the event. Plot it for the isolated, as well as the non-isolated photons. What value did we cut on to define an isolated photon, can you find this back in the Delphes card? Hint: Plotting with logarithmic y-axis is helpful here.

FCChhPhysicsPerformance

Fast simulation with k4SimDelphes starting from existing LHE files

Simple plots with podio

Further reading