Trigger and Data Acquisition: an ATLAS case study

  • Slides: 46
Trigger and Data Acquisition: an ATLAS case study
Standard Diagram of ATLAS Trigger + DAQ
Aim is to understand most of this diagram by the end of the lecture!
Trigger and Data Acquisition (DAQ) 1

Outline
• Basic Trigger and DAQ concepts
• The LHC/ATLAS Challenge
• Multiple Level Triggers
• Dataflow and Buffering
• Birmingham’s role: Level-1 Calorimeter Trigger

BASIC TRIGGER AND DAQ CONCEPTS

This is not a Trigger...
Ceci n’est pas un trigger
• With thanks to René Magritte

This is a trigger...
Higgs candidates enter the Gates of Heaven (i.e. they are saved for later analysis)
Boring events are thrown into the pit of hell (i.e. the data is discarded; it often never even leaves the detector)

This is DAQ...
Detector data consists of thousands of small fragments
Need to assemble fragments of the same event and send them to disk (about 1.5 MByte per event)

Not to be confused with...

The Large Hadron Collider
• pp collisions, √s up to 14 TeV *
• Bunch spacing: 25 ns *
• Nominal luminosity: 10^34 cm^-2 s^-1
• Collisions per crossing: ~30 **
  * Eventually
  ** Now!
• The trigger challenge for the ‘General Purpose Detectors’:
  – Roughly 1 GHz known physics
  – Large event sizes: O(MBytes)
  – Typically small rate of ‘new’ physics channels

THE LHC/ATLAS CHALLENGE

Why is it so difficult? 1) Multiple collisions per bunch
• Currently up to 40 separate proton/proton collisions every time two bunches interact

Why is it so difficult? 2) New physics is rare
• Of course if it wasn’t rare, we’d have seen it already
• For example, in 2010-2012
  – A handful of ‘golden’ Higgs candidates (decaying to 4 leptons)
  – Over 10^9 events recorded
  – Over 10^14 bunch crossings
  – Over 10^15 proton collisions
(Figure labels: All Events, ATLAS Physics)

Why is it so difficult? 3) Time between collisions
• Finding new physics in an LHC bunch collision is like finding a needle in a haystack
• Then you get a new haystack 25 ns later

Why is it so difficult? 4) Data size and data rate
• A single event is ~1.5 MByte
• Full data rate is ~5 TBytes/sec
  – Would fill a few disks
  – Can’t afford to even transmit all this from the detector
• Typically record ~50 TBytes per day
  – That’s >1000 HD DVDs’ worth of data
  – Even at this rate, we’re pushing network limitations in a large PC farm
• ‘Bandwidth’ is a vital concept in DAQ
  – Essentially just: “how fast can you get data from one point to another”
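The ~50 TBytes per day figure can be cross-checked with a few lines of arithmetic. This is only a sketch: it takes the ~1.5 MByte event size from this slide and the 400 Hz Event Filter target rate quoted later in the lecture, and it assumes (for illustration) uninterrupted running for 24 hours and HD DVD capacities of 15 GB (single-layer) and 30 GB (dual-layer). All names are made up for the example.

```python
EVENT_SIZE_BYTES = 1.5e6          # ~1.5 MByte per event (from the slides)
RECORD_RATE_HZ = 400              # Event Filter target rate (from the slides)
SECONDS_PER_DAY = 24 * 60 * 60    # assumption: continuous running all day

# Assumed disc capacities, used only for the "HD DVDs" comparison.
HD_DVD_SINGLE_LAYER_BYTES = 15e9
HD_DVD_DUAL_LAYER_BYTES = 30e9


def recorded_bytes_per_day(event_size=EVENT_SIZE_BYTES, rate_hz=RECORD_RATE_HZ):
    """Bytes written to disk in one day at a steady recording rate."""
    return event_size * rate_hz * SECONDS_PER_DAY


daily_bytes = recorded_bytes_per_day()
daily_tb = daily_bytes / 1e12
discs_single_layer = daily_bytes / HD_DVD_SINGLE_LAYER_BYTES
discs_dual_layer = daily_bytes / HD_DVD_DUAL_LAYER_BYTES

print(f"Recorded per day: {daily_tb:.1f} TB")
print(f"Single-layer HD DVDs per day: {discs_single_layer:.0f}")
print(f"Dual-layer HD DVDs per day: {discs_dual_layer:.0f}")
```

Under these assumptions the daily volume comes out close to the ~50 TBytes quoted, and exceeds 1000 discs for either disc type.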

How is it possible?
• Some new physics processes have clear signatures
• Interesting particles have high mass
  – And hence decay to give high energy products
  – High energy electrons, photons and muons are relatively easy
    • High energy taus are less easy to distinguish
  – Also high energy neutrinos can be ‘seen’ by energy imbalance in event (Missing Energy)
• Unfortunately some new physics is easily confused with ‘boring’ background
  – e.g. Higgs decays predominantly to b quarks producing many jets
  – These are not so different from events where no new particle is produced
  – Nevertheless, triggering on high-energy and multiple jets is useful

Some examples

MULTIPLE LEVEL TRIGGERS

Why have 3 (plus) levels?
• Ideally we would study every event carefully before accepting or rejecting
  – One decision = single level trigger
• We don’t because:
  – Not enough bandwidth to get all data out
  – Not enough bandwidth or processing time to build every event
  – Not enough processing time to analyse every event in full detail
• Note: requirements of trigger are intrinsically linked to DAQ capacity
• So we do things in stages
  – Each level of trigger:
    • Requires more information and processing time to make a better decision
    • Compensates for extra bandwidth required by next trigger level by reducing output rate

Differences in Levels
• ‘Low-level’ triggers are faster and less precise
  – Typically custom hardware with a fixed processing time (known as latency)
  – All data must be stored until the Level-1 trigger decision is made
    • But only partial event information can be used
• ‘High-level’ triggers are slower and closer to analysis quality measurements
  – Typically standard high-spec PCs running trigger specific software
  – Can use partial or full event information
    • But only for those events already accepted by lower levels

Level-1 Trigger in ATLAS
• Has to make a decision on every bunch crossing
  – every 25 ns, i.e. works at 40 MHz
• Selects about 1 in 500 events
• Decision time fixed at 2.0 μs
  – About ½ of that is used up just in signal transmission
• Mostly based on calorimeter and muon data
  – At reduced resolution
• Detector data stored in ‘pipeline memories’ for 2.5 μs
  – The ‘triggered’ data is taken out of scrolling buffer at a dead-reckoned fixed time
  – This all happens in the detector itself

Interesting Event Happens
(Diagram labels: Detector Data, Pipeline ‘Buffer’)

1 microsecond after BANG
(Diagram labels: Detector Data, Pipeline ‘Buffer’, Trigger Data, Level-1 Logic)

2 microseconds after BANG
(Diagram labels: Detector Data, Pipeline ‘Buffer’, Trigger Data, Level-1 Logic)
Trigger Decision: YES
If Level-1 says ‘yes’, data are copied to a different buffer
Event identifier attached for building

2.5 microseconds after BANG
(Diagram labels: Detector Data, Pipeline ‘Buffer’, Trigger Data, Level-1 Logic)
Trigger Decision: NO
If Level-1 says ‘no’, data fall off end of memory and are lost forever
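The pipeline sequence above can be mimicked in software. The sketch below is a toy model, not the real front-end electronics: it assumes one stored sample per bunch crossing, uses the 25 ns spacing, 2.0 μs latency and 2.5 μs pipeline length from the slides, and lets the toy "trigger" look at the same value that is stored (the real Level-1 uses separate, reduced-resolution data). The function and variable names are invented for the example.

```python
from collections import deque

BUNCH_SPACING_NS = 25     # from the slides
L1_LATENCY_NS = 2000      # fixed Level-1 decision time, 2.0 us
PIPELINE_NS = 2500        # pipeline storage time, 2.5 us

LATENCY_BX = L1_LATENCY_NS // BUNCH_SPACING_NS   # 80 bunch crossings
DEPTH_BX = PIPELINE_NS // BUNCH_SPACING_NS       # 100 bunch crossings


def run_pipeline(samples, accept):
    """Push one sample per crossing into a fixed-depth pipeline.

    The decision for a crossing arrives exactly LATENCY_BX crossings later,
    so the matching data is always found at the same offset from the newest
    entry. Rejected data is simply overwritten once the pipeline is full.
    """
    pipeline = deque(maxlen=DEPTH_BX)   # oldest entry drops off automatically
    readout = []
    for now in range(len(samples) + LATENCY_BX):
        # After the input ends, keep clocking in empty crossings.
        data = samples[now] if now < len(samples) else 0
        pipeline.append((now, data))
        if now >= LATENCY_BX:
            # Fixed offset: the crossing decided now happened LATENCY_BX ago.
            crossing, payload = pipeline[-(LATENCY_BX + 1)]
            if accept(payload):
                readout.append((crossing, payload))   # copy to the next buffer
    return readout


samples = [0] * 300
samples[10] = 50     # two "interesting" crossings in otherwise empty data
samples[200] = 70
readout = run_pipeline(samples, accept=lambda energy: energy > 40)
print(readout)
```

The key design point is that no search is needed: because the latency is fixed, the accepted data is always at a known position in the pipeline, which is what "dead-reckoned fixed time" refers to.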

The next stage
• After Level-1 decision, data is transmitted off the detector
  – Up to now everything is ‘synchronous’
  – From now on everything works ‘asynchronously’
    • Requires careful formatting and labeling of fragments of data to associate to a particular event, and detector element
    • Performed in hardware known as Readout Drivers (ROD)
• Data now ready for software trigger
  – But still can’t access ALL of the data
  – Solution: just look in Regions of Interest
    • Regions identified by Level-1 which caused it to say ‘YES’
    • Level-2 Trigger accesses full data for part of the detector

Level-2 Trigger: Operation
• Level-2 decision time 10-100 ms
  – Performed in parallel in a ‘farm’ of PCs
  – Variable execution time implies need for buffering
    • We’ll talk more about buffering later
• Level-2 requests data corresponding to RoIs
  – Full data taken from the buffers (ROBs)
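Why a farm, and why buffers? A rough estimate, using Little's law (average number in the system = arrival rate × average time in the system), shows how many events must be held and processed at once. The sketch assumes the 75 kHz Level-1 target rate quoted later in the lecture as the Level-2 input rate, and treats the two ends of the 10-100 ms range as possible mean decision times; the function name is invented for the example.

```python
L1_OUTPUT_RATE_HZ = 75e3   # Level-1 target rate, i.e. Level-2 input (from the slides)


def events_in_flight(input_rate_hz, mean_decision_time_s):
    """Little's law: average number of events being processed concurrently."""
    return input_rate_hz * mean_decision_time_s


fast_case = events_in_flight(L1_OUTPUT_RATE_HZ, 0.010)   # 10 ms per event
slow_case = events_in_flight(L1_OUTPUT_RATE_HZ, 0.100)   # 100 ms per event

print(f"Events in flight at 10 ms:  {fast_case:.0f}")
print(f"Events in flight at 100 ms: {slow_case:.0f}")
```

So somewhere between several hundred and several thousand events are in flight at any moment, each needing its data held in a buffer until the decision arrives.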

Level-2 Trigger: Decision
• Further event reduction by a factor of 10-20
  – If the decision is NO, data deleted from buffers
  – If the decision is YES, data sent to Event Builder
• Events now pieced together (à la IKEA) for the first time

We still can’t take the rate: Level-3 (or Event Filter)
• Another software trigger
• Another processing farm
• Another event reduction (~10-20)
• The main difference is that this one can see the whole event
  – It can also take seconds, not sub-seconds
(Figure label: Eagerly awaiting physicists and PhD students)

DATAFLOW AND BUFFERING

Dataflow
• Dataflow is about how you can most efficiently get data from one place to another
  – For a single link, it’s the bandwidth
  – But for a complex trigger/DAQ system, with multiple components, it’s a lot more complex
• The flipside to triggering is dataflow
  – If you can trigger precisely, you need less dataflow
  – If you have plenty of bandwidth, you don’t need to trigger
• Most real experiments require a delicate balance between the two
  – And a careful analysis of bottleneck location

Bottlenecks and Buffers
• There are two different types of bottleneck in a DAQ system
  – Average data flow above the link bandwidth
  – This can only be solved by increasing link capacity
• Or sporadic excessive data payload
  – Can be fixed by buffering

Buffering lessons from History
• Information/data flow is not so different from particle/object/traffic flow
• Here’s an example inspired by an ATLAS networking guru
• Consider a Roman legion
  – One soldier = one bit of data
  – One legion = one packet of data
  – Roman road = data link

Link bandwidth: one legion ~ten miles/day
(Diagram labels: Scouts, General, Legionaries, Vanguard, Baggage, Rearguard, ~10 Miles)
• Line speed, latency and payload are all interlinked
• 10 miles is only 3.5 hours’ march
• But to transfer the payload at 6 abreast takes 8 hours @ 100% link occupancy
• What happens when the road narrows?
• Trunking deployed: 6 legionaries march abreast if the road is wide enough
(Diagram labels, Link Dimensions: Widest; Road leaving Rome; Via Appia at its widest; 12 Tables standard; Via Appia repaved; 450 BC 12 Tables link dimension standard; Original Via Appia; Narrowest)
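The legion numbers follow the usual "latency plus payload over bandwidth" picture of a link. As a hedged sketch: assume the time for the whole legion to arrive is the 3.5 hour march (latency) plus the time for the column to pass a point (8 hours at 6 abreast), and assume that passing time scales inversely with the number of soldiers abreast. Both assumptions and all names are added here for illustration.

```python
MARCH_HOURS = 3.5              # latency: one soldier covering 10 miles (from the slide)
PAYLOAD_HOURS_AT_6 = 8.0       # time for the column to pass at 6 abreast (from the slide)
REFERENCE_WIDTH = 6            # soldiers abreast on the wide road


def transfer_hours(width_abreast):
    """Time until the last soldier arrives, for a road of the given width.

    Assumes the column's passing time is inversely proportional to width.
    """
    passing_hours = PAYLOAD_HOURS_AT_6 * REFERENCE_WIDTH / width_abreast
    return MARCH_HOURS + passing_hours


for width in (6, 3, 1):
    print(f"{width} abreast: {transfer_hours(width):.1f} hours")
```

With these assumptions the narrow road, not the marching speed, dominates the total time as soon as the width drops.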

Answer: restriction requires buffering at each end
• A buffer is just a place to queue data that is waiting to be dealt with
• Different types of buffer
  – Custom hardware
  – Simple memory
  – Temporary disk
• Implementation depends on size, speed and waiting time
• Average speed determined by slowest link
  – But instantaneous data/packet flow can exceed this for a period of time
(Diagram labels: Output Buffer, Input Buffer)
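The two kinds of bottleneck can be illustrated with a minimal queue model. This is a sketch under simple assumptions: time advances in discrete ticks, data arrives in arbitrary units per tick, and the link sends at most a fixed number of units per tick. The function name and the numbers are illustrative only.

```python
def simulate_buffer(arrivals, link_rate):
    """Feed bursts of data through a buffer drained by a fixed-rate link.

    arrivals:  units of data arriving in each tick
    link_rate: maximum units the link can send per tick
    Returns (sent_per_tick, max_occupancy, final_occupancy).
    """
    occupancy = 0
    max_occupancy = 0
    sent_per_tick = []
    for arriving in arrivals:
        occupancy += arriving                 # new data joins the queue
        sent = min(occupancy, link_rate)      # link takes what it can
        occupancy -= sent
        sent_per_tick.append(sent)
        max_occupancy = max(max_occupancy, occupancy)
    return sent_per_tick, max_occupancy, occupancy


# Sporadic bursts: average arrival (18 units / 8 ticks) is below the link rate,
# so a buffer of modest size absorbs the peaks and then empties.
bursty = simulate_buffer([10, 0, 0, 0, 8, 0, 0, 0], link_rate=3)

# Sustained overload: average arrival is above the link rate,
# so the buffer grows without limit however large it is.
overloaded = simulate_buffer([4] * 8, link_rate=3)

print("bursty:    ", bursty)
print("overloaded:", overloaded)
```

In the bursty case the buffer peaks at a finite occupancy and drains to zero; in the overloaded case the occupancy grows every tick, which is why that bottleneck can only be solved with more link capacity.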

The Romans understood this
At either end of the mountain pass (Great St Bernard Pass): Aosta and Martigny, two very pleasant old Roman towns
(Diagram labels: Trunk Road to Gaul, Trunk Road to Rome)

The modern equivalent
• Mont Blanc tunnel approach road
(Diagram labels: Buffer Zone, To Italy)

‘Event Building’ in Rome
• Event (Legion) Building requires a large buffer
  – Need to wait until all fragments have arrived
• Unless all data from all detectors is the same size, will have to wait for the largest/slowest/last fragment
(Diagram labels: Input Links, Buffer Zone, Output Link)
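Event building can be sketched as collecting labelled fragments until every source has reported for a given event. The class below is a minimal illustration, not the ATLAS Event Builder: the source names, the dictionary layout and the method names are all assumptions made for the example.

```python
class EventBuilder:
    """Collect fragments by event identifier until an event is complete."""

    def __init__(self, sources):
        self.sources = frozenset(sources)   # every source must contribute
        self.pending = {}                   # event_id -> {source: payload}

    def add_fragment(self, event_id, source, payload):
        """Store one fragment; return the full event once the last one arrives."""
        if source not in self.sources:
            raise ValueError(f"unknown source: {source}")
        fragments = self.pending.setdefault(event_id, {})
        fragments[source] = payload
        if len(fragments) == len(self.sources):
            return self.pending.pop(event_id)   # complete: free the buffer space
        return None                             # still waiting for other sources


builder = EventBuilder(sources=["calorimeter", "muon", "tracker"])

# Fragments arrive asynchronously and interleaved between events.
arrivals = [
    (7, "muon", b"m7"),
    (8, "calorimeter", b"c8"),
    (7, "tracker", b"t7"),
    (8, "muon", b"m8"),
    (7, "calorimeter", b"c7"),   # last fragment of event 7
]
built = []
for event_id, source, payload in arrivals:
    event = builder.add_fragment(event_id, source, payload)
    if event is not None:
        built.append((event_id, event))

print(built)
print("still waiting:", sorted(builder.pending))
```

Event 7 is only released when its slowest fragment arrives, while event 8 keeps occupying buffer space, which is the reason the builder needs a large buffer.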

Filling buffers and dead-time
• However, buffers come with their own problems
  – Must choose a sensible finite size
  – What happens when it fills?
• Require a ‘Busy’ protocol to stop new data being formed
  – Introduces dead-time
• Use inhibits and ‘leaky-bucket’ algorithms to limit traffic
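A ‘leaky-bucket’ limiter can be sketched in a few lines. This is a generic version of the algorithm, not the ATLAS implementation: each accepted trigger adds one unit to a bucket, the bucket leaks one unit every fixed number of ticks, and triggers arriving while the bucket is full are vetoed (counted here as dead-time). The capacity, leak interval and trigger times are made-up values.

```python
class LeakyBucket:
    """Veto triggers that would overfill downstream buffers."""

    def __init__(self, capacity, leak_interval):
        self.capacity = capacity            # accepts the bucket can hold
        self.leak_interval = leak_interval  # ticks needed to leak one unit
        self.level = 0
        self.last_leak = 0

    def allow(self, now):
        """Return True if a trigger at time `now` is accepted."""
        if self.level == 0:
            self.last_leak = now            # nothing to leak from an empty bucket
        else:
            leaked = (now - self.last_leak) // self.leak_interval
            self.level = max(0, self.level - leaked)
            if self.level == 0:
                self.last_leak = now
            else:
                self.last_leak += leaked * self.leak_interval
        if self.level < self.capacity:
            self.level += 1
            return True
        return False                        # bucket full: 'Busy', trigger lost


bucket = LeakyBucket(capacity=2, leak_interval=10)
trigger_times = [0, 1, 2, 3, 12, 13, 25, 40]
decisions = [bucket.allow(t) for t in trigger_times]
dead_fraction = decisions.count(False) / len(decisions)

print(decisions)
print(f"fraction of triggers vetoed: {dead_fraction:.3f}")
```

Short bursts pass untouched as long as the bucket has room, while a sustained excess is throttled to the leak rate, so the buffer downstream never overflows.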

Is this now any clearer?

BIRMINGHAM’S ROLE: THE LEVEL-1 CALORIMETER TRIGGER (L1CALO)

Triggering in ATLAS
• Three-stage triggering system
  – Level-1: custom-built hardware, fixed latency, target rate 75 kHz
  – Level-2: mostly software, RoI-based selection, target rate 5000 Hz
  – Event Filter: software, full detector, target rate 400 Hz
• All data buffered at bunch-crossing rate of 40 MHz for 2.5 μs
• Level-1 has three sub-systems:
  – Calorimeter Trigger
  – Muon Trigger
  – Central Trigger (CTP)
(Diagram labels: Calorimeters, Muon Detectors, Calorimeter Trigger, Muon Trigger, e/γ, tau, jet, ET, ΣET, μ, Central Trigger Processor, Level-1 Trigger, Trigger to Front-end Buffers, Regions of Interest (RoI) to Level-2)
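The target rates give the rejection factor each level has to deliver, and they can be compared with the "about 1 in 500" and "factor of 10-20" figures quoted earlier. The sketch below only divides the rates from this slide; the names are invented for the example.

```python
# Rates in Hz, in the order the data flows (from the slides).
STAGES = [
    ("Bunch crossings", 40e6),
    ("Level-1", 75e3),
    ("Level-2", 5000.0),
    ("Event Filter", 400.0),
]


def rejection_factors(stages):
    """Input rate divided by output rate for each trigger level."""
    factors = {}
    for (_, rate_in), (name, rate_out) in zip(stages, stages[1:]):
        factors[name] = rate_in / rate_out
    return factors


factors = rejection_factors(STAGES)
overall = STAGES[0][1] / STAGES[-1][1]

for name, factor in factors.items():
    print(f"{name}: keeps about 1 in {factor:.1f}")
print(f"Overall: about 1 in {overall:.0f}")
```

Level-1 comes out at roughly 1 in 530, Level-2 at 1 in 15 and the Event Filter at 1 in 12.5, consistent with the approximate figures given for each level.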

Trigger Algorithms
(Diagram labels: Jet/Energy-sum Processor, ECAL+HCAL, Cluster Processor)
• e/γ or τ/hadron algorithm
  – Central cluster > threshold
  – Isolation requirements in surrounding rings
  – Local ET maximum
  – 16 thresholds possible
• Jet algorithm
  – Programmable size
  – Energy in (em+had) > threshold
  – 8 size/threshold sets
• 8 Missing-ET, 8 Sum ET plus 8 Missing-ET significance thresholds
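The three e/γ conditions (central cluster above threshold, isolation in the surrounding ring, local ET maximum) can be illustrated with a sliding-window scan over a grid of tower energies. This is a loose software sketch, not the L1Calo firmware: the window geometry (a 2x2 core inside a 4x4 window), the single isolation ring, the tie-breaking rule and all names and numbers are assumptions made for the example.

```python
def core_sum(et, r, c):
    """ET summed over the 2x2 block of towers with top-left corner (r, c)."""
    return et[r][c] + et[r][c + 1] + et[r + 1][c] + et[r + 1][c + 1]


def find_clusters(et, cluster_threshold, isolation_max):
    """Scan all 4x4 windows and return (row, col, core ET) of accepted cores."""
    n_rows, n_cols = len(et), len(et[0])
    found = []
    for r in range(1, n_rows - 2):
        for c in range(1, n_cols - 2):
            core = core_sum(et, r, c)
            if core <= cluster_threshold:
                continue                       # central cluster too small
            window = sum(et[i][j]
                         for i in range(r - 1, r + 3)
                         for j in range(c - 1, c + 3))
            if window - core > isolation_max:
                continue                       # too much ET in the ring
            is_local_max = True
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if (dr, dc) == (0, 0):
                        continue
                    other = core_sum(et, r + dr, c + dc)
                    # Strict comparison on one side so that ties give one hit.
                    if (dr, dc) < (0, 0):
                        beaten = other >= core
                    else:
                        beaten = other > core
                    if beaten:
                        is_local_max = False
            if is_local_max:
                found.append((r, c, core))
    return found


# A 6x6 grid of towers with one isolated deposit spread over four towers.
grid = [[0] * 6 for _ in range(6)]
grid[2][2], grid[2][3], grid[3][2], grid[3][3] = 10, 6, 5, 4
isolated = find_clusters(grid, cluster_threshold=12, isolation_max=2)

# The same deposit with extra activity next to it fails the isolation cut.
grid[1][1] = 8
not_isolated = find_clusters(grid, cluster_threshold=12, isolation_max=2)

print(isolated)
print(not_isolated)
```

The local-maximum test is what stops overlapping windows around the same deposit from each reporting a candidate; in hardware all windows are evaluated in parallel rather than in a loop.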

USA15 Installation

Hardware Implementation
• Multiple ‘layers’ of FPGA processing
• Data reception and fanout
• Algorithmic processing
• Result merging
• Final stages in common CMM

Processor custom Backplane
• Dense, high bandwidth backplane
  – Up to 1,150 pins per slot
  – About 20,000 pins in all
• Common to CP and JEP systems
• Fastest signal speeds:
  – 480 MHz differential (LVDS input)
  – 160 MHz single ended CP system
  – 80 MHz single ended JEP system

Installation: Analogue Cables
496 cables into 8 crates
Four cables just fit front of one 9U module

Installation: Digital Cabling
Up to 1400 individual LVDS signals into one crate
More than 500 Gbit/s data input