A MULTIPLICITY JUMP B-TAGGER
Todd Huffman, Oxford University
J. Phys. G: Nucl. Part. Phys. 43 (2016) 085001; arXiv:1701.06832
PROBLEM: TAG HIGH PT B-JET
[Figure: Aux. Fig. 4a, Pub. Note ATL-PHYS-PUB-2015-022]
I exaggerate; much great work is happening.
Next: why it happens.
B-TAGGERS FIGHT EINSTEIN
• 200 GeV B baryon: γ = 40, γcτ = 18 mm
• 1 TeV B baryon: γ = 200, γcτ = 90 mm
• Radius of 1st layer = 25 mm
Small cone sizes are prevalent: ΔR ≈ 0.04 for a B in a 500+ GeV jet.
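The boost numbers on this slide can be reproduced with a few lines. This is a minimal sketch, not the author's code: the mass of 5 GeV and cτ of 0.45 mm are assumptions inferred from the quoted values (200/40 and 18/40), the function names are made up for illustration, and the survival fraction assumes the flight path is measured radially (a jet at right angles to the beam).

```python
import math

# Assumed values, inferred from the slide's numbers (not stated in the talk).
ASSUMED_B_MASS_GEV = 5.0
ASSUMED_CTAU_MM = 0.45


def boost_and_decay_length(energy_gev, mass_gev=ASSUMED_B_MASS_GEV,
                           ctau_mm=ASSUMED_CTAU_MM):
    """Return (gamma, mean decay length in mm), taking beta ~ 1."""
    gamma = energy_gev / mass_gev
    return gamma, gamma * ctau_mm


def fraction_decaying_beyond(radius_mm, mean_decay_length_mm):
    """Exponential decay law: fraction of particles that fly past radius_mm."""
    return math.exp(-radius_mm / mean_decay_length_mm)


for energy in (200.0, 1000.0):
    gamma, length = boost_and_decay_length(energy)
    beyond = fraction_decaying_beyond(25.0, length)
    print(f"E = {energy:.0f} GeV: gamma = {gamma:.0f}, "
          f"gamma*c*tau = {length:.0f} mm, "
          f"fraction past first layer = {beyond:.2f}")
```

Under these assumptions roughly a quarter of 200 GeV B baryons, and about three quarters of 1 TeV ones, decay beyond the 25 mm first layer.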
A WILD IDEA – MULTIPLICITY JUMP
A particle decays between layers into charged daughters, giving a jump in the number of hits.
[Figure: pixel detector layers]
IDEA – NOT QUITE SO WILD
Had been tried in the 80s and early 90s in fixed-target experiments at hadron machines.
• ~400 GeV proton or pion beams
• Energy sufficient to make bottom and charm baryons
• Place detector “upstream”; leave gap; second detector “downstream”
• Detectors: scintillators or coarse-resolution silicon detectors
• Relied on integrating the ionization signal
• Mica detector relied upon jump in amount of Cherenkov light
• Look for “jump” in signal (one example: details here)
• Did not work very well: tails in signals
• Modern Si pixel detectors have very high granularity. Do better?
B BARYON ENERGY FRACTION (PYTHIA+FASTJET)
However, the fraction of jet energy each B baryon carries does not strongly depend on the energy of the jet.
As jet energy increases, B-taggers become less efficient, and more B baryons will decay inside the detector volume.
[Plot: the energy of B hadrons in our simulation; B baryon energy fraction vs. B-jet energy (GeV)]
ΔHIT FRACTION f_i
We define a quantity we call the “Hit-difference Ratio”, or “Hit-ratio” for short: “f_i”.
• Use cone ΔR < 0.04 around the jet axis from FastJet.
• From the ith layer: f_i = (N_{i+1} - N_i) / N_i, where N_i is the number of hits in layer i
• Can only have positive or zero hits, so:
• f_i is bounded from below by -1 and unbounded from above.
• Have a look at the f_i distribution.
• Note: this sample is 0.5 to 2.5 TeV jets.
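The definition of the hit-ratio can be sketched in a few lines. The function name and the handling of an empty inner layer (returning NaN) are assumptions for illustration; the talk does not say how that case was treated.

```python
def hit_ratios(hits_per_layer):
    """Return f_i = (N_{i+1} - N_i) / N_i for each adjacent pair of layers.

    hits_per_layer: hit counts inside the cone, ordered from the innermost
    layer outwards. Four layers give three ratios (f_1, f_2, f_3).
    """
    ratios = []
    for n_inner, n_outer in zip(hits_per_layer, hits_per_layer[1:]):
        if n_inner == 0:
            # Assumption: the ratio is undefined when the inner layer is empty.
            ratios.append(float("nan"))
        else:
            ratios.append((n_outer - n_inner) / n_inner)
    return ratios


# A decay between layers 2 and 3 that triples the hit count gives f_2 = 2.
print(hit_ratios([2, 2, 6, 6]))  # [0.0, 2.0, 0.0]
```

Losing every hit between two layers gives the lower bound f_i = -1.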
ΔHIT FRACTION = f_i – 2.5 TeV Z’
[Plot: f_i distribution; legend: jets with a B baryon (all gaps containing a decay), uds jets]
Simulation: Z’; u, d, s, c, and b quarks only.
This looked promising; next, use it as a cut variable.
APPLYING THE ΔHIT FRACTION CUT
• Start at f_i = -1.0 (i.e. no cut at all) and start increasing the cut.
• At each cut value, plot (number of events passing cut)/(number of starting events).
• NOTE! Only count B hadron jets where the B decayed inside the layers!
• The “All Layers” plot is the logical OR of the individual layers: if the f_i between any pair of layers passes the cut, the event passes.
• Charm not included.
• Later: cut less effective with 5 TeV Z’. ΔR = 0.04 might be too big?
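The cut scan described above can be sketched as follows. The function names and the toy jets are invented for illustration and are not results from the talk; NaN ratios (empty inner layer) are treated as failing the cut, which is an assumption.

```python
import math


def passes_or_cut(ratios, cut):
    """Logical OR over gaps: pass if any defined f_i is >= cut.

    Equivalent to requiring max(f_1, f_2, f_3) >= cut.
    """
    return any(not math.isnan(f) and f >= cut for f in ratios)


def cut_efficiency(jets, cut):
    """(Number of jets passing cut) / (number of starting jets)."""
    if not jets:
        return 0.0
    return sum(passes_or_cut(ratios, cut) for ratios in jets) / len(jets)


def scan_cuts(jets, cuts):
    """Efficiency at each cut value, starting from -1.0 (no cut at all)."""
    return [(cut, cut_efficiency(jets, cut)) for cut in cuts]


# Toy inputs, purely illustrative: each entry is [f_1, f_2, f_3] for one jet.
toy_b_jets = [[0.0, 2.0, 0.0], [0.5, 0.0, -0.5], [0.0, 0.0, 1.0]]
toy_uds_jets = [[0.0, 0.0, 0.0], [0.0, -0.5, 0.0], [0.5, 0.0, 0.0]]

for cut, eff in scan_cuts(toy_b_jets, [-1.0, 0.5, 1.0, 2.0]):
    print(f"cut {cut:+.1f}: toy b efficiency {eff:.2f}, "
          f"toy uds efficiency {cut_efficiency(toy_uds_jets, cut):.2f}")
```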
f_i EFFICIENCY AND PURITY – LAYERS 1, 2, 3, 4; M_Z’ = 2.5 TeV
[Plots: efficiency and purity vs. cut F (x-axis 0 to 2). One panel: layers 1-2 or 2-3 or 3-4, cut (f_1, f_2, or f_3) ≥ F. Other panel: layers 2-3 or 3-4, cut (f_2 or f_3) ≥ F. Annotations: “f_i ≥ F = 1”, “Choose this one!!”]
Fiducial region: whole volume, so we can compare.
EFFICIENCY VS. JET ENERGY
By itself: a less than impressive tagger.
Propose to use it alongside conventional taggers.
Might aid tagging of high-PT jets.
CURRENT SIMULATION DETAILS
Generator-level simulation: Pythia 8
• pp collider with √s = 13 TeV
• Generate hard QCD; PT > 700 GeV
• Use EvtGen to get B hadron decays correct
Jet simulation: FastJet 3
• Anti-kT algorithm forming jets
• Keep jets with energy > 350 GeV
• Can set jet cone size
• We’ve used R = 0.2
Semi-toy detector simulation: GEANT4
• Silicon layers
• Active at radii 25, 50, 88, and 122 mm
• Small slabs 50 × 400 × 300 μm (φ × z × r)
• Inner layer 50 × 250 × 300 μm, IBL-like
• Passive cylinders 2.5 mm thick to get to X0 = 2.5% per layer
• Volume: cylinder of 1.4 m radius filled with air, 2 T magnetic field
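The generator and jet-selection bullets could be expressed as below. This is a sketch under assumptions: the talk does not show its actual configuration, so the choice of standard Pythia 8 setting strings here is a guess at how the bullets map onto the generator, and `configure` and `keep_jet` are hypothetical helpers. EvtGen, FastJet clustering, and the GEANT4 geometry are not covered.

```python
# Assumed mapping of the slide's bullets onto standard Pythia 8 settings.
PYTHIA_SETTINGS = [
    "Beams:idA = 2212",            # proton
    "Beams:idB = 2212",            # proton
    "Beams:eCM = 13000.",          # sqrt(s) = 13 TeV, in GeV
    "HardQCD:all = on",            # hard QCD processes
    "PhaseSpace:pTHatMin = 700.",  # minimum pT of the hard interaction, GeV
]

MIN_JET_ENERGY_GEV = 350.0  # "Keep jets with energy > 350 GeV"
JET_RADIUS = 0.2            # "We've used R = 0.2"


def configure(generator):
    """Pass each setting to an object exposing Pythia's readString method."""
    for line in PYTHIA_SETTINGS:
        generator.readString(line)


def keep_jet(jet_energy_gev):
    """Jet selection from the slide: keep jets above the energy threshold."""
    return jet_energy_gev > MIN_JET_ENERGY_GEV
```

With the Pythia 8 Python bindings installed, `configure(pythia8.Pythia())` followed by `init()` would be the intended use.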
ADDED VALUE
Clear separation of b and uds jets
• But not what I’d call fantastic
• Flat efficiency at TeV jet energy: the party piece
• No pile-up
Add pile-up
• Look at up to 200 pile-up events
• Most results use 45 events (Poisson)
Try a Neural Net or a Boosted Decision Tree
Pileup
How robust is the tagger against pileup alongside our main event?
Implement soft QCD pileup in the GEANT4 simulation with a Poisson distribution.
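Drawing the number of pileup interactions per event from a Poisson distribution can be sketched with the standard library alone. This uses Knuth's multiplication method as a stand-in; it is not taken from the talk's GEANT4 code, and the function names and seed are illustrative.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) integer with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count = 0
    product = rng.random()
    # Multiply uniform random numbers until the product drops below exp(-lam).
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def pileup_counts(n_events, lam=45.0, seed=1):
    """Number of pileup interactions to overlay on each of n_events events."""
    rng = random.Random(seed)
    return [sample_poisson(lam, rng) for _ in range(n_events)]


counts = pileup_counts(1000)
print(sum(counts) / len(counts))  # close to the mean of 45
```

The same function works for the λ = 200 stress test; for much larger λ a different sampler would be preferable, since this one needs about λ random numbers per draw.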
Pileup – max(f_1, f_2, f_3)
Pileup - Conclusion
Tagger is insensitive to increased pileup (tested up to λ = 200), because f_i changes little. Makes sense: most pileup does not produce hits within ΔR < 0.04 between layers.
NOTE: All plots from now on have pileup with λ = 45.
UPGRADING THE SIMULATION
Choosing a ML technique
Each ML technique has a distinct set of advantages and disadvantages. Compare them using Signal Purity vs. Signal Efficiency (the so-called “ROC” curve).
Training set was an enriched sample of 1 million hard QCD events with min interaction pT at 700 GeV, plus 300k hard QCD events resulting in B-hadron production.
ROC curve was generated by using an independent “testing” set.
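A purity vs. efficiency curve of the kind used for this comparison can be computed from classifier scores on a labelled testing set. A minimal sketch: the function name, the toy scores, and the convention of purity = 1 when nothing is selected are all assumptions for illustration.

```python
def purity_efficiency_curve(scores, labels, thresholds):
    """Return (threshold, efficiency, purity) for each threshold.

    scores: classifier output per jet; labels: 1 for signal (b), 0 for
    background. A jet is selected when its score is >= the threshold.
    """
    n_signal = sum(labels)
    curve = []
    for threshold in thresholds:
        selected = [lab for s, lab in zip(scores, labels) if s >= threshold]
        n_selected_signal = sum(selected)
        efficiency = n_selected_signal / n_signal if n_signal else 0.0
        # Convention (assumed): purity is 1 when nothing is selected.
        purity = n_selected_signal / len(selected) if selected else 1.0
        curve.append((threshold, efficiency, purity))
    return curve


# Toy testing set, purely illustrative.
toy_scores = [0.95, 0.8, 0.6, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0]
for threshold, eff, pur in purity_efficiency_curve(
        toy_scores, toy_labels, [0.0, 0.5, 0.9]):
    print(f"threshold {threshold:.1f}: efficiency {eff:.2f}, purity {pur:.2f}")
```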
Building our ANN
ROC analysis showed that an 8-layer ANN was only as effective as a 2-layer ANN → not really a “deep learning” problem; fairly reducible and small feature set. Plus, fewer layers means less chance of overfitting.
Used TMVA for a “standard HEP” packaged ANN. Activation function was tanh(x).
Input layer:
- Raw hits in each layer
- Multiplicities f_1, f_2, f_3
- max(f_1, f_2, f_3)
- Jet energy, pT and mass
2 hidden layers: 16 and 5 neurons wide. (Smaller second hidden layer helps prevent overfitting.)
Architecture comes from educated guessing. Too many combinations to evaluate.
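The shape of this network can be illustrated with a small forward pass. This is not TMVA and not the trained tagger: the weights are random, the input count of 11 is inferred from the list above (4 raw hit counts, 3 hit-ratios, their maximum, and 3 jet quantities), and the sigmoid on the output node is an assumption made so that the score lies between 0 and 1.

```python
import math
import random

# 11 inputs (assumed count), hidden layers of 16 and 5 neurons, 1 output.
LAYER_SIZES = [11, 16, 5, 1]


def init_network(sizes, seed=0):
    """Random, untrained weights: one (weights, biases) pair per layer."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(network, inputs):
    """tanh on the hidden layers; sigmoid on the output (an assumption)."""
    activations = list(inputs)
    last = len(network) - 1
    for index, (weights, biases) in enumerate(network):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        if index < last:
            activations = [math.tanh(s) for s in sums]
        else:
            activations = [1.0 / (1.0 + math.exp(-s)) for s in sums]
    return activations[0]


network = init_network(LAYER_SIZES)
example_inputs = [0.1] * LAYER_SIZES[0]  # placeholder, already-normalised inputs
print(forward(network, example_inputs))
```

In practice the inputs would be normalised before being fed to the network, since raw hit counts and jet energies differ by orders of magnitude.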
Training Sample
Training sample had min pT of interaction at 700 GeV: 1 million general hard QCD events (anything goes), plus 300,000 hard QCD events resulting in B-hadron production.
Testing is then done on an independent hard QCD sample.
ANN TEST MC RESULTS
Testing is done on an independent hard QCD sample.
WITH ANN OUTPUT > 0.9
ANN - STRAIGHT CUT COMPARISON
A FEW LOOSE ENDS
Tested if energy-related inputs were being heavily used
• If time: tell your “military tank recognition anecdote”
• Concerned the ANN would think “high energy” means b-tag!!!
• Tried flat training distribution; no change in efficiency plots.
Very idealized detector
• No overlaps (which can mimic a hit multiplicity)
• Uniform distribution of dead material
It’s not that we cannot access ATLAS simulation but…
• Pixel “hit” information is not normally saved
• ATLAS policy on making MC studies public is…unprintable
• ATLAS collaboration can move VERY fast
• If CMS starts to look into it…
• Actually, an ATLAS group led by SMU is starting out.
CONCLUSIONS
Multiplicity Jump technique is more immune to the searchlight effect and to highly boosted B hadrons.
Using multivariate techniques significantly improves rejection of uds and charm jets.
Results are robust to pile-up (small search cone).
If it continues to show promise, it could be added to existing taggers to enhance efficiency at TeV-scale jet energies.
THANK YOU!
BACKUP SLIDES
DEAD MATERIAL REMOVED – NO EARLY SHOWERS
EFF. OF f_i CUT, SINGLE GAPS; M_Z’ = 2.5 TeV
[Plots, one panel per gap: f_1 first gap (layers 1-2), f_2 next gap (layers 2-3), f_3 last gap (layers 3-4). x-axis: hit fraction f_i, 0 to 2. Curves: ε for B baryons, ε for uds jets, and “significance” S ≡ ε_b/√ε_q (scaled).]
Next, try only the “OR” between all gaps.
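The “significance” on these plots follows directly from the two efficiencies. A small sketch: the function names, the handling of zero background efficiency, and the toy scan values are assumptions for illustration, not numbers from the talk.

```python
import math


def significance(eff_b, eff_q):
    """S = eff_b / sqrt(eff_q), as defined on the slide."""
    if eff_q <= 0.0:
        # Assumption: with no background passing, S is infinite if any
        # signal passes and zero otherwise.
        return math.inf if eff_b > 0.0 else 0.0
    return eff_b / math.sqrt(eff_q)


def best_cut(scan):
    """Pick the cut with the largest S from (cut, eff_b, eff_q) tuples."""
    return max(scan, key=lambda point: significance(point[1], point[2]))[0]


# Toy scan, purely illustrative.
toy_scan = [(-1.0, 1.0, 1.0), (0.5, 0.6, 0.16), (1.0, 0.5, 0.04)]
for cut, eff_b, eff_q in toy_scan:
    print(f"cut {cut:+.1f}: S = {significance(eff_b, eff_q):.2f}")
print("best cut:", best_cut(toy_scan))
```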
MAX(f_1, f_2, f_3) > ΔHIT
Completeness: charm jets, 2.5 TeV. Same set of conditions.
Upgrading the simulation Making sure it works in a more realistic scenario
Luminous Region
Events do not always occur at (0, 0, 0), so modify the GEANT4 sim to account for this. ATLAS has σ_z = 45 mm.
New problem: jet vectors no longer originate from (0, 0, 0), so we have to be careful when selecting pixels. Easily resolved by finding the event primary vertex. Once this is done, luminous region vs. origin makes no difference to the quality of the tagger.
[Plot: vertex z displacement]
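Selecting pixels relative to the primary vertex rather than the origin can be sketched as below. This is an illustration under assumptions, not the talk's code: the vertex is taken to lie on the beam axis (x = y = 0), hit directions are straight lines from the vertex (no magnetic bending), and all function names are hypothetical.

```python
import math
import random

SIGMA_Z_MM = 45.0  # luminous-region width quoted on the slide


def sample_vertex_z(rng):
    """Draw a primary-vertex z position (mm) from a Gaussian luminous region."""
    return rng.gauss(0.0, SIGMA_Z_MM)


def eta_phi_from_vertex(hit_xyz, vertex_z):
    """Pseudorapidity and azimuth of a hit as seen from (0, 0, vertex_z)."""
    x, y, z = hit_xyz
    r_t = math.hypot(x, y)
    eta = math.asinh((z - vertex_z) / r_t)
    phi = math.atan2(y, x)
    return eta, phi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def hit_in_cone(hit_xyz, vertex_z, jet_eta, jet_phi, cone=0.04):
    """True if the hit lies within the search cone around the jet axis."""
    eta, phi = eta_phi_from_vertex(hit_xyz, vertex_z)
    return delta_r(eta, phi, jet_eta, jet_phi) < cone


# A hit on the first layer (r = 25 mm) at z = 0, seen from a vertex shifted by
# one sigma in z, no longer lines up with a jet at eta = 0.
print(hit_in_cone((25.0, 0.0, 0.0), 0.0, 0.0, 0.0))
print(hit_in_cone((25.0, 0.0, 0.0), SIGMA_Z_MM, 0.0, 0.0))
```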
Pileup - f_i
How does pileup affect our f_i?
Upgrading the Tagger We know it works… Let’s make it work better!
How does it do? Neural Net Wins!!
Extra Slides
ENERGY FRACTION OF B BARYONS: DOES IT DEPEND ON JET ENERGY?
Does x_B get worse at larger jet energy? At all jet energies?
• x_B logarithmic?
• x_B non-linear?
• x_B constant?
• If jet energy increases, more tracks
• Helps taggers
Let’s at least find out what simulations say!
[Plot: energy fraction of B baryons (x_B) as jet energy increases]