Confidence intervals for linear combination of Poisson distributions

Confidence intervals for linear combination of Poisson distributions Francisco Matorras IFCA Instituto de Física de Cantabria (Santander, Spain) Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Introduction My journey through a “simple” statistical problem This work was triggered by (apparently) simple and innocent statistical questions Ø When looking for an answer found that Ø q q Ø (Clearly) not so trivial to answer Widely used procedures not totally correct In this talk I show different possible approaches to solve this problem q q trying to define an “appropriate” CI calculating the frequentist coverage for some typical benchmarks Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Motivation Ø May appear as an academic problem, but several real and common examples encountered: q Negative weights in MC generators (S. Frixione and B. R. Webber, Matching NLO QCD computations and parton shower simulations, " JHEP 0206 (2002) 029 [hepph/0204244]): o o q Expected background estimation from low populated samples o q Ø I get “negative counts” on a bin! How to define the error? Subdominant background, I have a bck I do not expect to contribute, but its estimation is scaled by a large number, ie Nexp=m 1+100 m 2, observing n 2=0, what is the error? What errors should I plot when I subtract a background from a data distribution? All these are non trivial cases when not in the gaussian regime Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Posing the problem Ø Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

What I look for Ø A method that provides a good coverage (frequentist) Close to 68% CL q Avoid large overcoverage q Moderate undercoverage might be acceptable q Ø Prioritize methods with shorter interval Ø Correct asymptotical behaviour Ø If possible a simple and “universal” method Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Benchmarks Defined several benchmark for testing: Ø Subtraction of two Poisson (negative weights and background subtraction) q=-p/4 Ø q Ø Subdominant, one observation with a large mean (m 1~ 100) scaled down by a large factor added to another with m 2 close to zero (q 0) q q q Ø Small number of counts, especially when none observed (0, 0) m 1*0. 001 + m 2 m 1*0. 1 + m 2 Some other checks (not shown) q q q=0 or q =p/4 (m 1+ m 2) that follow Poisson large m Normal Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Poor (wo)man methods Ø Gaussian error propagation q q If you just got your degree you Ø Garwood “error” propagation probably will not give a second q If you know a bit more you’ll try thought and use error to propagate the Garwood propagation intervals (Poisson exact CI) Pros: q Pros: o q Very simple, converges asymptotically since it is a linear combination Cons: o o Significantly undercovers for small and moderate counts For (0, 0) predicts a 0 -length interval! o q Cons: o o o Quark confinement XIII, August 2018 Still simple, cures the problems at low number of counts Not well defined (i. e. if we subtract two numbers), follow the most common approach: symmetrize taking the largest Significantly overcovers, even for moderate counts Subtracting n 1=n 2=0 gives [2. 6, 2. 6], other methods outlined here provide good coverage within ± 1. 5 or ± 2 Francisco Matorras, IFCA, Spain

Bibliographic search Ø Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Bibliographic search q Pros: o q Simplementation, even for N-dimensions, fast convergence to asymptotics, good approximate coverage even for counts ~ 1 Cons: o Still fails if n 1 or n 2=0 Ø Modified expansion: include a patch for n=0, in the calculations make the replacement n max(n, 1) q Pros: o q it works Cons: o no solid mathematical justification Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Likelihood Ø “MINOS” approach (equivalent to profile likelihood + Wilks) Calculate the MLE in 2 D (n. D) q draw the contours at log. MLE-1/2 q “profile” the contours (widest range on the variable of interest when varying the nuisance) q Wilks theorem tells us it should converge asymptotically to a chi square, with ndof=2 (two parameters) Quark confinement XIII, August 2018 Bo rro we d fro m F. J am es q Francisco Matorras, IFCA, Spain

Likelihood Ø q= -p / 4 p/ = q q=0 Quark confinement XIII, August 2018 4 Francisco Matorras, IFCA, Spain

Some examples Ø ln. L contours for some given n Ø Contours transforme d into CL according to Wilks Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Likelihood q Pros: o o o q Familiar and well stablished procedure asymptotic convergence (Wilks) Simple interpretation Cons: o o Not so good coverage for low number of counts Rather severe undercoverage for n=(0, 0) Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Profile/marginalize/project… Ø None of this methods is fully satisfactory for the general case Ø Some alternatives were explored, based on different options to eliminate the second rotated mean Ø Not so trivial without further assumptions See for example the case of m 1’=m 1 -m 2 m 2’=m 1+m 2 q m 1’ and m 2’ not (fully) independent q our variable of interest follows a Skellam distribution, with mean m 1 -m 2 and s 2=m 1+m 2 q The width goes to infinite without additional information, it is a pain rather than a nuisance q Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

A possible approach: 2 D confidence regions Ø Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

2 D confidence intervals Ø Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

2 D confidence regions Ø Project the confidence interval onto the rotated axis q Works for large n, and slightly undercovers for small n o If we want to guarantee >68. 3% coverage we should project the 68. 3% confidence region o q q there is a fortunate compensation with discreteness-induced overcoverage, more for the prob-ordered Largely overcovers and coverage does not converge asymptotically Proposal: instead, project the 39% contour, what one would do asymptotically for 2 d. o. f. Quark confinement XIII, August 2018 4 p/ = q q= -p /4 q=0 Francisco Matorras, IFCA, Spain

Profiling Ø Pure “profile construction” does not seem to always work well for this problem Ø Use instead the idea proposed by K. Cranmer in phystat 2003 ar. Xiv: physics/0511028 v 2 [physics. dataan] 4 Jan 2006 q a full Neyman construction over both the parameters of interest and the nuisance parameters, using the profile likelihood ratio as an ordering rule. Ø Perform again 2 D scan with this new ordering rule q Must use the 68. 3% contour (only 1 ndof, the other is profiled out) Ø A bit more complex, but reduces (although not totally eliminates) the dependence on the “nuisance” Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

2 D contours with profiled ratio ordering Ø some examples for q=-p/4 Ø in this case only one projection makes sense (different 2 D for different q) Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

2 D methods Ø Projection q Pros: o o o q shorter intervals less overcoverage Nice graphical interpretation Cons: o Coverage not guaranteed, some undercoverage present in some cases Quark confinement XIII, August 2018 Ø Profile-ordered q Pros: o q More accurate coverage Cons: o calculation and interpretation a bit more complex Francisco Matorras, IFCA, Spain

Alternative ideas Ø Have also tried to integrate the nuisance in the pdf, but didn’t find any well motivated prior beyond boxbounded flat distributions q no satisfactory results found (not shown) Ø Also explored an hybrid method based on the near- independence of the parameter of interest and the nuisance q Not too good results but also shown for comparison Ø If x and y are independent, any rotation will lead to zero correlation coefficient in the rotated x’, y’ does it mean they are independent? q No, see for example the case of x-y and x+y, x-y probability (through the width) depends on x+y q Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

pseudo-marginalization Ø Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Results: benchmark I Subtracting two Poisson with means <10 Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Results: confidence intervals Ø Intervals for n 1=n 2 as a function of n 1 Ø At moderate n all methods provide similar intervals except “Garwood” and marginalized, that give wider intervals Ø Quite different CI at (near) zero Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Results: confidence intervals Ø CI for fixed n 2 Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Some numeric examples CI for x-y calculated with the different methods for cases with 0 or 1 counts (note these are scaled up by sqrt(2) w. r. t. the previous plots Method 0, 0 1, 0 (reverse for 0, 1) 1, 1 Gauss 0. 00 2. 00 -1. 41 Garwood prop -2. 60 -1. 95 3. 95 -3. 25 Expansion 0. 00 0. 38 2. 63 -1. 89 Mod expansion -1. 89 -0. 89 2. 89 -1. 89 MLE +1/2 -0. 50 0. 19 2. 36 -1. 56 2 D prob-ord -1. 43 -0. 64 2. 86 -1. 89 2 D FC -0. 72 -0. 35 2. 13 -2. 12 2 D profile -1. 29 -0. 35 2. 75 -1. 98 marginalized -2. 34 -1. 51 3. 82 -2. 96 Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Coverage tests m 2 plane and calculate the coverage Ø Sample plot: Ø Scan de m 1, Severe undercoverag e Acceptable undercoverag e Good coverage undercoverag e Severe overcoverage Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Coverage Ø 2 D approaches Fairly good coverage q Some undercoverage for means close to 0 for FC (remark: undercoverage induced by projection) q Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Coverage Ø Garwood severely overcovers Ø Gaussian and likelihood slightly undercover with a region of severe undercoverage Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Coverage Ø Expansion method slightly improves gaussian (but not much) Ø With the patch undercoverage corrected at the expense of inducing overcoverage close to 0 Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Summary for benchmark I Gaussian error propagation is naïve, but not so bad once above 2 or 3 counts Ø Garwood error propagation overcovers, I would not recommend it Ø Expansion + patch looks promising, Ø q simple and easily expandable to n-D, desirable to have a better mathematical justification MINOS approach good for counts >2, but shows some severe undercoverage regions, beware! Ø Different 2 D approaches provide better coverage properties (especially for profile based) if more accurate CI needed, can be provides as a table (note, this is a discrete number of n 1, n 2 and only relevant for a few cases) Ø Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Results: benchmark II Subdominant background 0. 1 * n 1 (~100) + n 2 (~0) 0. 01 * n 1 (~100) + n 2 (~0) Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Confidence intervals Ø q large, dominated by gaussian sqrt(100) contrib Ø alternatively for small q, Poisson contrib is dominant q very different answers for the different methods Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Some numeric examples CI calculated with the different methods for cases with n 1=100 and n 2=0 counts, scaled to m 1+ctan(q)* m 2 for better comparison • Note that q=0. 01 means adding a background where you do not see any event scaled by a factor 100 • Quite different answers in this case • If scaled x 10 more similar q=0. 1 q=0. 01 Gauss 90 110 Garwood prop 79 121 0 285 Expansion 90 111 Mod expansion 88 119 37 264 MLE +1/2 92 110 92 132 2 D prob 97 120 97 267 2 d FC 95 108 95 179 2 d prof 90 110 95 213 marginalized 89 121 92 251 Method Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Coverage test Ø Similarly calculate the coverage as a function of the mean for the subdominant background fixing the other to 100 (1 D scan) Garwood prop. largely overcovers q For q=0. 1 FC undercovers a bit, expansion and marg. overcover. The rest perform similarly q For q =0. 01 no method covers without a significant overcoverage, better for the 2 D and mostly ok for patched expansion. Gauss, expansion and MLE undercover q Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Summary for benchmark II Ø Quite different results depending on the scaling of the subdominant background q If it is large we are close to a pure Poisson; if it is very small close to a pure gaussian Ø For intermediate values, could draw conclusions similar to those on benchmark I Gaussian or similar not valid q Garwood overcovers q 2 D more accurate, although overcovering here q Quark confinement XIII, August 2018 Francisco Matorras, IFCA, Spain

Conclusions As usual, no unique answer exists. If critical for your particular application should try several methods and test coverage. Ø Daring to make general conclusions… Ø q Simple approaches lead to sizable undercoverage or overcoverage o q q If both n 1 and n 2 larger than 2 -3 counts and just want to draw an error bar gaussian error prop. can be enough Some methods are proposed based on the 2 D of independent Poisson, which perform rather well and permit visual interpretations Not surprisingly, profile-based approaches provide more accurate results An expansion-based approach also tested. Looks interesting (especially promising for multiple dimensions), but need a more elegant solution to cope with the zero Quark confinement XIII, August 2018 Francisco Matorras, IFCA, case Spain Ø