
An Answer and a Question
Limits: Combining 2 results
Significance: Does $\Delta\chi^2$ give $\chi^2$?
Roger Barlow, BIRS meeting, July 2006

Revisit s+b
• Calculator (used in BaBar) based on Cousins and Highland: frequentist for $s$, Bayesian integration for $\eta$ and $b$
• See http://www.slac.stanford.edu/~barlow/java/statistics2.html and C.P.C. 149 (2002) 97
• 3 different priors (uniform in $\eta$, $1/\eta$, $\ln\eta$)

Combining Limits?
With 2 measurements $x = 1.1 \pm 0.1$ and $x = 1.2 \pm 0.1$ the combination is obvious.
With 2 measurements $x < 1.1$ @ 90% CL and $x < 1.2$ @ 90% CL all we can say is $x < 1.1$ @ 90% CL.
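
For reference, "obvious" presumably means the usual inverse-variance weighted mean (an assumption; the slide does not spell it out). With equal errors it reduces to the plain average:

\[
\hat{x} = \frac{x_1/\sigma_1^2 + x_2/\sigma_2^2}{1/\sigma_1^2 + 1/\sigma_2^2} = \frac{1.1 + 1.2}{2} = 1.15,
\qquad
\sigma_{\hat{x}} = \left(\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}\right)^{-1/2} = \frac{0.1}{\sqrt{2}} \approx 0.07
\]

No comparable formula exists for two upper limits, which is the point of the second statement.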

Frequentist problem
Given
• $N_1$ events with efficiency $\eta_1$, background $b_1$
• $N_2$ events with efficiency $\eta_2$, background $b_2$
(Could be 2 experiments, or 2 channels in the same experiment.)
For significance we need to calculate, given source strength $s$, the probability of the result $\{N_1, N_2\}$ or less.

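What does "or less" mean?
• Is (3, 4) larger or smaller than (2, 5)?
[Figure: the $(N_1, N_2)$ plane divided into quadrants around the observed point; lower left labelled "Less", upper right "More", the other two "??"]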

Constraint
If $\eta_1 = \eta_2$ and $b_1 = b_2$ then $N_1 + N_2$ is sufficient.
So we cannot just take the lower left quadrant as 'less'.
(And the example given yesterday is trivial.)
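
The sufficiency claim can be checked in one line from the Poisson likelihood, writing $\eta$ and $b$ for the common efficiency and background (notation assumed here, following the efficiency and background symbols of the previous slide):

\[
L(s) = \prod_{i=1}^{2} \frac{e^{-(\eta s + b)}\,(\eta s + b)^{N_i}}{N_i!}
     = \frac{e^{-2(\eta s + b)}\,(\eta s + b)^{N_1 + N_2}}{N_1!\,N_2!}
\]

The dependence on $s$ enters only through $N_1 + N_2$, so (3, 4) and (2, 5) carry the same information about $s$ and should be ranked as equal. A quadrant rule leaves both in the undecided "??" regions.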

Suggestion
• Could estimate $s$ by maximising the log (Poisson) likelihood
  $\sum_i \left[ -(\eta_i s + b_i) + N_i \ln(\eta_i s + b_i) \right]$
  Hence $\sum_i \left[ N_i \eta_i / (\eta_i s + b_i) - \eta_i \right] = 0$
• Order results by the value of $s$ they give from solving this
• Easier than it looks. For a given $\{N_i\}$ this quantity is monotonic decreasing with $s$. Solve once to get $s_{data}$, then explore $s$ space generating many $\{N_i\}$: the sign of $\sum_i \left[ N_i \eta_i / (\eta_i s_{data} + b_i) - \eta_i \right]$ tells you whether this estimated $s$ is greater or less than $s_{data}$
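
The ordering above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names, the bisection solver, the Knuth Poisson sampler and the tie tolerance are all assumptions made here, and efficiencies and backgrounds are treated as exactly known with every background > 0.

```python
import math
import random

def score(counts, effs, bkgs, s):
    """d(ln L)/ds for independent Poisson channels with means eff*s + bkg."""
    return sum(n * e / (e * s + b) - e for n, e, b in zip(counts, effs, bkgs))

def solve_s(counts, effs, bkgs):
    """Solve score(s) = 0 for s >= 0 by bisection (score decreases with s)."""
    if score(counts, effs, bkgs, 0.0) <= 0.0:
        return 0.0  # maximum lies at or below the physical boundary s = 0
    lo, hi = 0.0, 1.0
    while score(counts, effs, bkgs, hi) > 0.0:
        hi *= 2.0  # expand until the root is bracketed
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(counts, effs, bkgs, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def prob_or_less(observed, effs, bkgs, s, n_toys=20000, seed=1):
    """Monte Carlo estimate of P(result 'or less' | s) under the s-hat ordering."""
    rng = random.Random(seed)
    s_data = solve_s(observed, effs, bkgs)
    hits = 0
    for _ in range(n_toys):
        toy = [poisson(e * s + b, rng) for e, b in zip(effs, bkgs)]
        # Score at s_data <= 0 means the toy's estimate of s is <= s_data.
        # The small tolerance counts ties with the observed result as 'or less'.
        if score(toy, effs, bkgs, s_data) <= 1e-9:
            hits += 1
    return hits / n_toys
```

Only one root-finding step is needed, for the data. Each toy costs a single evaluation of the score, which is the "easier than it looks" point. With equal efficiencies and backgrounds, `solve_s` gives the same value for (3, 4) and (2, 5), as the sufficiency constraint requires.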

Message
• This is implemented in the code
  – 'Add experiment' button (up to 10)
• Comments as to whether this is useful are welcome

Significance
Analysis looking for bumps.
Pure background gives $\chi^2_{old}$ of 60 for 37 dof (Prob 1%). Not good but not totally impossible.
Fit to background+bump (4 new parameters) gives a better $\chi^2_{new}$ of 28.
Question: Is this significant? Answer: Yes.
Question: How much? Answer: Significance is $\sqrt{\chi^2_{old} - \chi^2_{new}} = \sqrt{60 - 28} \approx 5.66$
Schematic only!! No reference to any experimental data, real or fictitious.
Puzzle: How does a 3 sigma discrepancy become a 5 sigma discovery?

Justification?
• 'We always do it this way'
• 'Belle does it this way'
• 'CLEO does it this way'

Possible Justification
Likelihood Ratio Test, a.k.a. Maximum Likelihood Ratio Test.
If $M_1$ and $M_2$ are models with max. likelihoods $L_1$ and $L_2$ for the data, then $2 \ln(L_2 / L_1)$ is distributed as a $\chi^2$ with $N_1 - N_2$ degrees of freedom.
Provided that
1. $M_2$ contains $M_1$
2. Ns are large
3. Errors are Gaussian
4. Models are linear
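
As a rough numerical illustration, the sketch below compares the $\sqrt{\Delta\chi^2}$ recipe with what the likelihood ratio test would give if all four conditions held. It assumes $N_1 - N_2 = 4$ (the "4 new parameters" of the schematic bump fit) and a one-sided Gaussian convention for quoting sigmas. Both are assumptions made here, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def chi2_sf_even(x, dof):
    """Upper-tail probability of a chi-squared variable, closed form for even dof."""
    if dof <= 0 or dof % 2:
        raise ValueError("closed form used here needs a positive even dof")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j          # (x/2)^j / j!
        total += term
    return math.exp(-half) * total

def one_sided_sigma(p):
    """Gaussian z whose upper-tail probability equals p."""
    return -NormalDist().inv_cdf(p)

delta_chi2 = 60.0 - 28.0                  # chi2_old - chi2_new from the schematic example
naive_sigma = math.sqrt(delta_chi2)       # the 'we always do it this way' number
p_lrt = chi2_sf_even(delta_chi2, 4)       # tail probability for 4 extra parameters
sigma_lrt = one_sided_sigma(p_lrt)

print(f"sqrt(delta chi2) = {naive_sigma:.2f}")
print(f"chi2(4 dof) tail = {p_lrt:.2e}, one-sided sigma = {sigma_lrt:.2f}")
```

Even when the test applies, $\sqrt{\Delta\chi^2}$ equals the Gaussian significance only for one extra parameter. With four, the same $\Delta\chi^2$ corresponds to a smaller number of sigma.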

Does it matter?
• Investigate with toy MC
• Generate with a uniform distribution in 100 bins, <events/bin> = 100. 100 is large and Poisson is reasonably Gaussian
• Fit with
  – Uniform distribution (99 dof)
  – Linear distribution (98 dof)
  – Cubic (96 dof): $a_0 + a_1 x + a_2 x^2 + a_3 x^3$
  – Flat+Gaussian (96 dof): $a_0 + a_1 \exp(-0.5\,(x - a_2)^2 / a_3^2)$
Cubic is linear; Gaussian is not linear in $a_2$ and $a_3$.
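
A stripped-down version of such a toy study, covering only the uniform and linear fits, might look like the sketch below. It is not the code behind these slides. It assumes a $\chi^2$ with $\sigma_i^2 = n_i$ (observed counts), closed-form weighted least squares and a simple Poisson sampler. The cubic needs a small linear solve and the flat+Gaussian fit a nonlinear minimiser, both left out here.

```python
import math
import random

def poisson(mean, rng):
    """Knuth's multiplication method; slow but fine for mean = 100."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def chi2_uniform(counts):
    """Minimum chi-squared for a flat model a0, with sigma_i^2 = n_i."""
    w = [1.0 / n for n in counts]
    a0 = len(counts) / sum(w)  # sum(w_i * n_i) / sum(w_i), with w_i * n_i = 1
    return sum(wi * (n - a0) ** 2 for wi, n in zip(w, counts))

def chi2_linear(counts):
    """Minimum chi-squared for a0 + a1*x via weighted least squares."""
    nb = len(counts)
    xs = [(i + 0.5) / nb - 0.5 for i in range(nb)]  # bin centres in [-0.5, 0.5]
    w = [1.0 / n for n in counts]
    sw = sum(w)
    sx = sum(wi * x for wi, x in zip(w, xs))
    sy = sum(wi * n for wi, n in zip(w, counts))
    sxx = sum(wi * x * x for wi, x in zip(w, xs))
    sxy = sum(wi * x * n for wi, x, n in zip(w, xs, counts))
    a1 = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    a0 = (sy - a1 * sx) / sw
    return sum(wi * (n - a0 - a1 * x) ** 2 for wi, x, n in zip(w, xs, counts))

def delta_chi2_toys(n_toys=400, n_bins=100, mean=100.0, seed=2006):
    """Delta chi-squared (uniform minus linear) for many uniform toy histograms."""
    rng = random.Random(seed)
    deltas = []
    for _ in range(n_toys):
        counts = [poisson(mean, rng) for _ in range(n_bins)]
        deltas.append(chi2_uniform(counts) - chi2_linear(counts))
    return deltas

deltas = delta_chi2_toys()
print("mean delta chi2:", sum(deltas) / len(deltas))  # a chi2 with 1 dof has mean 1
```

Converting each difference to a $\chi^2$ probability for 1 dof and histogramming it gives the flatness check used to judge the method.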

One 'experiment'
[Figure: one toy 'experiment', with fits labelled Flat+Gauss, Linear and Cubic]

Calculate $\chi^2$ probabilities of differences in models
• Compare linear and uniform models. 1 dof. Probability flat. Method OK.
• Compare cubic and uniform models. 3 dof. Probability flat. Method OK.
• Compare flat+Gaussian and uniform models. 3 dof. Probability very unflat. Method invalid.
Peak at low P corresponds to large $\Delta\chi^2$, i.e. false claims of significant signal.

Not all parameters are equally useful
[Figure: $\chi^2$ for flat+Gauss v. cubic]
Same number of parameters, but flat+Gauss tends to be lower.
If 2 models have the same number of parameters and both contain the true model, one can give better results than the other. This tells us nothing about the data.
Conclude: $\Delta\chi^2$ does not give $\chi^2$?

But surely…
• In the large N limit, $\ln L$ is parabolic in the fitted parameters.
• Model 2 contains Model 1 with $a_2 = 0$ etc.
So expect $\ln L$ to increase by the equivalent of 3 in chi squared.
Question: What is wrong with this argument? Asymptotic? Different probability? Or is it right and the previous analysis is wrong?