2021 10 15 Biostatistics for the biomedical profession

Lecture 5 - Repetition
BIMM 18
Karin Källen & Linda Hartman
October 2015

Today
• Repetition
• Lecture 1: summary measures and graphical methods
• Lecture 2-3: Normal distribution, generalisation, confidence interval, reference interval, t-test, ANOVA
• Paired samples t-test, non-parametric tests (Mann-Whitney test, Kruskal-Wallis test, Wilcoxon signed rank test)
• Lecture 4: Linear regression, R², correlation


Types of data
• Categorical: Binary, Nominal, Ordinal
• Quantitative: Discrete, Continuous
Discrete variables with only a few possible values are often analysed with the same methods as ordinal variables. Discrete variables with many possible values are often analysed with the same methods as continuous variables.

Types of data - exercise
Categorize the following measurements as binary/nominal/ordinal/discrete/continuous:
1. Vitamin D levels (μg/ml): Continuous
2. Delivery mode (Cesarean section, vaginal, forceps, vacuum extractor): Nominal
3. Heart disease (yes/no): Binary
4. Birth weight: Continuous
5. Number of cells in a sample on day 3: Discrete
6. Educational level (elementary school, high school, college education, post-graduate education): Ordinal

Summary measures
• Central tendency measures describe a "center" around which the measurements in the data are distributed.
• Dispersion (or variability) measures describe the "data spread", i.e. how far the measurements are from the center.

Central tendency: mean or median
The choice depends on the distribution of the data:
• Symmetric data
• Asymmetric data
• Ordinal data
[Figure: a symmetric distribution and an asymmetric distribution (positive skew)]

Central tendency measures - Summary
Exercise: Fill in the appropriate central tendency measure
Type of data: Central tendency measure
• Symmetric data: Mean
• Asymmetric data: Median
• Ordinal: Median
• Nominal: -

Measures of dispersion - based on the mean
• Variance and standard deviation (SD)

Exercise
• Compute the standard deviation of your own and your two neighbours' shoe sizes.
• Example: 38, 40, 42
• Mean = (38 + 40 + 42)/3 = 40
• s = √[((38 − 40)² + (40 − 40)² + (42 − 40)²)/(3 − 1)] = √(½ × (4 + 0 + 4)) = √4 = 2
Exercise: What is the corresponding variance?
• V = s × s = 2 × 2 = 4
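The shoe-size calculation can be checked with a few lines of Python. This is only a minimal sketch; the function name `sample_sd` is an illustrative choice, and it uses the n − 1 denominator shown in the exercise.

```python
import math


def sample_sd(values):
    """Sample standard deviation with the n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))


shoe_sizes = [38, 40, 42]   # the example from the exercise
s = sample_sd(shoe_sizes)   # standard deviation
v = s ** 2                  # variance is the squared standard deviation
print(s, v)
```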

Measures of dispersion - not based on the mean
• Min & max
• Range = max − min
• Quartiles (QL at 25% and QU at 75%): limits that split the data set into quarters
• Interquartile range (IQR): IQR = QU − QL
• Percentiles: limits that split the data set in fixed proportions (e.g. 5%, 10%, 20%, 80%, 95%)
Exercise:
1. How large a proportion of the population will be above the 10th percentile? 90%
2. How large a proportion of the population will be between QL and QU? 50%

Robustness
Highly skewed data, Fig 3-10
Measure: WITHOUT largest observation; WITH largest observation
• Mean: 3.9; 6.1
• Median: 4; 4
• Range: 5; 42
• SD: 1.4; 9.3
• QL; QU: 3; 5
Sensitive to extreme observations: mean, range, SD
Robust to extreme observations: median, quartiles
Use median & quartiles for skewed data + graphical presentation!

Summary: Summary measures
Exercise, continued: Fill in the table
Type of data: Central tendency measure; Dispersion measure
• Symmetric data: Mean; Standard deviation
• Asymmetric data: Median; Percentiles (e.g. QL and QU)
• Ordinal: Median; Percentiles (e.g. QL and QU)
• Nominal: -

Graphical presentation
• Bar plot
• Histogram
• Box plot
Box plot, from top to bottom: whisker up to the highest "normal" value (inner fence); upper quartile, QU; median; lower quartile, QL; whisker down to the lowest "normal" value; outliers beyond the whiskers.
Normal value: a value no more than 1.5 × the interquartile range away from the box (below QL or above QU).


The (perfect) normal distribution
• The mean, median, and mode all have the same value
• The curve is symmetric around the mean; the skewness and the excess kurtosis are 0
• The curve approaches the X-axis asymptotically
• Mean ± 1 SD covers 2 × 34.1% = 68.2% of the data
• Mean ± 2 SD (1.96 SD exactly) covers 2 × 47.5% = 95% of the data
• Mean ± 3 SD covers 99.7% of the data
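The coverage figures for the normal distribution can be reproduced with Python's standard library. A minimal sketch, where the helper name `coverage` is illustrative:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def coverage(k):
    """Proportion of a normal distribution that lies within mean ± k SD."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 1.96, 2, 3):
    print(f"mean ± {k} SD covers {coverage(k):.1%} of the data")
```

Note that exactly ± 2 SD covers slightly more than 95%; the 95% figure corresponds to ± 1.96 SD.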

Exercise:
Sample measurements of birth length: Mean: 50.5 cm, SD: 2.3 cm
1. Estimate the values corresponding to z-scores of -2 and +2, respectively, for birth length.
• -2 z-scores: 50.5 − 2 × 2.3 = 45.9 cm
• +2 z-scores: 50.5 + 2 × 2.3 = 55.1 cm
2. Estimate the z-score corresponding to a birth length of 48 cm.
• (48 − 50.5) cm / 2.3 cm = -1.1
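The two conversions in the exercise (value to z-score and z-score to value) can be sketched as follows. The function names are illustrative; the mean and SD are the sample values given for birth length.

```python
MEAN_CM = 50.5  # sample mean of birth length
SD_CM = 2.3     # sample standard deviation of birth length


def z_score(value_cm):
    """Number of standard deviations a value lies from the mean."""
    return (value_cm - MEAN_CM) / SD_CM


def value_at(z):
    """Birth length (cm) corresponding to a given z-score."""
    return MEAN_CM + z * SD_CM


print(round(value_at(-2), 1), round(value_at(2), 1))
print(round(z_score(48), 1))
```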

Birth length example, continued
• About 95% of all Swedish babies will have a birth length between 45.9 and 55.1 cm.
• A birth length of 48 cm corresponds to a z-value of -1.1.
Exercise: How large a proportion of the population will have a birth length above 48 cm?
• Hint: approximately 70% of all values lie within ± 1.1 SD of the mean
• The remaining 30% is split equally between the two tails, so about 15% lie below 48 cm
• Approx. 85% lie above 48 cm

Standard error of the mean (SEM)
The standard error of the mean is a measure of the spread of the sample mean.
Let SD = the population standard deviation and n = the sample size.
Then SEM = SD/√n

Confidence interval
• A confidence interval tells us within which interval the "true" value of a parameter probably lies.
• E.g., a 95% confidence interval tells us between which limits the "true" value lies (with 95% certainty).
• Repetition: 95% of the data will lie within ± 2 SD of the mean (1.96 exactly).
• A 95% CI can be constructed (assuming a large sample): (mean − 1.96 × SEM to mean + 1.96 × SEM)

Exercise, birth lengths, continued:
Sample measurements of birth length: Mean: 50.5 cm, SD: 2.3 cm
1. Estimate the values corresponding to z-scores of -2 and +2, respectively, for birth length.
• -2 z-scores: 50.5 − 2 × 2.3 = 45.9 cm
• +2 z-scores: 50.5 + 2 × 2.3 = 55.1 cm
2. Estimate the z-score corresponding to a birth length of 48 cm.
• (48 − 50.5) cm / 2.3 cm = -1.1
3. How large a proportion (approximately) of the infants will have a birth length below 48 cm?
• Approx. 15%
4. The mean (50.5 cm) was based on 70 000 births.
A) Construct a 95% CI for the mean birth length in this sample
• 50.5 ± 1.96 × SEM, where SEM = 2.3/√70 000 = 0.0087
• Mean (95% CI) = 50.5 (50.48 – 50.52)
B) Construct a 95% reference interval for birth length
• 50.5 ± 1.96 × SD: (46.0 – 55.0) cm
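The contrast between the confidence interval (based on SEM) and the reference interval (based on SD) can be made concrete in code. A minimal sketch, taking n = 70 000 as used in the SEM calculation of the exercise; the variable names are illustrative.

```python
import math

mean_cm = 50.5   # sample mean
sd_cm = 2.3      # sample standard deviation
n = 70_000       # number of births, as used in the SEM calculation

# Standard error of the mean: spread of the sample mean
sem = sd_cm / math.sqrt(n)

# 95% confidence interval for the mean: mean ± 1.96 * SEM
confidence_interval = (mean_cm - 1.96 * sem, mean_cm + 1.96 * sem)

# 95% reference interval for individual values: mean ± 1.96 * SD
reference_interval = (mean_cm - 1.96 * sd_cm, mean_cm + 1.96 * sd_cm)

print(round(sem, 4), confidence_interval, reference_interval)
```

The confidence interval is very narrow because the sample is large, while the reference interval is unaffected by sample size.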

The 95% CIs are often represented by error bars
Statistical inference: Is there a true difference between the means?

Hypothesis testing
• Set up a null hypothesis (H0): No effect
• Set up an alternative hypothesis (H1): Effect
• To get a significant finding, we want to reject H0 (e.g. "no effect"). If H0 is rejected, the alternative hypothesis (H1) remains.
• The p-value is the probability of getting the result you got, or an even more extreme one, if H0 is true.
• The p-value is a probability between 0% and 100%.
• If the p-value is small enough: reject H0.
• "Small enough" is something you decide before the analysis: the significance level, e.g. 1%, 5% or 10%.
• The p-value can be calculated even if the data are not normally distributed, but in different ways.

Hypothesis testing - Elements of statistical inference
• The p-value is the probability of obtaining a test statistic at least as extreme as the one actually observed, assuming that the null hypothesis (H0) is true.
• A type I error is rejecting H0 when H0 is in fact true. Its probability is often referred to as alpha.
• A type II error is failing to reject H0 when H0 is in fact false. Its probability is often referred to as beta.

Multiple testing - exercise
1. Imagine that your null hypothesis H0 is true, and that you run a statistical test with α = P(type I error) = 0.05.
• What is the probability of rejecting the null hypothesis, i.e. finding the effect significant?
• P(rejecting H0) = α = 0.05
2. If you ran 5 tests on different data sets:
• What is the probability of finding at least one significant result? (Assuming H0 is true = no effect)
• P(not rejecting H0) = 1 − α = 0.95
• P(not rejecting any of the 5 H0) = 0.95^5 = 0.77
• P(rejecting 1 H0 or more) = 1 − P(not rejecting any of the 5 H0) = 0.23
3. If you ran K tests on different data sets:
• What is the probability of finding at least one significant result? (Assuming H0 is true = no effect)
• As above, P(rejecting 1 H0 or more) = 1 − (1 − α)^K
Hint (2, 3): calculate the probability that H0 was not rejected in any of the tests
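The probability of at least one false positive in K independent tests can be tabulated with a short function. A minimal sketch; the name `familywise_error_rate` is illustrative, and independence between the tests is assumed, as in the exercise.

```python
def familywise_error_rate(alpha, k):
    """P(at least one rejection in k independent tests when every H0 is true)."""
    return 1 - (1 - alpha) ** k


for k in (1, 5, 10, 20):
    print(k, round(familywise_error_rate(0.05, k), 2))
```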

Multiple testing - Bonferroni correction
What is the probability of finding at least one significant result in K tests?
P(rejecting 1 H0 or more) = 1 − (1 − α)^K ≈ α × K
K = 5: 1 − (1 − α)^K = 0.23 ≈ 0.05 × 5 = 0.25
Bonferroni: Compare the p-value of each test to α/K (equivalently, compare p × K to α) to control the family-wise alpha level
But:
• Overly conservative -> increased type II error
• Do not use for K > 5
• Use more sophisticated methods: Holm, Hochberg, Tukey's HSD etc.
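A Bonferroni correction amounts to comparing each p-value with α divided by the number of tests. The sketch below illustrates this; the p-values are hypothetical and the function name is illustrative.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each test as significant if its p-value is below alpha / K."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical p-values from K = 5 tests
p_values = [0.003, 0.020, 0.040, 0.300, 0.700]
flags = bonferroni_significant(p_values)
print(flags)
```

With K = 5 the threshold is 0.05/5 = 0.01, so only the first hypothetical test remains significant, although three of the five p-values are below 0.05.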

Statistical significance vs. clinical relevance
Statistical significance: "There is a difference"
• Low p-value
Clinical relevance: "Is the difference of importance?"
• How large is the difference?
Effect estimation (CI) is needed!

Exercise - statistical inference
[Figure: 95% confidence intervals around the effect measure, and p-values for the null hypothesis of "no effect", in 5 investigations (A-E). Reference lines: "No effect" and "Clinically relevant effect".]
Pair the statements with the study results (A-E) in the figure:
1. Treatment effect cannot be detected, but cannot be ruled out
2. A clinically relevant effect is indicated, but is statistically uncertain
3. Treatment effect is statistically significant, uncertain if the effect is of clinical relevance
4. Clinically relevant effect that is statistically significant
5. Treatment effect is statistically significant, but a clinically relevant difference can be ruled out
Answer: 1. E or B; 2. B; 3. C; 4. A; 5. D
From Jonas Björk: Praktisk statistik för medicin och hälsa

Difference between the distribution in the population and the distribution of sample means, by sample size
[Figure: the population distribution, and the distributions of means from 1000 samples for sample sizes N=10, N=50 and N=100]
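The behaviour of sample means for different sample sizes can be explored with a small simulation. This is only a sketch under stated assumptions: the population is taken to be a skewed exponential distribution with mean 1 (the population used on the slide is not known), and the seed and function name are illustrative.

```python
import random
import statistics

random.seed(1)  # fixed seed so that repeated runs give the same numbers


def sample_means(sample_size, n_samples=1000):
    """Means of n_samples samples drawn from a skewed (exponential) population."""
    means = []
    for _ in range(n_samples):
        sample = [random.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means


spread_by_size = {}
for size in (10, 50, 100):
    means = sample_means(size)
    spread_by_size[size] = statistics.stdev(means)
    print(size, round(statistics.mean(means), 2), round(spread_by_size[size], 3))
```

The means stay centred on the population mean, while their spread shrinks roughly as 1/√N, which is the SEM.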

The t-test
Uses the fact that the difference between two means from normal distributions will follow a normal distribution with expected mean = 0 (if H0 is true) and standard deviation SE.
t = (difference between group means)/(standard error of the difference in means) = (x̄1 − x̄0)/SE
SE = s × √(1/n0 + 1/n1); s = √[((n1 − 1) × s1² + (n0 − 1) × s0²)/(n1 + n0 − 2)]
The test statistic (t) can then be compared with Student's t-distribution. If the samples are large, the test statistic approximately follows a normal distribution; for small samples it follows a t(df) distribution, where df = n1 + n0 − 2.
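The t statistic with a pooled standard deviation can be computed step by step. A minimal sketch assuming equal variances; the two small groups are hypothetical data and the function name is illustrative. It returns only t and the degrees of freedom, not a p-value.

```python
import math
import statistics


def pooled_t_statistic(group0, group1):
    """t statistic and degrees of freedom for two independent groups (equal variances assumed)."""
    n0, n1 = len(group0), len(group1)
    var0 = statistics.variance(group0)  # sample variance, n - 1 denominator
    var1 = statistics.variance(group1)
    # Pooled standard deviation
    s = math.sqrt(((n1 - 1) * var1 + (n0 - 1) * var0) / (n1 + n0 - 2))
    # Standard error of the difference in means
    se = s * math.sqrt(1 / n0 + 1 / n1)
    t = (statistics.mean(group1) - statistics.mean(group0)) / se
    return t, n1 + n0 - 2


group0 = [1, 2, 3]  # hypothetical measurements
group1 = [2, 3, 4]  # hypothetical measurements
t, df = pooled_t_statistic(group0, group1)
print(round(t, 3), df)
```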

T-distribution
N − 1 = "degrees of freedom"
Degrees of freedom: t-constant for 95% CI
• 5: 2.57
• 9: 2.26
• 19: 2.09
• 29: 2.05
• 49: 2.01
• 99: 1.98
• ∞ (normal distribution): 1.96

Repetition: T-test
Assumptions
1. The mean is a relevant summary measure
2. Independent observations (e.g. no patient contributes more than one observation)
3. Observations are normally distributed, OR both groups are large
Exercise: Would you perform a t-test if you wanted to compare, e.g., the cotinine levels in children of smokers vs. non-smokers?
• Doubtful... (does not meet the first criterion)
Other solutions?
• Perform a non-parametric test (e.g., Mann-Whitney)

The SPSS output from a non-parametric test (cotinine levels in children of smoking vs. non-smoking mothers)
Exercise: How could you interpret the output? Do you need to do more to show the results?
• Present medians, perhaps percentiles or the interquartile range, or a box plot
• Non-smokers: median 0.13, interquartile range 0.29
• Smokers: median 76.1, interquartile range 86.2

Box plot (cotinine levels in children of smoking vs. non-smoking mothers)
Exercise: Describe the distributions by studying the box plot

Non-parametric tests
Test situation: Non-parametric test
• Independent samples, 2 groups: Mann-Whitney
• Independent samples, ≥ 2 groups: Kruskal-Wallis
• Paired samples, 2 groups: Wilcoxon signed rank test
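To give a feel for how a rank-based test works, the Mann-Whitney U statistic can be computed by comparing every observation in one group with every observation in the other. This is a sketch only: it computes the U statistics but no p-value, the data are hypothetical, and the function name is illustrative.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistics for two independent samples.

    u_a counts the pairs (a, b) with a > b; ties count as 0.5.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical cotinine levels
non_smokers = [0.1, 0.2, 0.3]
smokers = [40.0, 76.0, 120.0]
u_non_smokers, u_smokers = mann_whitney_u(non_smokers, smokers)
print(u_non_smokers, u_smokers)
```

Only the ordering of the values matters, so the result is unaffected by how extreme the largest observations are.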

The SPSS output from a t-test (against better knowledge)
Cotinine levels in children of smokers compared to non-smokers.
• Equal variances could not be assumed. Read from the lower row.
• A significant difference between the means...
• ...but the means (and thus the difference between the means) do not make sense.

Test procedure for t-test
• Test variable: D = Mean in group B − Mean in group A
• H0: D = 0, Mean in group A = Mean in group B
• H1: D ≠ 0, Mean in group A ≠ Mean in group B
• Construct a confidence interval for D (How large is the difference?) and/or calculate the p-value (Is there a difference?)

Hypothesis testing, two-sided, large samples
Is there a difference?
H0: Sample mean (μ1) = Expected mean (μ0)
H1: Sample mean (μ1) ≠ Expected mean (μ0)
• Z-value below -1.96 or above +1.96
• p(μ1 from distribution 0) = 2 × 0.0249 ≈ 0.05
• Unlikely that μ1 belongs to the same distribution
• H0 is rejected
[Figure: normal distribution centred at the expected mean (0), with limits at -1.96 and +1.96 and the observed μ1 marked to the right of 1.96]
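For large samples, the two-sided p-value follows directly from the standard normal distribution. A minimal sketch; the function name is illustrative.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a z-value at least this far from 0, in either direction."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail


for z in (1.0, 1.96, 2.58):
    print(z, round(two_sided_p_value(z), 3))
```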

T-test example: Birth weight, descriptives

T-test example: How large is the difference?
Construct a 95% CI for the mean difference (D) between groups
95% CI: D ± c × SE
• Pooled s (if equal variances can be assumed)
• c = 1.96 for large samples; otherwise c can be obtained from the t-distribution (DF), where DF = n1 + n0 − 2
Example
• nA − 1 + nB − 1 = 247 degrees of freedom
• c ≈ 1.97 (obtained from a statistical table for the t-distribution)
Exercise: Compute a 95% CI for the mean difference
• 154 ± 1.97 × 76
• Mean difference with 95% CI: 154 (4 – 304)
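The interval in the exercise is simply the estimate plus and minus c standard errors. A minimal sketch using the numbers from the example (difference 154, SE 76, c = 1.97); the function name is illustrative.

```python
def confidence_interval(difference, se, c):
    """Interval: difference ± c * SE."""
    margin = c * se
    return difference - margin, difference + margin


lower, upper = confidence_interval(difference=154, se=76, c=1.97)
print(round(lower), round(upper))
```

Because the interval does not include 0, the difference is statistically significant at the 5% level, but the lower limit shows that the true difference could be very small.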

T-test for two independent groups
• Two different test versions, depending on whether equal standard deviations (variances) can be assumed or not
Example of SPSS output
• Levene's test: p-value ("Sig.") testing H0: Variance in A = Variance in B
• Slightly different p-values for the t-tests compared with the previous slide, due to rounding errors

More than two groups: one-way ANOVA (analysis of variance)
Multiple t-tests could result in mass significance! Do ANOVA instead of repeated t-tests.
• ANOVA:
• H0: Mean 1 = Mean 2 = Mean 3
• H1: At least two of the means are different
• In short: in an ANOVA, the total variance is divided into the within-groups and the between-groups variance.
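The split into between-groups and within-groups variance can be written out explicitly. This is a sketch of the F statistic only (no p-value); the three groups are hypothetical data and the function name is illustrative.

```python
import statistics


def one_way_anova_f(groups):
    """F statistic with its degrees of freedom for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    # Between-groups sum of squares: group means around the grand mean
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Within-groups sum of squares: observations around their own group mean
    ss_within = sum(
        (x - statistics.mean(group)) ** 2 for group in groups for x in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]  # hypothetical data
f, df_between, df_within = one_way_anova_f(groups)
print(f, df_between, df_within)
```

A large F means that the group means differ more than would be expected from the variation within the groups.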

Example: ANOVA - to compare the birth weight between 4 parity groups

Example: ANOVA birth weight and parity, continued
• Significance of the test: Are all the means the same? (m1 = m2 = m3 = m4)
• To check for pair-wise differences, post hoc tests can be performed

Example: ANOVA - to compare the birth weight between 4 parity groups

Post hoc tests
Different methods to adjust for multiple comparisons

T-test for paired data
Two ways of comparing means:
1. Calculate the means of the groups, and estimate the difference
2. Estimate the difference for each row. Then calculate the mean of the differences

Preparation   Controls. Day 2   Sal. Day 2    Difference between values
1             915600            357800        557800
2             953300            502200        451100
3             650000            470000        180000
4             700000            560000        140000
5             1050000           736000        314000
6             984000            556000        428000
7             772000            418000        354000
8             920000            600000        320000
9             1080000           680000        400000
10            920000            520000        400000
11            840000            560000        280000
12            533000            620000        -87000
13            510000            704000        -194000
14            722000            696000        26000
Means         824992.8571       570000        254992.9
s             181454.0097       111808        216636.9
s (pooled)    150709.3394
SEM           56962.77603 (from pooled s)     57898.64

Difference between means = mean of the differences
Thus, the estimated mean difference does not depend on whether the data are treated as paired or not, but the estimate of the standard deviation is likely to differ between the methods.
Use analyses for paired data when adequate!
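Both ways of comparing the means can be reproduced from the 14 preparations in the table. A minimal sketch; the variable names are illustrative, and the difference for preparation 10 is computed from the two listed values.

```python
import math
import statistics

controls = [915600, 953300, 650000, 700000, 1050000, 984000, 772000,
            920000, 1080000, 920000, 840000, 533000, 510000, 722000]
sal = [357800, 502200, 470000, 560000, 736000, 556000, 418000,
       600000, 680000, 520000, 560000, 620000, 704000, 696000]
n = len(controls)

# Way 1: treat the groups as independent (pooled standard deviation)
mean_difference = statistics.mean(controls) - statistics.mean(sal)
pooled_s = math.sqrt((statistics.variance(controls) + statistics.variance(sal)) / 2)
sem_unpaired = pooled_s * math.sqrt(1 / n + 1 / n)

# Way 2: treat the data as paired (work on the row-wise differences)
differences = [c - s for c, s in zip(controls, sal)]
mean_of_differences = statistics.mean(differences)
sem_paired = statistics.stdev(differences) / math.sqrt(n)

print(round(mean_difference, 1), round(mean_of_differences, 1))
print(round(sem_unpaired, 2), round(sem_paired, 2))
```

The two mean differences agree, while the two standard errors do not, which is why the choice between a paired and an unpaired analysis affects the test result.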

Exercise: Fill in the names of appropriate tests
Test situation: Parametric test; Non-parametric test
• Independent samples, 2 groups:
• Independent samples, ≥ 2 groups:
• Paired samples, 2 groups:

Comparison of different tests
Test situation: Parametric test; Non-parametric test
• Independent samples, 2 groups: T-test; Mann-Whitney
• Independent samples, ≥ 2 groups: ANOVA; Kruskal-Wallis
• Paired samples, 2 groups: Paired t-test; Wilcoxon signed rank test


Exercise: Give the equation for the line in the figure
[Figure: a straight line plotted with X from 0 to 5 and Y from 0 to 12]
Answer: Y = 10 − 2 × X

Linear regression - The model
Y = a + bX + e
• a = intercept
• b = slope
• e = residual variability in the dependent variable Y, not explained by the model
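The intercept a and slope b are usually estimated by least squares. The sketch below fits a line to points that lie exactly on Y = 10 − 2X (the line from the preceding exercise), so there is no residual variability; the function name is illustrative.

```python
def fit_line(xs, ys):
    """Least-squares estimates of the intercept a and the slope b in Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


xs = [0, 1, 2, 3, 4, 5]
ys = [10 - 2 * x for x in xs]  # points exactly on the line, so e = 0
a, b = fit_line(xs, ys)
print(a, b)
```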

Regression output in SPSS, Brain weight, cont.
• X = Body weight (g)
• Y = Brain weight (g)
• Model: Y = a + bX + e

Interpretation of regression output
Brain weight = 0.336 + 0.01 × body weight + e
Discuss:
• According to this model: How much does the brain weight increase with body weight?
• What is the (meaningless) interpretation of the estimated constant a = 0.336?

Model fit - R²
• 56% of the variance in brain weight is explained by body weight (Variance of Y = SD(Y)²)
• R² = the squared Pearson correlation between X and Y

Correlation - Brain weight cont.
"Pearson correlation": a measure of linear relationship, -1 ≤ r ≤ 1
• r = -1: strongest possible negative relationship
• r = +1: strongest possible positive relationship
• r = 0: no linear relationship
[Figure: scatter plots with r=0.98 and r=0.87]
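The Pearson correlation, and R² as its square in a simple linear regression, can be computed directly from the definition. A minimal sketch with hypothetical data; the function name is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]  # hypothetical X values
ys = [2, 4, 5, 4, 5]  # hypothetical Y values
r = pearson_r(xs, ys)
r_squared = r ** 2    # proportion of the variance in Y explained by X
print(round(r, 3), round(r_squared, 2))
```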

I wish you the very best of luck!
I guess that you will all do well on October 22!