Probability
Powerball ® I am offered two Powerball tickets: – Ticket 1: has numbers – Ticket 2: has numbers Which ticket should I take so that I have the greatest chance of winning the lottery?
Roulette In the casino I wait at the roulette wheel until I see a run of at least five reds in a row. I then bet heavily on a black. I am now more likely to win.
Coin Tossing I am about to toss a coin 20 times. What do you expect to happen? Suppose that the first four tosses have been heads and there are no tails so far. What do you expect will have happened by the end of the 20 tosses?
Coin Tossing • Option A – Still expect to get 10 heads and 10 tails. Since there are already 4 heads, now expect to get 6 heads from the remaining 16 tosses. In the next few tosses, expect to get more tails than heads. • Option B – There are 16 tosses to go. For these 16 tosses I expect 8 heads and 8 tails. Now expect to get 12 heads and 8 tails for the 20 throws.
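The two options can be compared with a short calculation. The sketch below assumes a fair coin with independent tosses (an assumption, since the slide does not say so explicitly); the variable names are illustrative only.

```python
from fractions import Fraction

# Assumption: a fair coin whose tosses are independent of one another.
P_HEADS = Fraction(1, 2)
TOTAL_TOSSES = 20
HEADS_SO_FAR = 4  # the first four tosses were all heads

remaining = TOTAL_TOSSES - HEADS_SO_FAR

# The coin has no memory: each remaining toss still adds 1/2 a head on average.
expected_remaining_heads = remaining * P_HEADS
expected_total_heads = HEADS_SO_FAR + expected_remaining_heads
expected_total_tails = TOTAL_TOSSES - expected_total_heads

print(f"Expected heads in the remaining {remaining} tosses: {expected_remaining_heads}")
print(f"Expected final tally: {expected_total_heads} heads, {expected_total_tails} tails")
```

Under these assumptions the calculation agrees with Option B: the four heads already observed do not make tails "due" in the remaining tosses.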
TV Game Show • In a TV game show, a car will be given away. – 3 keys are put on the table, with only one of them being the right key. The 3 finalists are given a chance to choose one key and the one who chooses the right key will take the car. – If you were one of the finalists, would you prefer to be the 1st, 2nd or last to choose a key?
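One way to explore the question is to enumerate every order in which the keys can be picked up. The sketch below assumes each finalist picks uniformly at random from the keys still on the table, so all orderings are equally likely; the key labels are made up for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Assumption: picks are uniformly random, so every ordering of the
# three keys across the three finalists is equally likely.
keys = ["right", "wrong1", "wrong2"]  # hypothetical labels
orderings = list(permutations(keys))  # orderings[i][p] = key taken by finalist p

win_prob = []
for position in range(3):
    wins = sum(1 for order in orderings if order[position] == "right")
    win_prob.append(Fraction(wins, len(orderings)))

for position, prob in enumerate(win_prob, start=1):
    print(f"Finalist choosing in position {position}: P(right key) = {prob}")
```

Under this assumption the position in the queue makes no difference to the chance of winning.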
Let’s Make a Deal Game Show • You pick one of three doors – two have goats behind them – one has lots of money or a car behind it • The game show host then shows you a goat behind one of the other doors • Then he asks you “Do you want to change doors?” – Should you? Does it matter? • See the following website to play the game http://www.shodor.org/interactivate/activities/SimpleMontyHall/
Game Show Dilemma Suppose you choose door A. Monty Hall will then show you either door B or door C, depending upon what is behind each. No Switch Strategy - here is what happens:
Result   A      B      C
Win      Car    Goat   Goat
Lose     Goat   Car    Goat
Lose     Goat   Goat   Car
P(WIN) = 1/3
Game Show Dilemma Suppose you choose door A, but ultimately switch. Again Monty Hall will show you either door B or door C, depending upon what is behind each. Switch Strategy - here is what happens:
Result   A      B      C
Lose     Car    Goat   Goat
Win      Goat   Car    Goat
Win      Goat   Goat   Car
P(WIN) = 2/3
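The two strategies can be checked by exact enumeration. This is a minimal sketch, assuming the car is equally likely to be behind each door and that the host always opens a goat door other than the contestant's pick (choosing at random when two such doors are available). The function name `win_probability` is illustrative.

```python
from fractions import Fraction

DOORS = ("A", "B", "C")

def win_probability(switch: bool, first_pick: str = "A") -> Fraction:
    """Exact chance of winning the car under the stated assumptions."""
    total = Fraction(0)
    for car in DOORS:
        # Doors the host may open: never the contestant's pick, never the car.
        host_options = [d for d in DOORS if d != first_pick and d != car]
        for opened in host_options:
            # Weight = P(car behind this door) * P(host opens this door).
            weight = Fraction(1, len(DOORS)) * Fraction(1, len(host_options))
            if switch:
                # Switch to the one door that is neither picked nor opened.
                final = next(d for d in DOORS if d not in (first_pick, opened))
            else:
                final = first_pick
            if final == car:
                total += weight
    return total

print("P(win | no switch) =", win_probability(switch=False))
print("P(win | switch)    =", win_probability(switch=True))
```

The enumeration weights each scenario by its probability, so the results are exact fractions and do not depend on a random simulation.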
Matching Birthdays • In a room with 23 people what is the probability that at least two of them will have the same birthday? • Answer: 0.5073 or 50.73% chance! • How about 30? • 0.7063 or 71% chance! • How about 40? • 0.8912 or 89% chance! • How about 50? • 0.9704 or 97% chance!
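These figures can be reproduced with the complement rule: first find the chance that all birthdays differ, then subtract from 1. The sketch below assumes 365 equally likely birthdays and independence between people (no leap years, no twins); the function name is illustrative.

```python
def prob_shared_birthday(n: int, days: int = 365) -> float:
    """P(at least two of n people share a birthday), assuming equally likely days."""
    p_all_different = 1.0
    for k in range(n):
        # Person k+1 must avoid the k birthdays already taken.
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different

for n in (23, 30, 40, 50):
    print(f"{n} people: {prob_shared_birthday(n):.4f}")
```

Working with the complement avoids having to count every way that two or more people could match.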
Probability • What are probabilities and where do they come from? • Simple probability models – coin flips, die rolls, etc. • Sample-based or empirical probabilities • Properties of probabilities (events, unions, intersections, etc.) • Conditional probabilities and the concept of independence • Binomial probability experiments and the binomial random variable • Applications of the binomial distribution • A first look at decision making using binomial probabilities
Different Probability Statements •
Probability A customer takes out a small business loan for $200,000. What is the probability the customer defaults on the loan? What are the possible outcomes? “Default” or “No Default”. What is the probability the customer defaults on the loan at some point? What factors influence this probability? Logistic Regression (STAT 310 & beyond)
What are Probabilities? • A probability is a number between 0 & 1 that quantifies uncertainty. • A probability of 0 identifies impossibility • A probability of 1 identifies certainty
Where do probabilities come from? • Probabilities from “games of chance”: The probability of getting a four when a fair die is rolled is P(Rolling a 4) = 1/6 (0.1667 or 16.7% chance). The probability of winning craps on a “Pass Line” bet is 0.493. The probability of winning the jackpot in a Powerball lottery is 0.000000006844 (1 in 146,107,962).
Where do probabilities come from? • Subjective Probabilities – The probability that there will be a vaccine for COVID-19 developed within the next year is 0.60 or 60%. – If rain in the next 24 hours seems very likely, a weather forecaster or a weather app might say there is an 80% chance of rain. – A doctor may state your chance of successful treatment, e.g. a 70% chance of remission.
Empirical or Sample-Based Probabilities •
Example 3.2: Hodgkin’s Disease
                Response to Treatment
Type            None   Partial   Positive   Row Totals
LD                44        10         18           72
LP                12        18         74          104
MC                58        54        154          266
NS                12        16         68           96
Column Totals    126        98        314      n = 538
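Empirical probabilities come straight from a table like this one: divide a count by the total sample size. The slides' own worked calculations are not recoverable here, so the sketch below simply illustrates the idea with the table's counts; the dictionary layout and variable names are illustrative choices.

```python
# Counts from the Hodgkin's disease table: type -> response -> count.
counts = {
    "LD": {"None": 44, "Partial": 10, "Positive": 18},
    "LP": {"None": 12, "Partial": 18, "Positive": 74},
    "MC": {"None": 58, "Partial": 54, "Positive": 154},
    "NS": {"None": 12, "Partial": 16, "Positive": 68},
}
responses = ["None", "Partial", "Positive"]

n = sum(sum(row.values()) for row in counts.values())

# Marginal probabilities: row total / n and column total / n.
p_type = {t: sum(row.values()) / n for t, row in counts.items()}
p_response = {r: sum(counts[t][r] for t in counts) / n for r in responses}

# A joint probability: one cell / n.
p_ld_and_positive = counts["LD"]["Positive"] / n

print("n =", n)
for t, p in p_type.items():
    print(f"P(Type = {t}) = {p:.4f}")
for r, p in p_response.items():
    print(f"P(Response = {r}) = {p:.4f}")
print(f"P(LD and Positive) = {p_ld_and_positive:.4f}")
```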
Conditional Probability • We wish to find the probability of an event occurring given information about the occurrence of another event. For example, what is the probability of developing lung cancer given that we know the person smoked a pack of cigarettes a day for the past 30 years? • Key words that indicate conditional probability are: “given that”, “of those”, “if …”, “assuming that”, “amongst those …”
Independence •
Ex 2: Minnesota Lottery - Daily 3 ® Here are the winning numbers from Monday, September 21, 2020. What is the probability you win $1000 by matching all three numbers?
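When draws are independent, the probability of matching all of them is the product of the individual probabilities. The sketch below assumes the game draws three digits, each equally likely to be 0 through 9 and independent of the others, and that the prize requires matching all three in order; these are assumptions about the game's format, not details given on the slide.

```python
from fractions import Fraction

# Assumed format: three digits, each drawn independently from 0-9.
DIGITS_PER_DRAW = 3
CHOICES_PER_DIGIT = 10

p_match_one_digit = Fraction(1, CHOICES_PER_DIGIT)

# Multiplication rule for independent events.
p_match_all = p_match_one_digit ** DIGITS_PER_DRAW

print("P(match one digit)  =", p_match_one_digit)
print("P(match all digits) =", p_match_all)
```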
Example 3.3: Rolling a single die
Empirical or Sample-based Conditional Probabilities Die rolls, coin flips, playing casino games, etc. are fine, but we are interested in drawing conclusions from data! Most situations when conducting research are not amongst those where we can find exact probabilities! For example, what is the probability an O-ring fails given the temperature is -0.6 degrees C?
Example 3.2: Hodgkin’s Disease (cont’d)
                Response to Treatment
Type            None   Partial   Positive   Row Totals
LD                44        10         18           72
LP                12        18         74          104
MC                58        54        154          266
NS                12        16         68           96
Column Totals    126        98        314      n = 538
Conditional Probs
Example 3. 2: Hodgkin’s Disease (cont’d) We can display these and other conditional probabilities of interest using a mosaic plot as we have seen previously.
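Conditional probabilities from a two-way table use a row (or column) total as the denominator instead of n. As a sketch, the code below computes P(Response | Type) for each disease type from the table's counts; which conditional probabilities the original slides highlighted is not recoverable, so this choice is illustrative, as are the names used.

```python
# Counts from the Hodgkin's disease table: type -> response -> count.
counts = {
    "LD": {"None": 44, "Partial": 10, "Positive": 18},
    "LP": {"None": 12, "Partial": 18, "Positive": 74},
    "MC": {"None": 58, "Partial": 54, "Positive": 154},
    "NS": {"None": 12, "Partial": 16, "Positive": 68},
}

def response_given_type(disease_type: str) -> dict:
    """P(Response = r | Type = disease_type) for every response r."""
    row = counts[disease_type]
    row_total = sum(row.values())  # condition on the type: use the row total
    return {response: count / row_total for response, count in row.items()}

p_positive_given = {t: response_given_type(t)["Positive"] for t in counts}

for t, p in p_positive_given.items():
    print(f"P(Positive | {t}) = {p:.3f}")
```

Each row of conditional probabilities sums to 1, which is what a mosaic plot displays as the stacked shading within one bar.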
Example 3.4: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991)
                   Brain Injury (BI)   No Brain Injury (NBI)   Row Totals
No Helmet (NH)                    97                    1918         2015
Helmet Worn (H)                   17                     977          994
Column Totals                    114                    2895         3009
BI = the event the motorcyclist sustains brain injury; NBI = no brain injury; H = the event the motorcyclist was wearing a helmet; NH = no helmet worn.
What is the probability that a motorcyclist involved in an accident was wearing a helmet?
Example 3.4: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991) What is the probability that the cyclist sustained brain injury given they were wearing a helmet? P(BI|H) = 17 / 994 = 0.0171
Example 3.4: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991) What is the probability that the cyclist not wearing a helmet sustained brain injury? P(BI|NH) = 97 / 2015 = 0.0481
Example 3.4: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991) How many times more likely is a non-helmet wearer to sustain brain injury? 0.0481 / 0.0171 = 2.81 times more likely. This is called the relative risk or risk ratio (denoted RR).
Example 3. 4: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991) The shading for Brain Injury for the No Helmet group is roughly three times higher than the shading for Brain Injury for the Helmet Worn group. Motorcyclists not wearing a helmet are at three times the risk of suffering brain injury.
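The helmet calculations can be reproduced directly from the four cell counts. This is a minimal sketch using the counts from the table in this example; the dictionary layout and the helper name `risk` are illustrative.

```python
# Cell counts: helmet status -> injury status -> count.
table = {
    "NH": {"BI": 97, "NBI": 1918},  # no helmet
    "H": {"BI": 17, "NBI": 977},    # helmet worn
}

def risk(group: str) -> float:
    """P(BI | group): brain injuries divided by the group's row total."""
    row = table[group]
    return row["BI"] / (row["BI"] + row["NBI"])

n = sum(sum(row.values()) for row in table.values())
p_helmet = sum(table["H"].values()) / n  # P(H): row total for H over n

relative_risk = risk("NH") / risk("H")

print(f"P(H)     = {p_helmet:.4f}")
print(f"P(BI|H)  = {risk('H'):.4f}")
print(f"P(BI|NH) = {risk('NH'):.4f}")
print(f"RR       = {relative_risk:.2f}")
```

Computing the ratio from unrounded risks gives the same value to two decimal places as dividing the rounded figures.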
Relative Risk (RR) vs. Odds Ratio (OR) Example 3.5: Age at First Pregnancy and Cervical Cancer A case-control study was conducted to determine whether there was increased risk of cervical cancer amongst women who had their first child before age 25. A sample of 49 women with cervical cancer was taken, of which 42 had their first child before the age of 25. From a sample of 317 “similar” women without cervical cancer it was found that 203 of them had their first child before age 25. Q: Do these data suggest that having a child at or before age 25 increases risk of cervical cancer?
Odds Ratio (OR) The ODDS for an event A are defined as Odds for A = P(A)/(1 – P(A)). For example, suppose we roll a single die. The odds for rolling a 6 are: Odds for 6 = P(Roll a 6)/(1 – P(Roll a 6)) = (1/6)/(1 – (1/6)) = 1/5 (1:5 odds for or 5:1 odds against), i.e. 1 six for every 5 rolls that don’t result in a six.
Odds Ratio (OR) The Odds Ratio (OR) for a disease associated with a risk factor is the ratio of the odds for disease for those with the risk factor and the odds for disease for those without the risk factor: OR = [P(Disease|Risk Factor)/(1 – P(Disease|Risk Factor))] / [P(Disease|No Risk Factor)/(1 – P(Disease|No Risk Factor))]. The numerator is the odds for disease amongst those with the risk factor present; the denominator is the odds for disease amongst those without the risk factor. The Odds Ratio gives us the multiplicative increase in odds associated with having the “risk factor”.
Relative Risk (RR) vs. Odds Ratio (OR)
                         Cervical Cancer
Age at 1st Pregnancy     Case   Control   Row Totals
Age < 25                   42       203          245
Age > 25                    7       114          121
Column Totals              49       317      n = 366
a) Why can’t we calculate P(Cervical Cancer | Age < 25) using these data? Because the number of women with disease was fixed in advance and is therefore NOT RANDOM!
Relative Risk (RR) vs. Odds Ratio (OR) b) What is P(risk factor|disease status) for each group? P(Age < 25|Case) = 42/49 = 0.857 or 85.7% P(Age < 25|Control) = 203/317 = 0.640 or 64.0%
Relative Risk (RR) vs. Odds Ratio (OR) c) What are the odds for the risk factor amongst the cases? Amongst the controls? Odds for risk factor, cases = 0.857/(1 – 0.857) = 5.99 Odds for risk factor, controls = 0.64/(1 – 0.64) = 1.78
Relative Risk (RR) vs. Odds Ratio (OR) d) What is the odds ratio for the risk factor associated with being a case? Odds Ratio (OR) = 5.99/1.78 = 3.37, i.e. the odds for having had a 1st child on or before age 25 are 3.37 times higher for women who currently have cervical cancer versus those who do not have cervical cancer.
Relative Risk (RR) vs. Odds Ratio (OR) Odds Ratio The ratio of dark to light shading is 3.37 times larger for the cervical cancer group than it is for the control group.
Relative Risk (RR) vs. Odds Ratio (OR) g) Finally, calculate the odds ratio for disease associated with 1st pregnancy at age < 25. Odds Ratio = 0.207/0.061 = 3.37 (using the unrounded odds). This is exactly the same as the odds ratio for having the risk factor (Age < 25) associated with being in the cervical cancer group! Final Conclusion: Women who have their first child at or before age 25 have 3.37 times higher odds of developing cervical cancer when compared to women who had their first child after the age of 25.
Relative Risk (RR) vs. Odds Ratio (OR)
                        Disease Status
Risk Factor Status      Case   Control
Risk Factor Present     a      b
Risk Factor Absent      c      d
OR = (a × d)/(b × c) Much easier computational formula!
Relative Risk (RR) vs. Odds Ratio (OR)
Age at 1st Pregnancy     Case     Control    Row Totals
Age < 25                 a = 42   b = 203           245
Age > 25                 c = 7    d = 114           121
Column Totals                49       317       n = 366
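The three routes to the odds ratio can be compared side by side. The sketch below uses the cell labels a, b, c, d from the table and exact fractions so that the comparison is not blurred by rounding; the variable names are illustrative.

```python
from fractions import Fraction

# Cell counts labelled as in the table.
a, b = 42, 203   # Age < 25: cases, controls
c, d = 7, 114    # Age > 25: cases, controls

# Route 1: odds of the risk factor among cases vs. among controls.
odds_rf_cases = Fraction(a, c)        # (a/(a+c)) / (c/(a+c)) simplifies to a/c
odds_rf_controls = Fraction(b, d)
or_risk_factor = odds_rf_cases / odds_rf_controls

# Route 2: odds of disease with vs. without the risk factor.
odds_disease_rf = Fraction(a, b)
odds_disease_no_rf = Fraction(c, d)
or_disease = odds_disease_rf / odds_disease_no_rf

# Route 3: the cross-product shortcut.
or_cross_product = Fraction(a * d, b * c)

print(f"Odds of risk factor: cases {float(odds_rf_cases):.2f}, "
      f"controls {float(odds_rf_controls):.2f}")
print(f"Odds of disease: Age < 25 {float(odds_disease_rf):.3f}, "
      f"Age > 25 {float(odds_disease_no_rf):.3f}")
print(f"OR (all three routes) = {float(or_cross_product):.2f}")
```

All three routes reduce algebraically to (a × d)/(b × c), which is why the odds ratio remains usable in a case-control design even though P(Disease | Risk Factor) itself cannot be estimated.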
Case-Control Studies
Simple Probability Models “The probability that an event A occurs” is written in shorthand as P(A). For equally likely outcomes, and a given event A: P(A) = (Number of outcomes in A)/(Total number of outcomes)
Example: Three Coin Flips •
Probability Distribution (Discrete)
Number of heads in three flips   Probability
0                                0.125
1                                0.375
2                                0.375
3                                0.125
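This distribution can be rebuilt by listing all equally likely outcomes of three flips and counting heads in each. The sketch below assumes a fair coin, so that each of the eight sequences has the same probability; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of three flips, e.g. ('H', 'T', 'H'); equally likely for a fair coin.
outcomes = list(product("HT", repeat=3))

# Count how many sequences give 0, 1, 2 or 3 heads.
heads_counts = Counter(sequence.count("H") for sequence in outcomes)

# P(X = x) = (number of outcomes with x heads) / (total number of outcomes).
distribution = {
    heads: Fraction(count, len(outcomes))
    for heads, count in sorted(heads_counts.items())
}

for heads, prob in distribution.items():
    print(f"P({heads} heads) = {prob} = {float(prob):.3f}")
```

The same enumeration works for more flips by changing `repeat`, although the number of sequences doubles with each extra flip.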
Example: 10 Coin Flips