The Normal Distribution: Continuous Random Variables

The Normal Distribution
• The fundamental distribution underlying most of inferential statistics is the normal distribution.
• Normal distributions are a family of symmetric, bell-shaped density curves defined by a mean μ and a standard deviation σ, denoted N(μ, σ).

A Family of Normal Density Curves
[Figure: two panels of normal density curves, "Same μ, different σ" and "Different μ, same σ"]

Properties of the Normal Distribution
1. The curve is symmetric about the mean (i.e., the area under the curve to the left of the mean equals the area under the curve to the right of the mean).
2. The mean = median = mode, so the highest point of the curve is at x = μ.
3. The curve has inflection points at (μ – σ) and (μ + σ).
4. The total area under the curve is equal to 1.
5. As x gets larger and larger (in either the positive or negative direction), the graph approaches but never reaches the horizontal axis.

The Empirical Rule
• If a distribution is normal (or just roughly symmetric, bell-shaped, and unimodal), then:
◦ Approximately 68% of the observations will lie within 1 standard deviation of the mean.
◦ Approximately 95% of the observations will lie within 2 standard deviations of the mean.
◦ Approximately 99.7% of the observations will lie within 3 standard deviations of the mean.
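The 68/95/99.7 percentages are rounded areas under the standard normal curve. A minimal sketch of where they come from, assuming Python's standard-library `statistics.NormalDist` (not part of the original slides); the helper name `within_k_sd` is illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)


def within_k_sd(k: float) -> float:
    """Area under the normal curve within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)


for k in (1, 2, 3):
    # Prints roughly 0.6827, 0.9545, and 0.9973.
    print(f"within {k} SD: {within_k_sd(k):.4f}")
```

Because every normal distribution standardizes to the same curve, these areas do not depend on the particular μ and σ.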

Empirical Rule Visually

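Ways to Find Normal Probabilities
• Math:
◦ Recall the normal PDF: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
◦ To integrate it, an important identity is the Gaussian integral: $\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$
• Z table and standardizing
• Technology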
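The "math" route means integrating the normal PDF, which has no elementary antiderivative, so in practice the area is approximated numerically. The sketch below assumes the usual density f(x) = exp(-(x - μ)² / (2σ²)) / (σ√(2π)) and uses a simple midpoint Riemann sum; the function names and the number of slices are illustrative choices, not from the slides.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma) at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def area_under_pdf(a: float, b: float, mu: float, sigma: float, n: int = 10_000) -> float:
    """Approximate P(a < X < b) with a midpoint Riemann sum over n slices."""
    width = (b - a) / n
    total = sum(normal_pdf(a + (i + 0.5) * width, mu, sigma) for i in range(n))
    return total * width


# Area within 1 standard deviation of the mean on the standard normal curve.
print(round(area_under_pdf(-1.0, 1.0, 0.0, 1.0), 4))
# Area over a very wide interval, which should be close to the total area of 1.
print(round(area_under_pdf(-8.0, 8.0, 0.0, 1.0), 4))
```

The Z table and technology options are shortcuts for this same integral.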

Measuring the Distance from the Mean
• As mentioned previously, we can only compare standard deviations when the means are similar.
• If we want to compare distributions in terms of spread, we can convert the observations to standard units.
• These standard units are called z-scores.

Standard Normal Distribution
• A very special, specific normal distribution with μ = 0 and σ = 1.
• Now that we know areas under a density curve correspond to probabilities, we can use z-scores to determine probabilities on the standard normal distribution (SND).

Standard Normal (Z) Table
• Gives us the CDF (≤) value of a z-score
• P(Z ≤ z)
• So for any less-than probability I can simply use the table value
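A Z table is just the standard normal CDF tabulated to four decimals. As a rough illustration, the sketch below rebuilds a few rows, assuming Python's standard-library `statistics.NormalDist` and the common layout where the row gives z to one decimal and the column adds the second decimal; the function name `z_table_row` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1


def z_table_row(row_start: float) -> list[float]:
    """CDF values P(Z <= z) for z = row_start + 0.00, 0.01, ..., 0.09."""
    return [round(Z.cdf(row_start + j / 100), 4) for j in range(10)]


# Print three sample rows in table form.
for row_start in (0.0, 1.0, 2.0):
    cells = " ".join(f"{p:.4f}" for p in z_table_row(row_start))
    print(f"{row_start:>4.1f} | {cells}")
```

Reading the 2.0 row at the .09 column, for example, gives the table value for P(Z ≤ 2.09).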

The Z Table for > Probabilities
• What if we are interested in P(Z > z) (or ≥)?
• Use the complement rule: P(Z > z) = 1 - P(Z ≤ z)
• So for any greater-than probability I can use 1 - table value

The Z Table for “Between” Probabilities
• What if we are interested in P(z₁ < Z < z₂)?
• Remember the table gives you the area to the left: P(Z ≤ z)
• Using that fact, if I take the larger area and subtract off the smaller, it should leave me with what is in between
• P(z₁ < Z < z₂) = P(Z < z₂) - P(Z < z₁)

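The Standardizing Process
• To convert an observation x from N(μ, σ) to a z-score, subtract the mean and divide by the standard deviation: $z = \frac{x - \mu}{\sigma}$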
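Standardizing converts a value from any N(μ, σ) into a z-score via z = (x - μ) / σ, after which the standard normal table applies. The sketch below is one way to check that idea, assuming Python's standard-library `statistics.NormalDist`; the IQ parameters come from the quantile example in these slides, while the score of 130 is an illustrative value chosen here.

```python
from statistics import NormalDist


def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mu) / sigma


mu, sigma = 100.0, 15.0  # IQ parameters used in the quantile example
x = 130.0                # illustrative score (an assumption, not from the slides)

z = z_score(x, mu, sigma)

# The same probability computed two ways: on the original scale,
# and on the standard normal scale after standardizing.
p_direct = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist().cdf(z)

print(z, round(p_direct, 4), round(p_standard, 4))
```

Both probabilities agree, which is why a single Z table can serve every normal distribution.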

Finding Normal Probabilities
1) Draw your normal curve centered at µ and indicate σ.
2) Shade in the desired area.
3) Use the empirical rule to get an estimate of your answer.
4) Find the probability corresponding to the shaded area using the Z table.
5) Use technology to check your work.

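Finding Quantiles
• Work backwards: find the z-score whose table (CDF) value is closest to the desired probability, then un-standardize with x = zσ + μ.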

Empirical Rule Example
Suppose the hours worked by faculty members are normally distributed with a mean of 36 hours per week and a standard deviation of 4 hours: N(36, 4).
• What percentage of faculty members work between 32 and 40 hours a week?
◦ 68%
• Find the range of hours worked per week by the middle 95% of faculty members.
◦ Between 28 and 44
• What percentage of faculty members work less than 36 hours a week?
◦ 50%
• What percentage of faculty members work less than 40 hours a week?
◦ 84%
• What percentage of faculty members work less than 42 hours a week?
◦ The empirical rule can give us a ballpark estimate for this (somewhere between 84% and 97.5%), but it can't answer it exactly, because 42 is 1.5 standard deviations above the mean.
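The empirical-rule answers above can be compared against exact normal areas, including the 42-hour case that the rule cannot settle. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` (not part of the original slides); the variable names are illustrative.

```python
from statistics import NormalDist

# Faculty hours per week, N(36, 4), as given in the example.
hours = NormalDist(mu=36, sigma=4)

between_32_40 = hours.cdf(40) - hours.cdf(32)  # within 1 SD of the mean
less_than_36 = hours.cdf(36)                   # left half of the curve
less_than_40 = hours.cdf(40)                   # 1 SD above the mean
less_than_42 = hours.cdf(42)                   # 1.5 SD above the mean

# Exact bounds of the central 95%, for comparison with the rule's 28 to 44.
low, high = hours.inv_cdf(0.025), hours.inv_cdf(0.975)

print(f"P(32 < X < 40) = {between_32_40:.4f}")
print(f"P(X < 36)      = {less_than_36:.4f}")
print(f"P(X < 40)      = {less_than_40:.4f}")
print(f"P(X < 42)      = {less_than_42:.4f}")
print(f"central 95%    = {low:.2f} to {high:.2f}")
```

The exact values sit close to the rule's rounded answers, and P(X < 42) comes out near 0.93, inside the 84% to 97.5% ballpark.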

Examples: P(Z < z)
• P(Z < -0.18)
• Using the table:
◦ P(Z < -0.18) = 0.4286
• In Excel:

Minitab Instructions
• Graph -> Probability Distribution Plot -> View Single
• The SND is chosen by default
• Click OK
• Double-click on the density curve, then click the “Shaded Area” tab.
• Choose X-value, type in your value, and choose your desired area.

Examples: P(Z > z)
• P(Z > 2.09) = 1 – P(Z < 2.09)
• Using the table:
◦ P(Z < 2.09) = 0.9817
• 1 – 0.9817 = 0.0183

Examples: P(z₁ < Z < z₂)
• This situation deals with finding the area between two z-values.
• P(1.01 < Z < 2.02)
• Procedure: Find the area to the left of 2.02 and subtract the area to the left of 1.01. See images…

Examples: P(z₁ < Z < z₂)
• P(1.01 < Z < 2.02)
• P(Z < 2.02) = 0.9783
• P(Z < 1.01) = 0.8438
• P(1.01 < Z < 2.02) =
◦ 0.9783 – 0.8438 = 0.1345
• In Excel:
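The three table lookups worked above (less than, greater than, between) can be checked with a few lines of code. This is a sketch only, assuming Python's standard-library `statistics.NormalDist` rather than the Excel or Minitab tools the slides refer to; the helper names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1


def p_less(z: float) -> float:
    """P(Z < z): the table value itself."""
    return Z.cdf(z)


def p_greater(z: float) -> float:
    """P(Z > z): complement rule, 1 minus the table value."""
    return 1.0 - Z.cdf(z)


def p_between(z1: float, z2: float) -> float:
    """P(z1 < Z < z2): larger left-area minus smaller left-area."""
    return Z.cdf(z2) - Z.cdf(z1)


print(f"P(Z < -0.18)        = {p_less(-0.18):.4f}")
print(f"P(Z > 2.09)         = {p_greater(2.09):.4f}")
print(f"P(1.01 < Z < 2.02)  = {p_between(1.01, 2.02):.4f}")
```

The "between" result can differ from the table answer in the fourth decimal place, because the table method subtracts two values that were each already rounded.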

Standardizing Example (<)

Standardizing Example (>)

Standardizing Example (between)

Normal Quantile Example
Suppose we have IQ test results that we know are normally distributed with mean 100 and standard deviation 15.
• Find the minimum score that puts you in the top 5%.
• We are looking for x such that P(X > x) = 0.05
◦ In other words, P(X < x) = 0.95
• According to our table:
◦ The z-score whose area is closest to 0.95 is 1.645
• We must un-standardize:
◦ X = 1.645(15) + 100 = 124.675
• In Excel:

Normal Quantile Example 2
Suppose we have IQ test results that we know are normally distributed with mean 100 and standard deviation 15.
• Find the two scores that bound the middle 50% (min score and max score).
• We are looking for a and b such that P(a < X < b) = 0.5
◦ In other words, P(X < a) = 0.25 and P(X < b) = 0.75
• According to our table:
◦ The z-score whose area is closest to 0.75 is 0.67
◦ The z-score whose area is closest to 0.25 is -0.67
• We must un-standardize twice:
◦ a = -0.67(15) + 100 = 89.95
◦ b = 0.67(15) + 100 = 110.05
◦ With the more precise z = ±0.6745, these become 89.88 and 110.12
• In Excel:
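Both quantile examples amount to evaluating the inverse CDF and then un-standardizing. A minimal sketch of that calculation, assuming Python's standard-library `statistics.NormalDist` and its `inv_cdf` method in place of the table or Excel; the variable names are illustrative.

```python
from statistics import NormalDist

# IQ scores, N(100, 15), as given in the examples.
iq = NormalDist(mu=100, sigma=15)

# Example 1: minimum score for the top 5%, i.e. the 95th percentile.
top_5_cutoff = iq.inv_cdf(0.95)

# Example 2: bounds of the middle 50%, i.e. the 25th and 75th percentiles.
lower, upper = iq.inv_cdf(0.25), iq.inv_cdf(0.75)

# The same cutoff by hand: find z on the standard normal, then un-standardize.
z_95 = NormalDist().inv_cdf(0.95)
manual_cutoff = z_95 * 15 + 100

print(f"top 5% cutoff : {top_5_cutoff:.2f}")
print(f"middle 50%    : {lower:.2f} to {upper:.2f}")
print(f"z for 0.95    : {z_95:.3f}, un-standardized: {manual_cutoff:.2f}")
```

Software uses an unrounded z-score, so the results can differ slightly from answers built on the two-decimal table value 0.67.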