HLST/RECL 3P07: Estimating the Confidence Interval for a Sample Mean
© W. J. Montelpare, Ph.D. 1/15/2022

Reconsider our sampling rules
• In research, we often collect information from selected individuals.
• Since the procedures for collecting information cost money, researchers are forced to assume that the selected individuals are a good representation of a larger group.
• In quantitative analyses we call the larger group the population (represented by the letter 'N'), and we call the selected subgroup the sample (represented by the letter 'n').

Most Important!
• If we assume that our sample represents a population, then we must also assume that any computations, estimates, or inferences based on the numbers from the sample also represent the population from which the sample was selected.

• As such, the average score computed for the sample is assumed to represent the average score for the population;
• Likewise, the variability of scores within the sample (the subgroup) should represent the variability of the scores within the population (the larger group); and
• The standardized estimate of the differences computed for the sample should represent the standardized estimate of differences computed for the population.

Therefore:
$\mu = \text{sample mean} \pm (\text{sampling error})$
where:
• µ refers to the measure of central tendency for the population
• sample mean refers to the measure of central tendency in the sample
• sampling error refers to the error due to randomness

Two basic assumptions
• First, we assume that the sample mean is only our best estimate of the true population mean.
• Second, we assume that the chance that the sample mean represents the true population mean depends on how well the sample scores represent the population scores.
• Thus, by adding the sampling error to the sample mean and subtracting it from the sample mean, we can identify the range within which the true population mean is expected to fall.

Determine the width of the sampling error
• The term sampling error refers to the error that arises because the data are collected from a sample rather than from the entire population.
• Sampling error is expected and should thus be accounted for in the computation of the estimates which represent the data.
• Researchers state that the estimates (measures of central tendency, frequencies, or ratio estimates) produced from a selected sample are expected to represent the true population estimate within a specific range.

• For example, the researcher states that they are 95% confident that the sample mean represents the true population mean within 10% error.
• Typically, researchers indicate that they would like to be at least 95% confident that the sample mean is an estimate of the population mean. Therefore, the researcher is suggesting that 19 out of 20 times the sample mean ± sampling error will include [µ].
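The "19 out of 20 times" reading can be checked with a small simulation. The sketch below is illustrative only: the population mean and standard deviation (50 and 10), the sample size (30), the number of repeated samples, and the random seed are all assumptions chosen for the demonstration, not values from the lecture. It builds the interval sample mean ± 1.96 × (s/√n), the form the lecture arrives at in later slides, and counts how often µ falls inside it.

```python
import random
import statistics

# Assumed values for illustration only (not from the lecture).
MU, SIGMA = 50.0, 10.0   # hypothetical population mean and standard deviation
N, TRIALS = 30, 2000     # sample size and number of repeated samples

rng = random.Random(1)   # fixed seed so the run is repeatable


def interval_captures_mu():
    """Draw one sample and report whether mean +/- 1.96 * s.e. contains MU."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / N ** 0.5
    return mean - 1.96 * se <= MU <= mean + 1.96 * se


# Proportion of the repeated samples whose interval includes MU.
coverage = sum(interval_captures_mu() for _ in range(TRIALS)) / TRIALS
print(f"Proportion of intervals that include mu: {coverage:.3f}")
```

The printed proportion should land near 0.95, usually a little below it for small samples because s is itself estimated from the sample.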

• The confidence interval is based on the following relationship between the sample mean and the true population mean [µ]:
lower limit of sample mean < µ < upper limit of sample mean
• This sentence is read as: the lower limit of the sample mean is less than the true population mean, which is less than the upper limit of the sample mean.

Unpacking the concept of a 95% C.I.
So how does 95% fit into this computation? The purpose of this exercise is to demonstrate to the research community how the set of sample scores is associated with the true set of population scores. We therefore need some way of relating the sample distribution to the population distribution (that is, how the set of scores for the sample is related to the set of scores for the population).

Unpacking the concept of a 95% C.I.
• One way to illustrate such a relationship is to standardize the scores for both the sample distribution and the population distribution.
• In statistics, when we want to standardize an estimate we typically relate the estimate to a device called the normal curve.

The Normal Curve
• The normal curve is a graphical representation of the standard normal distribution (i.e., the frequency distribution graph of an expected distribution of scores within a "normal population"). By using the normal curve, researchers can describe how closely their sample distribution represents a population distribution.

Unpacking the Normal Curve
• Understanding the role of the normal curve is important to inferential statistics.
• The normal curve is a graphical presentation of the frequency distribution for a set of standardized (or adjusted) scores.
• The standardized scores are ratio scores: the difference between any score within a set of scores and the measure of central tendency for that set, divided by the standard deviation of that set of scores.

A sample data set to compute z scores

  xi     (xi - x̄)              (xi - x̄)²
  13     13 - 31.2 = -18.2     331.24
  19     19 - 31.2 = -12.2     148.84
  30     30 - 31.2 =  -1.2       1.44
  43     43 - 31.2 =  11.8     139.24
  51     51 - 31.2 =  19.8     392.04
  Σxi = 156     Σ(xi - x̄) = 0     Σ(xi - x̄)² = 1012.8

mean = 156/5 = 31.2
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{1012.8}{5-1} = 253.2$
$s = \sqrt{253.2} = 15.91$
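The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch using Python's standard library; the variable names are illustrative, not part of the lecture.

```python
import statistics

scores = [13, 19, 30, 43, 51]

mean = statistics.mean(scores)             # 156 / 5
deviations = [x - mean for x in scores]    # (xi - mean) for each score
sum_sq = sum(d ** 2 for d in deviations)   # sum of squared deviations
variance = sum_sq / (len(scores) - 1)      # divide by n - 1 (sample variance)
sd = variance ** 0.5                       # sample standard deviation

print("mean:", mean)
print("sum of squared deviations:", round(sum_sq, 2))
print("variance:", round(variance, 2))
print("standard deviation:", round(sd, 2))
```

The hand-rolled result agrees with `statistics.stdev(scores)`, which also uses the n - 1 denominator.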

• The set of standard scores computed from the original observations is also called the set of "z" scores and can be computed by the following formula:
$z = \frac{x_i - \bar{x}}{\text{standard deviation}}$
• Therefore, for any set of numbers we could create a set of z scores or standard scores.

Use the mean and s to compute z

  xi     (xi - x̄)              (xi - x̄)²     z = (xi - x̄)/s
  13     13 - 31.2 = -18.2     331.24        -18.2/15.91 = -1.14
  19     19 - 31.2 = -12.2     148.84        -12.2/15.91 = -0.77
  30     30 - 31.2 =  -1.2       1.44         -1.2/15.91 = -0.08
  43     43 - 31.2 =  11.8     139.24         11.8/15.91 =  0.74
  51     51 - 31.2 =  19.8     392.04         19.8/15.91 =  1.24
  Σxi = 156     Σ(xi - x̄) = 0     Σ(xi - x̄)² = 1012.8

The range of z scores in this sample is from -1.14 to 1.24.
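The same z scores can be computed programmatically. The following is a small sketch under the slide's definitions (sample mean, and sample standard deviation with the n - 1 denominator); the names are illustrative.

```python
import statistics

scores = [13, 19, 30, 43, 51]
mean = statistics.mean(scores)
s = statistics.stdev(scores)   # sample standard deviation (n - 1 denominator)

# z = (xi - mean) / s for every score in the set
z_scores = [(x - mean) / s for x in scores]

for x, z in zip(scores, z_scores):
    print(f"{x:>3}  z = {z:+.2f}")
```

Because every score is centred on the sample mean, the z scores sum to zero, which is a quick check that the calculation is right.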

...and likewise,
• For any set of z scores, a percentile estimate can be attributed to each z score.
• This relationship has been shown several times and is commonly presented as the z table of estimates, or the table for the normal curve.
• Conversely, for any percentile we could determine a standardized estimate or z score. That is, we could determine the z score for a percent of confidence such as the 95% confidence value.
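The two directions described here (z score to percentile, and percentile to z score) can be sketched with `statistics.NormalDist` from Python's standard library (Python 3.8+), which plays the role of the printed table. The helper names `z_to_percentile` and `percentile_to_z` are hypothetical, introduced only for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal: mean 0, standard deviation 1


def z_to_percentile(z):
    """Percent of the standard normal distribution lying below z."""
    return 100 * std_normal.cdf(z)


def percentile_to_z(percentile):
    """z score below which the given percent of the distribution lies."""
    return std_normal.inv_cdf(percentile / 100)


print(round(z_to_percentile(0.0), 1))    # the mean sits at the 50th percentile
print(round(percentile_to_z(97.5), 2))   # z with 97.5% of the area below it
```

Note that these helpers use the cumulative area below z, whereas many printed tables report the area between the mean and z.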

To read the table of the normal curve we proceed through the following stepwise procedures:
i) Determine how confident you want to be that the true population mean is captured by the range of the sample mean. For example, 95% confident.
ii) Divide the selected confidence value by 100 to eliminate the per cent value: 95/100 = 0.95
iii) Divide the quotient by 2 (the 2 is used to designate that you are interested in a two-tailed test), as in the following example using 95%: 0.95/2 = 0.4750

iv) Find the value (i.e., 0.475) in the normal distribution table (also called the z table or table of z scores) and move to the left column to identify the prefix of the z score, then move to the top row of the table to find the trailing numbers of the z score. The score you are compiling from the left vertical column combined with the top horizontal row is called the z score attributed to the given % confidence value.
For example, if we wish to identify the z score associated with a 95% confidence interval, work through the following steps with the table: 95/100 = 0.95, then 0.95/2 = 0.4750

• Go to the table and find the number 0.4750:

  z      .00   .01   .02   .03   .04   .05   .06     .07   .08   .09
  0.0
  0.1
  0.2
  ...
  1.9                                        0.4750

Once we identify the number within the table, look to the left column to find the prefix of the z score (e.g., 1.9); then look to the horizontal row across the top of the table to find the trailing numbers (in this example the trailing numbers are .06). When we combine 1.9 with .06 we see a z score of 1.96. The z score for the 95% C.I. is 1.96.
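The table lookup in steps i) to iv) can be mirrored in code. This sketch assumes, as the 0.4750 entry implies, that the printed table reports the area between the mean (z = 0) and z; `statistics.NormalDist.inv_cdf` works with the cumulative area below z instead, so 0.5 is added before the lookup.

```python
from statistics import NormalDist

confidence_percent = 95                      # step i: chosen confidence
proportion = confidence_percent / 100        # step ii: 0.95
table_area = proportion / 2                  # step iii: 0.4750

# The printed table gives the area between the mean (z = 0) and z,
# so the equivalent cumulative area below z is 0.5 + table_area.
z = NormalDist().inv_cdf(0.5 + table_area)   # step iv: the "table lookup"

print(round(z, 2))
```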

You may be asked to find the z score associated with the two-tailed alpha coefficient. This is the same as asking you to find the two-tailed z score associated with a 95% confidence level, as you have just done. The alpha coefficient is computed by subtracting the confidence level (expressed as a proportion) from 1, as in the following example.

given: 95% confidence
95/100 = .95
alpha coefficient = 1 - .95 = .05
In this example the formula for the confidence interval may be written as:
$\mu = \text{sample mean} \pm z_{\alpha/2} \times (\text{sampling error})$
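As a hedged illustration of the alpha coefficient, the sketch below converts a confidence level into alpha and then into the two-tailed z value. The function name `z_alpha_over_2` is hypothetical, and the 90% and 99% levels are added only for comparison; the lecture itself works with 95%.

```python
from statistics import NormalDist


def z_alpha_over_2(confidence_percent):
    """Two-tailed z value for a confidence level given in percent."""
    alpha = 1 - confidence_percent / 100         # e.g. 1 - .95 = .05
    # alpha/2 is left in each tail, so 1 - alpha/2 of the area lies below z.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (90, 95, 99):
    print(level, round(z_alpha_over_2(level), 3))
```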

The basic premise of estimation and confidence intervals:
$\mu = \text{sample mean} \pm (\text{sampling error})$
where:
• µ refers to the measure of central tendency for the population
• sample mean refers to the measure of central tendency in the sample
• sampling error refers to the error due to randomness

Estimates in the sample vs. estimates in the population
Sample: mean $\bar{x}$, variance $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$
Population: mean $\mu$, variance $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
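The difference between the two denominators can be seen by running both formulas on the same numbers. In this sketch the five scores from the earlier z score example are treated first as a sample and then, purely for illustration, as if they were an entire population.

```python
import statistics

scores = [13, 19, 30, 43, 51]

# Sample variance: squared deviations from the sample mean, divided by n - 1.
sample_var = statistics.variance(scores)

# Population variance: squared deviations from the mean, divided by N.
population_var = statistics.pvariance(scores)

print("s^2     =", round(sample_var, 2))
print("sigma^2 =", round(population_var, 2))
```

The sample formula gives the larger value because it divides the same sum of squares by n - 1 rather than N.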

• Our formula for computing the confidence interval includes the z score associated with the 95% confidence interval, as follows:
$\mu = \text{sample mean} \pm Z_{95\%} \times (\text{sampling error})$
• or, written in a useful form:
$\mu = \text{sample mean} \pm 1.96 \times (\text{sampling error})$

• All that remains in computing confidence intervals is to determine the estimate of the error of the sample selected, or the sampling error.
• This error is also called the standard error of the mean, and is a measure of “the extent to which the sample means can be expected to vary due to chance”.
• In other words, the standard error of the mean is “an estimate of the error associated with the observed mean in this specific sample”, and is due to the sampling characteristics associated with this sample.

The standard error of the mean is computed by the formula:
$\text{s.e.} = \frac{s}{\sqrt{n}}$
OR IN WORDS: The standard error is equal to the standard deviation of the sample divided by the square root of the number of subjects in the sample.
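The verbal definition translates directly into a one-line function. This is a minimal sketch; the name `standard_error` is illustrative, and the call at the end reuses s = 15.91 and n = 5 from the earlier z score data set.

```python
import math


def standard_error(s, n):
    """Standard deviation of the sample divided by the square root of n."""
    if n <= 0:
        raise ValueError("n must be a positive number of subjects")
    return s / math.sqrt(n)


print(round(standard_error(15.91, 5), 2))
```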

Example: Compute the confidence interval at 95% for a given sample mean = 58 with s = 13 for n = 25 subjects
i) Compute the s.e.: $\text{s.e.} = 13 / \sqrt{25} = 2.6$
ii) The 95% confidence interval for the mean = 58 is: 58 ± [1.96 × 2.6]
The 95% confidence interval is 58 ± 5.1, which means that we are 95% confident that the range from 52.9 to 63.1 captures the true population mean µ.
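The whole procedure can be gathered into one short function. The sketch below assumes the lecture's approach (a z-based interval using the sample standard deviation); the function name `confidence_interval` and its parameters are illustrative.

```python
import math
from statistics import NormalDist


def confidence_interval(mean, s, n, confidence_percent=95):
    """Return (lower, upper) limits of mean +/- z * s / sqrt(n)."""
    # Two-tailed z value: half of the confidence area lies on each side of 0.
    z = NormalDist().inv_cdf(0.5 + confidence_percent / 200)
    margin = z * s / math.sqrt(n)   # z times the standard error of the mean
    return mean - margin, mean + margin


lower, upper = confidence_interval(58, 13, 25)
print(round(lower, 1), round(upper, 1))
```

Passing a different `confidence_percent` (for example 90 or 99) narrows or widens the interval accordingly.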