Chapter 5 Sampling, Significance Levels, and Hypothesis Testing

  • Slides: 44

Chapter 5 Sampling, Significance Levels, and Hypothesis Testing
• Three scientific traditions critical to experimental research
  – Sampling
  – Significance levels
  – Hypothesis testing
Introduction to Communication Research, BU-CA

Population and Sample
• Population – all units (people or things) possessing the attributes and characteristics of interest
• Sample – subset of a population
• Sampling frame – subset of units that have a chance to become part of the sample
• Researchers study the sample to make generalizations back to the population

Defining the Population
• Choose the dimensions or characteristics meaningful to the hypothesis or research question
• Must be at least one common characteristic among all members of a population
• Must develop a procedure to ensure representative sampling

Addressing Generalizability
• Extent to which conclusions developed from data collected from a sample can be extended to its population
• Sample is representative to the degree that all units had the same chance of being selected
• Representative sampling eliminates selection bias
  – Characteristics of the population should appear to the same degree in the sample
• Representativeness can only be assured through random sampling

Probability Sampling
• The probability of any unit being included in the sample is known and equal
• When probability for selection is equal, selection is random
• Also known as random sampling
• Sampling error will always occur

Types of Probability Sampling
• Simple random sampling
  – Simplest and quickest
• Systematic sampling
  – If used on a randomly ordered frame, results in a truly random sample
• Stratified random sampling
  – Random sampling within all subgroups
• Cluster sampling
  – Random sampling within known clusters

Simple random sampling
• In statistics, a simple random sample from a population is a sample chosen randomly, in which each member of the population has the same probability of being chosen.
• In small populations such sampling is typically done "without replacement", i.e., one deliberately avoids choosing any member of the population more than once.
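A minimal sketch of simple random sampling without replacement, using only Python's standard library. The function name `simple_random_sample`, the numbered population, and the seed are assumptions made for illustration; they do not come from the slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)
    # Random.sample draws without replacement, so no member is chosen twice.
    return rng.sample(population, n)

# Hypothetical population of 100 numbered units.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=1)
print(sample)
```

Passing a seed only makes the draw repeatable for checking; in a real study the draw would normally be left unseeded.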

Simple random sampling
• Conceptually, simple random sampling is the simplest of the probability sampling techniques, but it is seldom used in practice because of application problems.
• Simple random sampling is not an efficient method. It requires constructing a very large sampling frame, which results in extensive sampling calculations and excessive costs.
• If researchers were to consider the information available about the population, a more efficient approach could be used.

Simple random sampling
• Advantages are that it is free of classification error and requires minimal advance knowledge of the population.
• It best suits situations where the population is fairly homogeneous and not much information is available about the population.
• If these conditions do not hold, stratified sampling may be a better choice.

Systematic sampling
• Systematic sampling is the selection of every kth element from a sampling frame, where k, the sampling interval, is calculated as:
  k = (number in population) / (number in sample)

Systematic sampling
• Using this procedure, each element in the population has a known and equal probability of selection. This makes systematic sampling functionally similar to simple random sampling. It is, however, much more efficient and much less expensive to carry out.
• The researcher must ensure that the chosen sampling interval does not hide a pattern. Any pattern would threaten randomness. A random starting point must also be selected.
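The procedure can be sketched as follows: compute the sampling interval k, pick a random starting point within the first interval, then take every kth element. The function name `systematic_sample` and the numbered frame are hypothetical, and rounding k down is an assumption for frames whose size is not an exact multiple of the sample size.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th element of the frame after a random start."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the frame size")
    k = len(frame) // n                        # sampling interval, rounded down
    start = random.Random(seed).randrange(k)   # random start in [0, k)
    # Equal selection probability holds exactly when len(frame) is a multiple of n.
    return frame[start::k][:n]

# Hypothetical frame of 100 numbered units: k = 100 / 10 = 10.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```

If the frame is ordered in a cycle that matches k (for example, every tenth unit is a corner house), the sample inherits that pattern, which is why the ordering of the frame should be checked before choosing the interval.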

Stratified random sampling
• Stratified sampling is a method of sampling from a population in statistics. When subpopulations vary considerably, it is advantageous to sample each subpopulation (stratum) independently.
• Stratification is the process of grouping members of the population into relatively homogeneous subgroups before sampling. The strata should be mutually exclusive: every element in the population must be assigned to only one stratum.

Stratified random sampling
• The strata should also be collectively exhaustive: no population element can be excluded. Then random sampling is applied within each stratum.
• This often improves the representativeness of the sample by reducing sampling error. It can produce a weighted mean that has less variability than the arithmetic mean of a simple random sample of the population.
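A rough sketch of stratified random sampling with a weighted mean, assuming proportional allocation (the same sampling fraction in every stratum). The strata, the weekly viewing hours, and the function names are invented for illustration only.

```python
import random
from statistics import mean

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(units, max(1, round(len(units) * fraction)))
        for name, units in strata.items()
    }

def weighted_mean(strata, samples):
    """Weight each stratum's sample mean by that stratum's share of the population."""
    total = sum(len(units) for units in strata.values())
    return sum(len(strata[name]) / total * mean(samples[name]) for name in strata)

# Hypothetical population: weekly TV hours in two mutually exclusive strata.
strata = {
    "students": [10, 12, 11, 13, 12, 11, 10, 12, 13, 11],
    "retirees": [30, 32, 31, 29, 30, 31, 32, 30, 29, 31,
                 30, 32, 31, 29, 30, 31, 30, 29, 32, 31],
}
samples = stratified_sample(strata, 0.2, seed=7)
estimate = weighted_mean(strata, samples)
print(samples, estimate)
```

Because each stratum is internally homogeneous, the estimate stays close to the population mean whichever units happen to be drawn, which is the reduction in variability the slide refers to.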

Stratified random sampling
Advantages:
• focuses on important subpopulations but ignores irrelevant ones
• improves the accuracy of estimation
• efficient
• sampling equal numbers from strata varying widely in size may be used to equate the statistical power of tests of differences between strata.

Stratified random sampling
Disadvantages:
• can be difficult to select relevant stratification variables
• not useful when there are no homogeneous subgroups
• can be expensive
• requires accurate information about the population.

Cluster sampling
• Cluster sampling is used when "natural" groupings are evident in the population. The total population is divided into groups or clusters.
• Elements within a cluster should be as heterogeneous as possible, but there should be homogeneity between clusters. Each cluster should be a small-scale version of the total population.
• The clusters must be mutually exclusive and collectively exhaustive.

Cluster sampling
• A random sampling technique is then used on any relevant clusters to choose which clusters to include in the study.
• In single-stage cluster sampling, all the elements from each of the selected clusters are used.
• In two-stage cluster sampling, a random sampling technique is applied to the elements from each of the selected clusters.
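The difference between the single-stage and two-stage versions can be sketched in a few lines. The function `cluster_sample`, the school clusters, and the `per_cluster` parameter are hypothetical names chosen for this example.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Randomly choose clusters, then keep all elements or subsample each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # stage 1: pick clusters
    result = {}
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            result[name] = list(members)                # single-stage: every element
        else:
            size = min(per_cluster, len(members))
            result[name] = rng.sample(members, size)    # stage 2: random elements
    return result

# Hypothetical population: 8 schools (clusters) of 20 students each.
clusters = {f"school_{i}": [f"s{i}_{j}" for j in range(20)] for i in range(8)}
single_stage = cluster_sample(clusters, 3, seed=11)
two_stage = cluster_sample(clusters, 3, per_cluster=5, seed=11)
print(sorted(single_stage), {k: len(v) for k, v in two_stage.items()})
```

In both versions only the selected clusters are ever visited, which is where the cost saving comes from.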

Cluster sampling
• The main difference between cluster sampling and stratified sampling is that in cluster sampling the cluster is treated as the sampling unit, so analysis is done on a population of clusters (at least in the first stage).
• In stratified sampling, the analysis is done on elements within strata.

Cluster sampling
• In stratified sampling, a random sample is drawn from each of the strata, whereas in cluster sampling only the selected clusters are studied.
• The main objective of cluster sampling is to reduce costs by increasing sampling efficiency. This contrasts with stratified sampling, where the main objective is to increase precision.

Cluster sampling
• One version of cluster sampling is area sampling or geographical cluster sampling. Clusters consist of geographical areas.
• A geographically dispersed population can be expensive to survey. Greater economy than simple random sampling can be achieved by treating several respondents within a local area as a cluster.
• It is usually necessary to increase the total sample size to achieve equivalent precision in the estimators, but the savings in cost may make that feasible.

Cluster sampling
• In some situations, cluster sampling is only appropriate when the clusters are approximately the same size. This can be achieved by combining clusters.
• If this is not possible, probability proportionate to size sampling is used. In this method, larger clusters are more likely to be selected, and the probability of selecting an element within any given selected cluster varies inversely with the size of the cluster, so that elements in large and small clusters have a similar overall chance of selection.
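One common reading of probability proportionate to size sampling is a two-stage design: clusters are chosen with probability proportional to their size, then a fixed number of elements is drawn from each chosen cluster. The sketch below only does the arithmetic for that idealized design; the cluster sizes and the function name are assumptions, and real designs need adjustments when a cluster is large enough to be selected with certainty.

```python
from fractions import Fraction

def pps_inclusion_probabilities(cluster_sizes, n_clusters, per_cluster):
    """Return (cluster prob., within-cluster prob., overall prob.) per cluster."""
    total = sum(cluster_sizes.values())
    table = {}
    for name, size in cluster_sizes.items():
        p_cluster = Fraction(n_clusters * size, total)  # grows with cluster size
        p_within = Fraction(per_cluster, size)          # shrinks with cluster size
        table[name] = (p_cluster, p_within, p_cluster * p_within)
    return table

# Hypothetical clusters of unequal size (1,000 elements in total).
sizes = {"A": 100, "B": 200, "C": 400, "D": 300}
table = pps_inclusion_probabilities(sizes, n_clusters=2, per_cluster=10)
for name, (p_cluster, p_within, overall) in table.items():
    print(name, p_cluster, p_within, overall)
```

The two effects cancel: under these assumptions every element has the same overall chance of selection regardless of the size of its cluster.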

Nonprobability Sampling
• Does not rely on random selection
• Weakens sample-to-population representativeness
• Used when other techniques will not result in an adequate or appropriate sample
• Used when researchers desire participants with special experiences or abilities
  – including qualitative research

Nonprobability Sampling
• Sampling is the use of a subset of the population to represent the whole population.
• Probability sampling, or random sampling, is a sampling technique in which the probability of getting any particular sample may be calculated.
• Nonprobability sampling does not meet this criterion and should be used with caution.

Nonprobability Sampling
• Nonprobability sampling techniques cannot be used to infer from the sample to the general population.
• Any generalizations obtained from a nonprobability study must be filtered through one's knowledge of the topic being studied.
• Performing nonprobability sampling is considerably less expensive than doing probability sampling.

Nonprobability Sampling Techniques
• Convenience sample
• Volunteer sample
• Snowball sample
• Purposive sample
• Quota sample

Convenience sample
• Members of the population are chosen based on their relative ease of access.
• Convenience sampling does not produce a representative sample of the population because people or items are only selected for a sample if they can be accessed easily and conveniently.

Convenience sample
• Examples of convenience samples include selecting:
  – the first ten cars to enter a car park,
  – the first ten people to walk through a turnstile at a sporting event, or
  – females in the first row of a concert.
• The obvious advantage of this type of sampling is its ease of use, but this is greatly offset by the sample being biased.

Volunteer sample
• A common method of volunteer sampling is phone-in sampling, used mainly by television and radio stations to gauge public opinion on current affairs issues such as preferred political party, capital punishment, etc.
• People are asked to telephone their vote on a particular issue within a certain time, with no limit to the number of people who can call in.

Volunteer sample
• The main advantages of phone-in sampling are that it is cheap in terms of time and money, and very easy to monitor and control.
• However, the chance that the sample will be biased is very high because only those with a telephone can vote, and only those watching television or listening to radio at the time would be aware of the survey. Each person can also make any number of calls to register a vote, and those not interested in calling will not be included.

Snowball sample
• The first respondent refers a friend. The friend also refers a friend, etc.
• With this approach, you initially contact a few potential respondents and then ask them whether they know of anybody with the same characteristics that you are looking for in your research.

Purposive sample
• Judgmental sampling or purposive sampling: the researcher chooses the sample based on who they think would be appropriate for the study.
• This is used primarily when there is a limited number of people who have expertise in the area being researched.

Quota sample
• Quota sampling is a method of sampling widely used in opinion polling and market research. Interviewers are each given a quota of subjects of a specified type to attempt to recruit. For example, an interviewer might be told to go out and select 20 adult men, 20 adult women, 10 teenage girls, and 10 teenage boys so that they could interview them about their television viewing.
• It suffers from a number of methodological flaws, the most basic of which is that the sample is not a random sample and therefore the sampling distributions of any statistics are unknown.
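To make the contrast with random sampling concrete, here is a small sketch of how an interviewer might fill quotas: people are accepted in the order they happen to arrive until each group's quota is full. The stream of passers-by, the group labels, and the reduced quota sizes are invented for illustration.

```python
def quota_sample(passersby, quotas):
    """Accept people in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    recruited = []
    for person in passersby:
        group = person["group"]
        if remaining.get(group, 0) > 0:
            recruited.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break                      # all quotas are full
    return recruited

# Hypothetical stream of people encountered by the interviewer.
stream = [
    {"id": "p1", "group": "adult men"},
    {"id": "p2", "group": "adult men"},
    {"id": "p3", "group": "adult men"},
    {"id": "p4", "group": "teenage boys"},
    {"id": "p5", "group": "adult women"},
    {"id": "p6", "group": "teenage girls"},
    {"id": "p7", "group": "adult women"},
    {"id": "p8", "group": "teenage boys"},
]
quotas = {"adult men": 2, "adult women": 2, "teenage girls": 1, "teenage boys": 1}
recruited = quota_sample(stream, quotas)
print([person["id"] for person in recruited])
```

Nothing in the procedure is random: who ends up in the sample depends entirely on who is encountered first, which is why selection probabilities, and therefore sampling distributions, cannot be worked out.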

Quota sample
• Advantages
  – less costly
  – administratively easy
  – quick reply
  – does not need any sampling frame
• Disadvantages
  – estimates of standard deviations are not possible
  – within quota, the sampling may be unrepresentative (e.g., all young, attractive females)
  – widely used social class grouping is subjective
  – checking of fieldwork is difficult.

Sample Size
• Number of people/units for whom you need to collect data
• Determined prior to selecting the sample
• Less than the number you ask to participate
• The larger the sample relative to the population, the less error or bias

Comparisons of Sample Size to Population Size
Population Size   Sample Size   Population Size   Sample Size
500               222           15,000            390
1,000             286           20,000            392
1,500             316           25,000            397
2,000             333           50,000            398
2,500             345           100,000           400
3,000             353           >100,000          400
Taro Yamane: confidence level 95%, margin of error ±5%
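Tables attributed to Yamane at the 95% level are commonly derived from the simplified formula n = N / (1 + N e^2), where N is the population size and e is the margin of error. The sketch below assumes that formula; the slides do not state it, and published tables can differ from it by a few units for large populations because of rounding.

```python
def yamane_sample_size(population_size, margin_of_error=0.05):
    """Simplified Yamane formula: n = N / (1 + N * e**2), rounded to a whole unit."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return round(n)

# Reproduce a few table rows under the assumed formula.
for population in (500, 1000, 2000, 15000):
    print(population, yamane_sample_size(population))
```

As N grows, the result levels off near 1 / e^2 = 400, which is why the required sample size barely changes once the population is in the tens of thousands.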

Comparisons of Sample Size to Population Size
Population Size   Sample Size   Population Size   Sample Size
1,500             563           15,000            846
2,000             621           20,000            861
2,500             622           25,000            869
3,000             692           50,000            884
3,500             716           100,000           892
4,000             735           >100,000          900
Taro Yamane: confidence level 99%, margin of error ±5%

Significance Levels
• The researcher sets the significance level, or p, for each statistical test
• The degree of error the researcher finds acceptable in a statistical test
• An estimate of what would happen if the study were actually repeated many times
• Generally, .05 is the accepted level
• More precisely, in traditional frequentist statistical hypothesis testing, the significance level of a test is the maximum probability of accidentally rejecting a true null hypothesis

Significance Levels
• .05 significance level = if the null hypothesis is true, about 5 out of 100 tests will yield a finding that appears to be valid but is due to chance
• Also known as the alpha level or p
• If p > .05, the finding is nonsignificant
• If p ≤ .05, the finding is significant or real

Significance Levels
• For example, one may choose a significance level of, say, 5%, and calculate a critical value of a statistic (such as the mean) so that the probability of it exceeding that value, given the truth of the null hypothesis, would be 5%.
• If the actual, calculated statistic value exceeds the critical value, then it is significant "at the 5% level."
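As a worked illustration, assume the test statistic is a z score that follows a standard normal distribution when the null hypothesis is true, and that the test is one-sided (upper tail). Both are assumptions made for this sketch, as are the function names and the example numbers; the slides do not specify a particular test.

```python
from statistics import NormalDist

def critical_value(alpha=0.05):
    """Value a standard normal statistic exceeds with probability alpha under the null."""
    return NormalDist().inv_cdf(1 - alpha)

def is_significant(z, alpha=0.05):
    """True when the calculated statistic exceeds the critical value."""
    return z > critical_value(alpha)

# Hypothetical study: sample mean 52, null mean 50, known sd 10, n = 100.
z = (52 - 50) / (10 / 100 ** 0.5)
print(round(critical_value(0.05), 3), z, is_significant(z))
```

With these numbers the statistic is 2.0, which is above the critical value of roughly 1.645, so the result would be called significant at the 5% level.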

Hypothesis Testing
• Hypothesis states the expected relationship or difference between two or more variables
• Alternative hypothesis presented in report
• Null is statistically tested
  – Act of decision making based on the significance level
  – Decision based on comparison between p set before the study and p produced by the statistical test

Hypothesis Testing
• Belief in the null hypothesis continues until there is sufficient evidence to the contrary
• If p for the statistical test exceeds the significance level, the null is retained (p > .05)
• If p for the statistical test is ≤ .05, then the alternative hypothesis is accepted

Error in Hypothesis Testing
• Use level of significance to reject the null
  – In reality, the null hypothesis is true: Type I error – null is rejected even though it is true
  – In reality, the null hypothesis is false: Decision 1 – null is rejected when it is false
• Use level of significance to retain the null
  – In reality, the null hypothesis is true: Decision 2 – null is retained when it is true
  – In reality, the null hypothesis is false: Type II error – null is retained even though it is false
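A small simulation can illustrate the Type I error: if many studies are run in a world where the null hypothesis really is true, roughly a share alpha of them will still reject it. The sketch assumes a two-sided z test on normally distributed data with known standard deviation 1; the function name, sample size, number of trials, and seed are arbitrary choices for the example.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Share of simulated studies that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # The null is true here: data come from a population with mean 0, sd 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) * n ** 0.5                # z = mean / (sd / sqrt(n)), sd = 1
        if abs(z) > cutoff:
            rejections += 1                        # a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(rate)
```

The printed rate should land close to .05. Estimating the Type II error rate would need the same kind of simulation with data drawn from a population where the null is false.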