Chapter 11 Basic Sampling Issues

What is sampling?
Sampling: a way of studying a subset of the population while still ensuring “generalizability” (vs. a census, a study of the entire population).
Key question: does the study have external validity?

Definitions of Important Terms
• Population or Universe: the entire set of elements to be studied
• Census: a study of all elements that make up the population
• Sample: a subset of the population

Unit of Analysis
Level of social life being studied: individuals or groups of individuals
• Example A: Child
• Example B: Neighborhood

Elements
Individual members of the population
• Children: Charlie, Lucy, Linus, Patty, Violet, etc.
• Neighborhoods: Midtown, Natomas, Land Park

Sampling Frame
List of all elements, or of other units containing the elements; used for drawing the sample
• Children: public school rolls, phone listings, marketing list of households with children
• Neighborhoods: list of neighborhoods, list of cities in Sacramento region

Steps in Developing a Sample Plan
Step 1: Define the Population of Interest
Step 2: Choose a Data Collection Method
Step 3: Choose a Sampling Frame
Step 4: Select a Sampling Method
Step 5: Sample Size
• Boundaries
• Operational
• Implementability

Sampling Method
Probability samples: samples in which every element of the population has a known, nonzero probability of selection.
• Generalizable
• Sampling error can be computed
• Expensive; more time and effort needed
Non-probability samples: samples that include elements from the population selected in a nonrandom manner.
• Hidden agendas
• Biased towards well-known members of the population
• Biased against unusual population members

Sampling and Nonsampling Errors
Parameter vs. Statistic (Estimate)
• Sample statistic: value (e.g., mean) computed from sample data
• Population parameter: true value of that quantity (e.g., mean) for the population (we don’t know this)
• Sampling error: difference between the sample statistic and the population parameter (we don’t know this)
• Confidence interval: interval in which we can be confident, at a stated level, that the true value lies; based on the sample statistic and its standard error
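The link between a statistic, its standard error, and a confidence interval can be sketched in Python. This is a minimal illustration, not part of the original slides: the simulated population, the sample size of 100, and the normal critical value 1.96 (for a roughly 95% interval) are all assumptions chosen for the example. The error is signed as in the baseball handout later in the chapter: sample mean minus true mean.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (mean, standard_error, (low, high)) for a sample.

    z=1.96 is the normal critical value for an approximate 95% interval
    (an assumption; small samples would normally use a t value instead).
    """
    mean = statistics.mean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, std_error, (mean - z * std_error, mean + z * std_error)

# Hypothetical population of 1,000 values; in practice its mean is unknown.
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(1000)]
parameter = statistics.mean(population)    # population parameter

sample = rng.sample(population, 100)       # simple random sample
statistic, std_error, (low, high) = mean_confidence_interval(sample)
sampling_error = statistic - parameter     # knowable only in a simulation

print(f"statistic={statistic:.2f} parameter={parameter:.2f}")
print(f"sampling error={sampling_error:+.2f} 95% CI=({low:.2f}, {high:.2f})")
```

In a real study only the statistic and the interval are available; the parameter and the sampling error are printed here only because the population is simulated.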

Advantages of Probability Samples
1. Information from a representative cross-section
2. Sampling error can be computed
3. Results are projectable to the total population
Disadvantages of Probability Samples
1. More expensive than nonprobability samples
2. Take more time to design and execute

Disadvantages of Nonprobability Samples
1. Sampling error cannot be computed
2. Representativeness of the sample is not known
3. Results cannot be projected to the population
Advantages of Nonprobability Samples
1. Cost less than probability samples
2. Can be conducted more quickly
3. Can produce reasonably representative samples if executed carefully

Classification of Sampling Methods
Probability samples: simple random, systematic, stratified, cluster
Nonprobability samples: convenience, judgment, quota, snowball

Sampling And Nonsampling Errors
Remember?
Sampling Error: the error that results when the sample is not perfectly representative of the population.
$\bar{X} = \mu \pm s \pm ns$
where $\bar{X}$ = sample mean, $\mu$ = true population mean, $s$ = sampling error, $ns$ = nonsampling error

Sampling And Nonsampling Errors
Sampling Error: the error that results when the sample is not perfectly representative of the population.
• Administrative error: problems in the execution of the sample (can be reduced)
• Random error: due to chance and cannot be avoided, but can be controlled by random sampling and estimated
Measurement or Nonsampling Error: includes everything other than sampling error that can cause inaccuracy and bias (data entry errors, biased questions, bad analysis, etc.).

Probability Sampling Methods
Simple Random Sampling: a probability sample in which every element of the population has a known and equal probability of being selected into the sample (EPSEM, the equal probability of selection method).
$\text{Probability of Selection} = \dfrac{\text{Sample Size}}{\text{Population Size}}$
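A simple random sample can be sketched with Python's standard library. The frame of 500 numbered households and the function names below are hypothetical, introduced only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame without replacement."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def selection_probability(sample_size, population_size):
    """Probability of selection = sample size / population size."""
    return sample_size / population_size

# Hypothetical sampling frame of 500 numbered households.
frame = [f"household_{i}" for i in range(1, 501)]
sample = simple_random_sample(frame, 25, seed=1)
print(len(sample), selection_probability(25, len(frame)))
```

Every household in this frame has the same chance of selection, 25 / 500, which is what makes the design EPSEM.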

Probability Sampling Methods
Systematic Sampling: probability sampling in which the entire population is numbered, and elements are drawn using a skip interval.
$\text{Skip Interval} = \dfrac{\text{Population Size}}{\text{Sample Size}}$
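The skip-interval idea can be sketched as follows. Two assumptions here are not stated on the slide: the skip interval is rounded down to an integer, and the first element is chosen at random within the first interval (a common practice). The function name and the frame of 1,000 numbered elements are hypothetical.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Select every k-th element after a random start in the first interval.

    k = population size // sample size (integer skip interval).
    """
    skip = len(frame) // sample_size
    if skip < 1:
        raise ValueError("sample size cannot exceed population size")
    start = random.Random(seed).randrange(skip)   # random start in [0, skip)
    return [frame[start + i * skip] for i in range(sample_size)]

# Hypothetical numbered population of 1,000 elements; skip interval = 10.
frame = list(range(1, 1001))
sample = systematic_sample(frame, 100, seed=7)
print(sample[:5])
```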

Probability Sampling Methods
Stratified Samples: probability samples that select elements from relevant population subsets to be more representative.
Cluster Samples: probability samples of geographic areas.

Stratified Samples
Probability samples that select elements from relevant population subsets to be more representative.
Three steps in implementing a properly stratified sample:
1. Identify salient demographic or classification factors correlated with the behavior of interest.
2. Determine what proportions of the population fall into the various subgroups under each stratum.
• proportional allocation
• disproportional or optimal allocation
3. Select separate simple random samples from each stratum.
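The three steps above can be sketched for the proportional-allocation case. This is an illustrative sketch only: the renter/owner population, the function name, and the rule of taking at least one element per stratum are assumptions, and disproportional or optimal allocation is not shown.

```python
import random
from collections import defaultdict

def proportional_stratified_sample(elements, stratum_of, total_n, seed=None):
    """Draw a simple random sample from each stratum, sized proportionally.

    elements   -- list of population elements
    stratum_of -- function mapping an element to its stratum label
    total_n    -- target overall sample size (rounding may shift it slightly)
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in elements:
        strata[stratum_of(element)].append(element)

    sample = {}
    for label, members in strata.items():
        share = len(members) / len(elements)      # stratum proportion
        n_h = max(1, round(total_n * share))      # at least one per stratum
        sample[label] = rng.sample(members, min(n_h, len(members)))
    return sample

# Hypothetical population: 600 renters and 400 owners.
population = [("renter", i) for i in range(600)] + [("owner", i) for i in range(400)]
sample = proportional_stratified_sample(population, lambda e: e[0], 50, seed=2)
print({label: len(members) for label, members in sample.items()})
```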

Cluster Samples
Sampling units are selected in groups.
1. The population of interest is divided into mutually exclusive and exhaustive subsets.
2. A random sample of the subsets is selected.
• One-stage cluster: all elements in the selected subsets are included
• Two-stage cluster: elements are selected in some probabilistic manner from the selected subsets
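One-stage and two-stage cluster sampling can be sketched in a single function. The district and household names are hypothetical, and simple random sampling is assumed for both stages; the slide only requires that the second stage be probabilistic.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Sample whole clusters, then optionally subsample within each.

    clusters      -- dict mapping cluster name -> list of elements
    n_clusters    -- number of clusters to draw at random
    n_per_cluster -- None for one-stage (take every element);
                     an integer for two-stage (random subsample per cluster)
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    result = {}
    for name in chosen:
        members = clusters[name]
        if n_per_cluster is None:
            result[name] = list(members)              # one-stage
        else:
            k = min(n_per_cluster, len(members))
            result[name] = rng.sample(members, k)     # two-stage
    return result

# Hypothetical city: 8 districts with 50 households each.
districts = {f"district_{d}": [f"d{d}_house_{h}" for h in range(50)] for d in range(8)}
one_stage = cluster_sample(districts, 3, seed=5)
two_stage = cluster_sample(districts, 3, n_per_cluster=10, seed=5)
print({k: len(v) for k, v in one_stage.items()})
print({k: len(v) for k, v in two_stage.items()})
```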

Stratified vs. Cluster: Example and Reason for Use

Stratified
Example:
1. Divide city into districts (strata).
2. Draw random sample of households from each district.
Reason for use: to ensure the desired number of households in each district.

Cluster
Example:
1. Divide city into districts (clusters).
2. Draw random sample of districts.
3. Draw random sample of households from each selected district.
Reason for use: to make it easier to do door-to-door surveys.

Handout 1 – Baseball Example
1. Ramon Aviles     0.277
2. Larry Bowa       0.267
3. Pete Rose        0.282
4. Mike Schmidt     0.286
5. Manny Trillo     0.292
6. John Yukovich    0.161
Mean = 1.565 / 6 = 0.261

SRS of sample size = 2
Sample              Mean    Error
Aviles, Bowa        0.272   +0.011
Aviles, Rose        0.280   +0.019
Aviles, Schmidt     0.282   +0.021
Aviles, Trillo      0.285   +0.024
Aviles, Yukovich    0.219   -0.042
Bowa, Rose          0.275   +0.014
Bowa, Schmidt       0.277   +0.016

SRS of sample size = 2
Sample              Mean    Error
Bowa, Trillo        0.280   +0.019
Bowa, Yukovich      0.214   -0.047
Rose, Schmidt       0.284   +0.023
Rose, Trillo        0.287   +0.026
Rose, Yukovich      0.222   -0.039
Schmidt, Trillo     0.289   +0.028
Schmidt, Yukovich   0.224   -0.037
Trillo, Yukovich    0.227   -0.034
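The 15 possible samples of size 2 can be enumerated directly. In this sketch, Rose's average is taken as 0.282, the value that makes the handout total come to 1.565; the function name is hypothetical. Printed values may differ from the slides in the last digit because the slides round the sample mean before subtracting the rounded true mean.

```python
from itertools import combinations
from statistics import mean

# Batting averages from Handout 1.
averages = {
    "Aviles": 0.277, "Bowa": 0.267, "Rose": 0.282,
    "Schmidt": 0.286, "Trillo": 0.292, "Yukovich": 0.161,
}
true_mean = mean(averages.values())      # 1.565 / 6

def all_srs_of_two(values):
    """Return [(pair, sample_mean, error)] for every possible sample of 2."""
    rows = []
    for pair in combinations(values, 2):
        sample_mean = mean(values[name] for name in pair)
        rows.append((pair, sample_mean, sample_mean - true_mean))
    return rows

rows = all_srs_of_two(averages)
for pair, sample_mean, error in rows:
    print(f"{', '.join(pair):20s} {sample_mean:.3f} {error:+.3f}")

errors = [error for _, _, error in rows]
print(f"samples={len(rows)} min error={min(errors):+.3f} max error={max(errors):+.3f}")
```

Averaging the 15 sample means recovers the population mean, so the estimator is unbiased even though any single sample can be off, especially when Yukovich is drawn.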

Stratification
Let’s divide the population into two strata: one with Yukovich and another with all others.
Stratum 1: Yukovich
Stratum 2: Aviles, Bowa, Rose, Trillo, Schmidt

Stratified Sampling
1. Yukovich, Aviles
2. Yukovich, Bowa
3. Yukovich, Rose
4. Yukovich, Schmidt
5. Yukovich, Trillo
Weight the sample. Why? The one player drawn from Stratum 2 stands in for all 5 players in that stratum, so for anyone from Stratum 2, multiply their value by 5.

Example – Mean computation (Yukovich, Schmidt)
Yukovich = 0.161
Schmidt = 0.286
Schmidt is from Stratum 2, so Schmidt’s weighted value is 0.286 * 5 = 1.43
Yukovich + Schmidt = 0.161 + 1.43 = 1.591
Mean (Yukovich + Schmidt) = 1.591 / 6 = 0.265
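The same computation can be written as a general weighted mean. The notation below is introduced here for illustration and does not appear on the slides: $N_h$ is the number of population elements in stratum $h$, $\bar{x}_h$ is the sample mean within that stratum, and $N = \sum_h N_h$ is the population size.

```latex
\bar{x}_{st} = \frac{1}{N}\sum_{h} N_h\,\bar{x}_h
\qquad\Rightarrow\qquad
\bar{x}_{st} = \frac{(1)(0.161) + (5)(0.286)}{6} = \frac{1.591}{6} \approx 0.265
```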

Stratified Sampling
Sample                   Mean    Error
1. Yukovich, Aviles      0.258   -0.003
2. Yukovich, Bowa        0.249   -0.012
3. Yukovich, Rose        0.262   +0.001
4. Yukovich, Schmidt     0.265   +0.004
5. Yukovich, Trillo      0.270   +0.009
What’s happening to errors of estimate?
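One way to explore that question is to compare the spread of errors under the two designs. The sketch below reuses the handout averages (with Rose taken as 0.282, the value that makes the handout total 1.565); the helper names are hypothetical, and "error range" (largest error minus smallest error) is just one convenient summary, chosen here as an assumption.

```python
from itertools import combinations
from statistics import mean

averages = {
    "Aviles": 0.277, "Bowa": 0.267, "Rose": 0.282,
    "Schmidt": 0.286, "Trillo": 0.292, "Yukovich": 0.161,
}
true_mean = mean(averages.values())

def stratified_estimate(stratum1_value, stratum2_value, n1=1, n2=5):
    """Weight each sampled value by its stratum size, then divide by N."""
    return (n1 * stratum1_value + n2 * stratum2_value) / (n1 + n2)

# Stratum 1 is Yukovich alone; Stratum 2 is the other five players.
stratified_errors = {
    name: stratified_estimate(averages["Yukovich"], value) - true_mean
    for name, value in averages.items() if name != "Yukovich"
}
# Every unstratified simple random sample of size 2, for comparison.
srs_errors = [
    mean((averages[a], averages[b])) - true_mean
    for a, b in combinations(averages, 2)
]

def error_range(errors):
    """Largest error minus smallest error."""
    return max(errors) - min(errors)

print(f"SRS error range:        {error_range(srs_errors):.3f}")
print(f"Stratified error range: {error_range(stratified_errors.values()):.3f}")
```

With these numbers the stratified estimates cluster much more tightly around the true mean, because the outlying value (Yukovich) is always included and properly weighted instead of appearing by chance.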

Nonprobability Sampling Methods
Convenience Samples: nonprobability samples used primarily because they are easy to collect.
• Typical use: theory testing
Judgment Samples: nonprobability samples in which the selection criteria are based on personal judgment that the element is representative of the population under study.

Nonprobability Sampling Methods
Snowball Samples: nonprobability samples in which selection of additional respondents is based on referrals from the initial respondents.
Quota Samples: nonprobability samples in which population subgroups are classified on the basis of researcher judgment.
• Different from stratified samples: elements within each subgroup are not selected randomly