366 5 Sampling Defined The idea Making inference

  • Slides: 39
366 5

Sampling
• Defined / The idea
  – Making inference about a larger population
• What is the population
  – Some particular value in the population
    • estimating a parameter

Sampling

Sampling
• Population must be defined
  – If interested in opinions of...
    • All adults
    • Registered voters
    • Likely voters
    • Actual voters
• These are all distinct populations

Sampling
• Population must be defined
  – If interested in opinions of...
    • People in Whatcom County
    • Voters in Whatcom County
    • People in Bellingham
    • Voters in Bellingham
    • Likely voters in Bellingham
• These are all distinct populations

Sampling
• Population must be defined
  – If interested in opinions of...
    • Students at WWU
    • Seniors at WWU (xxx # of credits & up)
    • Students in College of Arts & Sciences, etc.
• These are all distinct populations; who should be included, excluded

Sampling
• Sampling unit
  – A single member of the population
    • a case
  – If population = conflicts (wars)
    • sampling unit = nations of a certain size

Sampling
• Sampling Frame
• Once clear about what population & units are, how do we find them?
  – Frame = complete list of population
    • Registered voters; Students at WWU
  – In reality this may not exist
    • e.g., all people living in the US

Sampling
• Sampling Frame
• US Census
  – How to get ‘the list’?
  – $3 billion; 500,000 workers...

Sampling
• Sampling Frame
• Registered voters; Students at WWU
  – Piece of cake?
  – Accuracy of sample depends on comprehensiveness of frame

Sampling
• Sampling Frame
• Ahead of time, evaluate for problems
  – Missing elements
    • New residents, newly registered voters, ?
  – Clusters
    • Census tracts, city blocks, Zip code, Area code, prefix
  – Take random draw of clusters, then random draw of households in cluster

Sampling
• Sampling Frame
• Ahead of time, evaluate for problems
  – Blank elements
    • Phone directories (address w/o #)
    • Phone #s (unassigned prefixes; fax machine; pager)
    • List of all residents when population = voters

Classic Sample Failure
• 1936 Literary Digest Survey
  – Survey of 2.4 million Americans
  – Predicted Alf Landon 57%, FDR 43%
  – Actual result FDR 62%, Landon 38%
  – Frame = 10 million people
    • subscribers to Digest; phone directories; club memberships

Classic Sample Failure
• 1936 Literary Digest Survey
  – What went wrong?

Classic Sample Failure
• 2000 & 2004 & 2012 (WI) US Exit polls
  – Surveys of tens of thousands
  – 2000 initially predicted Gore win FL
    • Actually, Bush won
  – 2004 initially predicted Kerry win OH
    • Actually, Bush won
• Frame:
  – Key precincts, people voting at polling places

2004 VNS Exit Polls, Ohio

“This can’t happen in America. Maybe in Ohio...”

• http://www.youtube.com/watch?v=ArC7XarwnWI
• 2008
• http://www.youtube.com/watch?v=IoWJkrlptNs

Classic Sample Failure
• 2000 & 2004 US Exit polls
  – What went (goes) wrong?
  – also response bias that favors Democrats

Sample Designs
• Probability vs. Non-probability sampling
  – Probability sample
    • We know the probability that each unit in the population has of being in the sample
  – Non-probability sample
    • We don’t know if every unit has a fixed chance of being in sample

Sample Design
• Probability sample
  – If 22% of population are white males over 21 years of age...
  – then each randomly drawn unit has a .22 probability of being a white male over 21

Sample Design
• Probability sample
  – If study repeated w/ different samples, high likelihood that results similar
  – We can estimate likelihood that things observed in the sample are representative of the population
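
The claim that repeated probability samples give similar results can be checked with a small simulation. This is only a sketch: the population of 10,000 with a 22% trait share, the sample size of 500, and the function name `repeated_sample_estimates` are all assumptions made for illustration.

```python
import random
import statistics

def repeated_sample_estimates(population, sample_size, repeats, seed=0):
    """Draw `repeats` simple random samples; return the share of 1s in each."""
    rng = random.Random(seed)
    return [
        sum(rng.sample(population, sample_size)) / sample_size
        for _ in range(repeats)
    ]

# Hypothetical population: 10,000 people, 22% of whom have some trait (coded 1).
population = [1] * 2200 + [0] * 7800
estimates = repeated_sample_estimates(population, sample_size=500, repeats=200)

# The estimates cluster around the true share of 0.22.
print(round(statistics.mean(estimates), 3))
# Their spread is what lets us attach a likelihood to any single sample's result.
print(round(statistics.stdev(estimates), 3))
```

The spread of the 200 estimates is small relative to the true share, which is the sense in which a single probability sample can be trusted to sit near the population value.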

Sample Design
• Real world probability sample problems
  – Population = likely voters
  – Good sample frame?
    • Voters yes, likely voters no
  – Proper randomization
    • You try it
  – Missing elements
    • Land line vs. cell phones

Probability Samples
• Simple random sampling
• Systematic samples
• Stratified samples
• Cluster samples

Probability Samples
• Simple random sampling
  – List each unit (person) in population
  – Give each a number (List from 1 to n)
  – Use random # generator
  – If 1207 comes up, select #1207 from list
  – Repeat
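
The steps above can be sketched in a few lines of Python. The frame of 5,000 made-up names and the function `simple_random_sample` are hypothetical, and the sketch assumes sampling without replacement (a number that comes up twice is not selected twice), which the slide does not specify.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n distinct units from the frame, each with equal probability."""
    rng = random.Random(seed)
    # Number the units 1..N as on the slide, then draw n distinct numbers.
    numbers = rng.sample(range(1, len(frame) + 1), n)
    # Unit number i sits at list index i - 1.
    return [frame[i - 1] for i in numbers]

# Hypothetical sampling frame of 5,000 listed people.
frame = [f"person_{i}" for i in range(1, 5001)]
sample = simple_random_sample(frame, n=100, seed=42)
print(sample[:5])
```

Passing a `seed` only makes the draw reproducible for checking; a real draw would normally leave it unset.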

Probability Samples
• Systematic sample
  – Have list of population, 1 to n
  – Find random #, start there on list
  – Pick every kth unit (person) on list
  – Hope there is no structure to list
    • Starting point random, increment random
  – Easier
    • Kind of how exit polls work at polling place
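
A minimal sketch of a systematic draw follows. It assumes the common textbook variant in which the interval k is fixed by the frame size and the desired sample size, and only the starting point is random; the function name `systematic_sample` and the numeric frame are made up for illustration.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick every kth unit after a random start, where k = len(frame) // n."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the frame size")
    rng = random.Random(seed)
    k = len(frame) // n        # sampling interval (assumed fixed here)
    start = rng.randrange(k)   # random starting point within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame: units numbered 0..999, sample of 50, so k = 20.
frame = list(range(1000))
sample = systematic_sample(frame, 50, seed=1)
print(sample[:5])
```

If the list has a periodic structure that lines up with k (for example, every 20th entry is a household head), the sample inherits that pattern, which is why the slide says to hope there is no structure to the list.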

Probability Sample
• Stratified sample
  – Use available information from the population
  – Divide so elements w/in groups (strata) are more alike than population
  – A series of homogeneous groups
    • Race/ethnicity; income
  – Combine samples into one
    • Cheaper
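
One way to picture the stratified design is the sketch below. It assumes proportional allocation (the same sampling fraction in every stratum), which is one option among several and is not stated on the slide; the strata labels "A", "B", "C" and the function `stratified_sample` are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the same fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)   # divide frame into strata
    sample = []
    for name in sorted(strata):
        members = strata[name]
        n = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, n))   # combine samples into one
    return sample

# Hypothetical frame of (id, stratum) pairs: 600 in A, 300 in B, 100 in C.
frame = [(i, "A") for i in range(600)]
frame += [(i, "B") for i in range(600, 900)]
frame += [(i, "C") for i in range(900, 1000)]

sample = stratified_sample(frame, stratum_of=lambda unit: unit[1], fraction=0.1, seed=7)
counts = Counter(stratum for _, stratum in sample)
print(counts)
```

Because each stratum is sampled separately, small groups cannot be missed by chance the way they can in a simple random sample of the same size.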

Probability Samples
• Cluster sample
  – Identify clusters (groups)
  – Select large groups at random
    • Cities, congressional districts, states, neighborhoods
  – Randomly sample within cluster
  – Cheaper, no list of national US voters; consider face to face interviews
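
The two stages of a cluster design can be sketched as follows. The cluster and household names, the counts, and the function `two_stage_cluster_sample` are all hypothetical, and the sketch assumes clusters are chosen with equal probability, which real designs often replace with selection proportional to size.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: draw clusters at random. Stage 2: draw units within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        units = clusters[name]
        take = min(n_per_cluster, len(units))   # small clusters give what they have
        sample.extend((name, unit) for unit in rng.sample(units, take))
    return sample

# Hypothetical frame: 20 cities, each with 50 listed households.
clusters = {
    f"city_{c}": [f"household_{c}_{i}" for i in range(50)]
    for c in range(20)
}
sample = two_stage_cluster_sample(clusters, n_clusters=4, n_per_cluster=10, seed=3)
print(sample[:3])
```

Only the four chosen cities need a household list, which is where the cost saving for face to face interviewing comes from.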

Probability Samples
• Simple random sampling
• Systematic samples
• Stratified samples
• Cluster samples
• Other types, some of these used together

Non-probability Samples
• Convenience sample
  – All students in this class
    • Population = WWU students
  – First 200 people walking down Railroad Ave.
    • Population = Whatcom County voters
  – No way to know representativeness of sample

Non-probability samples
• Purposive sample
  – Units selected subjectively
  – Chance of being selected depends on researcher’s judgment
  – “Critical elections”
    • Population = all US Presidential elections
  – “Major wars”
    • Population = all wars

Non-probability sample
• Quota sample
  – Purposively select sample as representative as possible
  – Use known characteristics of population
  – Target quota based on known characteristics

Non-probability sample
• Quota sample
  – WWU (Fake example)
    • 57% female, 43% male
    • 45% A&S; 25% CST; 10% CBE; 10% Huxley; 10% other
    • Age
    • Ethnicity

Non-probability sample
• Quota sample
  – Whatcom Co. (Fake example)
    • Gender
    • Age
    • Partisanship
    • City resident vs. County resident
• Monitor demographics of respondents as you go
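
Monitoring demographics as respondents arrive can be sketched as a running tally against target quotas. The 57/43 gender split is a made-up target (borrowed from the fake WWU example), and the function `fill_quotas` and the respondent stream are hypothetical.

```python
def fill_quotas(respondents, targets):
    """Accept respondents in arrival order until each group's quota is met."""
    counts = {group: 0 for group in targets}
    accepted = []
    for person_id, group in respondents:
        # Keep the respondent only while that group's quota is still open.
        if group in counts and counts[group] < targets[group]:
            counts[group] += 1
            accepted.append(person_id)
    return accepted, counts

# Hypothetical targets for a sample of 100.
targets = {"female": 57, "male": 43}

# Hypothetical stream of 300 willing respondents, two thirds of them female.
respondents = [(i, "male" if i % 3 == 0 else "female") for i in range(300)]

accepted, counts = fill_quotas(respondents, targets)
print(counts)
```

The final counts match the targets, but who gets in still depends on who happened to show up first, so nothing here makes the sample a probability sample.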

Non-probability sample
• Quota sample
  – Poor person’s random sampling
  – Can fail to predict
  – 1948: 3 surveys predicted Dewey to win
  – None targeted partisanship

Internet Samples
• Opt-in
• Provide people computers
• Huge samples asked to do interviews
• “Weight” data after responses to represent population
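
One common way to weight responses after the fact is post-stratification: each respondent is weighted by the ratio of their group's population share to its sample share. The sketch below assumes that approach; the age groups, shares, and function names are invented for illustration and are not from the slides.

```python
def poststratification_weights(sample_groups, population_shares):
    """Weight = population share of the group / sample share of the group."""
    n = len(sample_groups)
    sample_shares = {g: sample_groups.count(g) / n for g in set(sample_groups)}
    return [population_shares[g] / sample_shares[g] for g in sample_groups]

def weighted_mean(values, weights):
    """Mean of values where each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical opt-in sample: 20 young and 80 old respondents,
# drawn from a population that is assumed to be half young, half old.
sample_groups = ["young"] * 20 + ["old"] * 80
population_shares = {"young": 0.5, "old": 0.5}

# Hypothetical answers: every young respondent says yes (1), every old one no (0).
answers = [1] * 20 + [0] * 80

weights = poststratification_weights(sample_groups, population_shares)
print(sum(answers) / len(answers))       # unweighted share saying yes
print(weighted_mean(answers, weights))   # weighted share saying yes
```

Weighting corrects only for the characteristics used to build the weights; if opt-in respondents differ from the population in other ways, the weighted estimate is still off.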

Sample size
• If sample random (ish), precision of estimates depends on size
• Larger = more precise estimate, all else equal
• Very large doesn’t add much precision
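
The relation between sample size and precision can be made concrete with the standard margin of error for a proportion. This sketch assumes simple random sampling, a 95% confidence level (z = 1.96), and the worst case p = 0.5; none of these are stated on the slide, and the function name `margin_of_error` is made up.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

# Margin of error, in percentage points, for a range of sample sizes.
for n in (100, 400, 1000, 2500, 10000):
    print(n, round(100 * margin_of_error(n), 1))
```

Because the margin shrinks with the square root of n, quadrupling the sample only halves the margin, which is why very large samples add little precision.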

Sample size
• Diminishing returns on size
• Depends on scale of population, subgroups
  – Whatcom Co.
  – State of WA
  – USA
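
One way to explore how population scale and subgroups enter is the margin of error with a finite population correction. This is a sketch under assumptions: simple random sampling, 95% confidence, p = 0.5, and round population figures that are illustrative placeholders rather than actual counts for any county, state, or country.

```python
import math

def margin_of_error_fpc(n, population_size, p=0.5, z=1.96):
    """Margin of error for a proportion, with finite population correction."""
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return z * math.sqrt(p * (1 - p) / n) * fpc

# Hypothetical round population sizes, for illustration only.
populations = {"county": 200_000, "state": 7_000_000, "nation": 330_000_000}
margins = {
    name: margin_of_error_fpc(1000, size)
    for name, size in populations.items()
}
for name, margin in margins.items():
    print(name, round(100 * margin, 2))

# A subgroup of 100 respondents (out of a hypothetical 20,000 people) is far less precise.
subgroup_margin = margin_of_error_fpc(100, 20_000)
print("subgroup", round(100 * subgroup_margin, 2))
```

Under these assumptions a sample of 1,000 is about equally precise at all three scales, while estimates for small subgroups within the sample are much noisier, so the subgroups of interest tend to drive the sample size needed.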
