Utah BRFSS Methodology
Michael Friedrichs, Lead Epidemiologist
Bureau of Health Promotion
mfriedrichs@utah.gov
Survey Sampling
• Survey sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population.
Basic Survey Designs
• Longitudinal Surveys
  › Cohort: study of the same population each time data are collected
• Cross-Sectional Surveys
  › Data are collected at one point in time from a sample selected to represent a larger population
Survey Design Concepts
• Scientific sample
  › Each respondent has a known probability of selection into the sample
• Efficient sample
  › Results are more precise than those from other possible sample designs of the same cost
  › Efficiency means lower variance
Survey Design Concepts
• Sampling Unit
  › Unit that information is being obtained from
• Strata
  › Sample independent sub-populations (counties, school districts, phone types)
• Cluster
  › Sample in groups (classrooms, households)
• Sample Frame
  › Source material or device from which a sample is drawn
Survey Design Concepts
• Bias
  › The survey does not accurately represent the population
  › Coverage bias
    • Population members do not appear in sample frame (don’t have telephones, for example)
  › Non-response bias
    • Non-responders may differ from responders
  › Selection bias
    • Sampling units have different chances of being selected
    • Accounted for by weighting the survey
Survey Design Concepts
• Precision
  › Statistics from a sample have variability associated with them.
  › Sometimes referred to as the “margin of error.”
  › Measure of how likely the sample is to be near the population characteristic
  › Depends on sample methodology as well as sample size
• Design effect
  › The “price” of a more complex survey design compared with a simple random sample
    • Ratio of Variance(complex design) / Variance(SRS)
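The design effect ratio on this slide can be written out as below. This is a sketch in standard notation, not taken from the slides: the symbols deff, p-hat, n, and n_eff are introduced here, and the simple random sample variance is assumed to be the usual binomial form without a finite population correction.

```latex
\[
\mathrm{deff}
  = \frac{\operatorname{Var}_{\text{complex}}(\hat{p})}
         {\operatorname{Var}_{\text{SRS}}(\hat{p})},
\qquad
\operatorname{Var}_{\text{SRS}}(\hat{p}) = \frac{\hat{p}\,(1-\hat{p})}{n},
\qquad
n_{\text{eff}} = \frac{n}{\mathrm{deff}}
\]
```

Under these assumptions, a design effect of 2 means the complex sample is about as precise as a simple random sample of half its size.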
Survey Design Concepts [figure: Bias vs. Precision]
Survey Modes
• Personal (face to face)
  › NHANES
• Mail
  › US Census
• Web (panel surveys)
  › Unknown probability of selection, biased
• Telephone
  › BRFSS
• Mixed Mode
  › ACS (census)
  › The future!
Mail Surveys
• ADVANTAGES:
  › Generally lowest cost
  › Can be administered by smaller team of people (no field staff)
  › Respondents can look up information or consult with others
• DISADVANTAGES:
  › Most difficult to obtain cooperation
  › More likely to need an incentive for respondents
  › Slower data collection period than telephone
Web Surveys
• ADVANTAGES:
  › Lower cost (no paper, postage, mailing, data entry costs)
  › Time required for implementation reduced
  › Complex skip patterns can be programmed
  › Sample size can be greater
• DISADVANTAGES:
  › Lower SES respondents may not have computer access
  › Differences in the capabilities of people's computers and software limit the extent of graphics that can be used
  › Representative samples difficult (or impossible?): cannot generate random samples of general population or adjust (weight) to eliminate bias (not a scientific sample)
Telephone Interviewing
• ADVANTAGES:
  › Low cost, high response rate
  › RDD samples of general population
  › Interviewer administration: sophisticated CATI systems
  › Better control and supervision of interviewers
• DISADVANTAGES:
  › Biased against households without telephones
  › Nonresponse
  › Questionnaire constraints
  › Complexity of cell phones, portable numbers
Behavioral Risk Factor Surveillance System (BRFSS)
• Initiated in 1984 in 15 states
• Now done in all 50 states and 3 territories
• The BRFSS is a Random Digit Dialed (RDD) survey
• Multistage
  › Household is contacted
  › Randomly selected adult is interviewed
• Disproportionate Stratified Sampling
BRFSS in the U.S., 1984 [map; labeled states: Montana, Idaho, Minnesota, Wisconsin, Rhode Island, Utah, Illinois, Ohio, Indiana, California, Arizona, Tennessee, West Virginia, North Carolina, South Carolina]
BRFSS in the U.S., 1990 [map; labels cover all 50 states, the District of Columbia, Guam, Puerto Rico, and the Virgin Islands]
BRFSS in the U.S., 2000 [map; labels cover all 50 states, the District of Columbia, Guam, Puerto Rico, and the Virgin Islands]
BRFSS in Utah
• Used to estimate many health measures
  › Health care coverage
  › Physical activity and obesity
  › Preventive services (cancer screenings, immunizations, seatbelt use)
  › Addictive and abusive substances (tobacco, excessive alcohol consumption)
  › Disease prevalence (diabetes, heart disease, arthritis, asthma, hypertension, etc.)
  › Emerging topics…
BRFSS CASRO/AAPOR Response Rates, Utah & U.S. Median, 1999-2013 [line chart; y-axis: CASRO response rate, 0 to 100; x-axis: year; series: Utah, U.S.]
BRFSS Methodology
• Disproportionate Stratified Sampling Design
  › Stratified by phone type
    • High density (listed 1+ block telephone numbers)
    • Medium density (not listed 1+ block telephone numbers)
    • Sample high-density (listed) vs. medium-density (nonlisted) at a ratio of 1.5:1
    • Cell phone
  › Stratified by region
    • 12 local health departments
  › 25 total strata
Fixed Core
• Demographics
  › Height & Weight
  › Zip Code
  › Race/Ethnicity
  › Income and Education
• General Health
  › Health Status
  › Access to Care
  › Disability
• Chronic Conditions
  › Diabetes
  › Asthma
  › Cardiovascular Disease
• Risk Behaviors
  › Tobacco Use
  › Alcohol Consumption
Rotating Cores
• Odd Years
  › Fruits & Vegetables
  › Hypertension Awareness
  › Cholesterol Awareness
  › Arthritis Burden
  › Physical Activity
• Even Years
  › Women’s Health
  › Prostate Cancer Screening
  › Colorectal Cancer Screening
  › Oral Health
  › Injury
Optional Modules
• Diabetes
• Visual Impairment
• Sleep
• Asthma History
• Immunizations
• Cancer Survivor
• Preparedness
• Reactions to Race
• Mental Illness and Stigma
• Cognitive Impairment
• Social Context
• Adverse Childhood Experiences
• Random Child Selection
• Childhood Asthma Prevalence
• Child Immunization
2013 Utah BRFSS [diagram; labels: Leg 1, Leg 2, Leg 3; 6,250 and 3,125; CORE, Module 1, Module 2, State-added 1, State-added 2; ~12,500 completes]
[Table: question counts by questionnaire version, with Men and Women columns]
Row groups: Core; Proposed Optional Modules; Proposed State-Added Questions; Total
Column groups: Questionnaire 11/21: 6,250 respondents; Questionnaire 12/22: 3,125 respondents; Questionnaire 13/23: 3,125 respondents
Rows:
• Core
• Module 1 - Pre-diabetes
• Module 2 - Diabetes (+ SA, gestational diabetes)
• Module 4: Health Care Access
• Module 5: Sugar Drinks
• Module 6: Salt-Related Behavior
• Module 9: Arthritis Management
• Module 18: Industry and Occupation
• Module 20 - Random Child Selection
• Module 21 - Childhood Asthma Prevalence
• Childhood Diabetes Prevalence
• Child Obesity-related
• Family Dinners
• Insurance and Access
• Sexual Orientation
• Pre-hypertension Prevalence
• PHQ-9
• Tobacco
• Mammogram
• Colon Cancer Screening
• Hypertension Control
• Cholesterol Control
• Parkinson's Prevalence
• Binge Drinking
• CO Detector
• Radon
• Adverse Childhood Experiences
• Intimate Partner Violence
• Reproductive Health
• Follow-up question
Question counts as extracted (row and column alignment lost): Men 91 2 3 7 2 3 4 2 6 2 1 5 1 Men 91 Women 91 3 7 2 6 7 2 4 2 6 2 1 5 1 30 1 13 13 2 2 1 12 1 184 1 12 1 190 Women 91 2 6 7 2 3 4 2 6 2 1 5 1 Women 91 3 7 2 3 4 2 6 2 1 5 1 6 7 2 3 4 2 6 2 1 5 1 1 1 9 6 9 2 1 2 1 4 11 1 6 9 2 2 1 4 11 1 1 177 1 181 1 174 1 178
2013 BRFSS Final Counts by Questionnaire Version

Qstver  Count   Questionnaire and mode
11       3,734  Questionnaire 1 landline
21       2,114  Questionnaire 1 cellphone
12       2,261  Questionnaire 2 landline
22       1,107  Questionnaire 2 cellphone
13       2,244  Questionnaire 3 landline
23       1,168  Questionnaire 3 cellphone
20         141  Records imported from other states
Total   12,769
BRFSS Dual Frame Surveys
• Cell phone and cell-phone-only populations are more likely to be younger, less affluent, and disproportionately minority.
• These populations are under-represented by landline-only surveys.
• Dual frame telephone surveys can increase coverage.
Distribution of Household Telephone Status, Utah 2014 [bar chart; categories: Wireless-only, Wireless-mostly, Dual-use, Landline-mostly, Landline-only, No Telephone Service; data labels: 52.2%, 18.5%, 15.6%, 6.6%, 4.6%, 2.4%]
Source: National Health Interview Survey, 2014
Wireless Only Households by Year and Age, U.S. [bar chart; series: 2009, 2013; age groups: 18-24, 25-29, 30-34, 35-44, 45-64, 65 and over; y-axis: 0% to 80%]
Source: National Health Interview Survey, 2013
BRFSS Cell Phone Sampling Frame Issues
• Cell phones are selected by state
  › Cell phone numbers can be ported with the respondent if they move to another state.
• Currently, cell phones are not a direct link to lower levels of geography (counties, cities, zip codes)
  › Switch centers serve as the unit of geography for cell phone 1000-series blocks.
  › Fewer than half of U.S. counties have dedicated cellular switch centers
Weighting
• Weighting is a technique used to assure representation of certain groups in the sample.
• Data for underrepresented groups are adjusted to compensate for their small numbers.
• Weighting accounts for unequal probability of selection within sampled households.
BRFSS Weighting Methodology
• The BRFSS weighting methodology can be divided into two sections:
  › Design weights
    • Probability respondent is selected in the sample (phones, adults, stratum)
  › Iterative Proportional Fitting (Raking)
    • Makes sample distribution reflect state’s population
Design Weight
• Reflects the probability that a respondent is selected, taking into account the sampling stratification weight, the number of phone numbers in a household, and the number of adults
  Design Weight = STRWT * (1 / NUM_PHONE) * NUM_ADULT
• STRWT accounts for differences in the basic probability of selection among strata (subset of area code/prefix combinations).
• STRWT is the inverse of the sampling fraction of each stratum.
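A minimal sketch of the design weight formula on this slide. The function names and the example stratum (50,000 numbers in the frame, 1,000 sampled) are hypothetical and chosen only for illustration; they are not BRFSS values.

```python
def stratum_weight(frame_count: int, sampled_count: int) -> float:
    """STRWT: inverse of the sampling fraction within a stratum."""
    if sampled_count <= 0:
        raise ValueError("sampled_count must be positive")
    return frame_count / sampled_count


def design_weight(strwt: float, num_phone: int, num_adult: int) -> float:
    """Design Weight = STRWT * (1 / NUM_PHONE) * NUM_ADULT."""
    if num_phone < 1 or num_adult < 1:
        raise ValueError("num_phone and num_adult must be at least 1")
    return strwt * (1.0 / num_phone) * num_adult


# Hypothetical stratum: 50,000 numbers in the frame, 1,000 sampled.
strwt = stratum_weight(50_000, 1_000)

# Hypothetical household: 2 phone numbers, 3 adults.
example_weight = design_weight(strwt, num_phone=2, num_adult=3)
print(strwt, example_weight)
```

More phone numbers lower the weight because the household had more chances of being dialed; more adults raise it because each adult had a smaller chance of being the one interviewed.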
Design Weight Truncation
Land Line and Cell Phone with Dual-Use Correction
Table 1a. 1b. BRFSS, _WT2RAKE by Region (State FIPS Code 49)

Region  Counts  Mean     Sum         Maximum   3rd Quartile  1st Quartile  Minimum  Range (Q3 - Q1)  Lower Fence  Upper Fence
1          799  46.4626   37,123.59  103.0933  86.3678       27.4431       ?        58.9248          -60.9440     174.7550
2        1,052  29.7424   31,288.98   86.3678  34.0809       11.4158       ?        22.6651          -22.5819      68.0787
3        1,169  56.2970   65,811.18  148.3403  86.3678       35.9220       ?        50.4459          -39.7469     162.0367
4        4,184  49.0576  205,257.13  147.4774  86.3678       30.3994       ?        55.9684          -53.5532     170.3205
5          757  54.1791   41,013.59  143.8507  86.3678       34.0809       5.6366   52.2869          -44.3494     164.7982
6          890  15.3039   13,620.47   86.3678   9.3105        3.5634       0.8906    5.7471           -5.0572      17.9312
7        1,045  23.5675   24,627.99   98.4682  19.5737        7.8999       2.0523   11.6738           -9.6108      37.0844
8        1,793  59.2087  106,161.24  117.9819  86.3678       39.3873       1.7817   46.9805          -31.0835     156.8387
9        1,080  49.8759   53,865.94  121.1127  86.3678       32.3136       8.0784   54.0542          -48.7677     167.4492
Total   12,769           578,770.10

? Only three Minimum values survive for Regions 1-4 (8.3070, 4.6074, 1.7817); their assignment to regions is unclear.

• To keep from raking excessively large or small design weights, design weight truncation was applied. This helps the raking macro converge more quickly.
• The lower fence was calculated as LF = Q1 - (1.5 * IQR) and the upper fence as UF = Q3 + (1.5 * IQR).
• Design weights below the lower fence were replaced with the lower fence, and design weights greater than the upper fence were replaced by the upper fence.
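The fence rule above can be sketched in a few lines. This is an illustration, not the BRFSS raking macro: the function names are made up here, the quartiles are the Region 2 values from the table, and the three example weights are simply numbers picked to show one value being truncated.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower fence, upper fence) = (Q1 - k*IQR, Q3 + k*IQR)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def truncate_weights(weights, lower: float, upper: float):
    """Replace weights outside the fences with the nearest fence."""
    return [min(max(w, lower), upper) for w in weights]


# Region 2 quartiles as listed in the table (Q1 = 11.4158, Q3 = 34.0809).
lower, upper = tukey_fences(11.4158, 34.0809)

# Illustrative design weights; only the last one exceeds the upper fence.
truncated = truncate_weights([4.6074, 29.7424, 86.3678], lower, upper)
print(round(lower, 4), round(upper, 4), truncated)
```

Because design weights are positive and every lower fence in the table is negative, only the upper fence changes any weights here.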
Raking Weighting
Raking Survey Weighting Methodology
• Necessary to combine landline and cell phone studies
• Not sensitive to small cell sizes: adjusts the margins
• Two things happened that allowed for “better” (less biased) methods
  1. Iterative Proportional Fitting Methodology
     › Complex computer programming involved
  2. Control Totals
     › Population estimates by many categories
     › Made available due to American Community Survey
Raking Procedure
1. Define marginal categories: sex, race/ethnicity, education, marital status, region, etc.
2. Impute missing data needed for raking: race/ethnicity, age, marital status, education
3. Collapse the data based on the counts or percentages present in the data.
Iterative Proportional Fitting (Raking)
Raking methodology allows the distribution of the sample to represent the population (state) distribution with respect to:
  › Age group by sex
  › Race/ethnicity
  › Education
  › Marital status
  › Home ownership
  › Sex by race/ethnicity
  › Age group by race/ethnicity
  › Telephone type
  › Region by age group
  › Region by sex
  › Region by race/ethnicity
ITERATION 1, Margin 1

          Target  Sample  I1-1    Factor
Female    50.0%   56.0%   50.0%   0.8929
Male      50.0%   44.0%   50.0%   1.1364
18-24     20.0%    8.0%    7.6%   1.0000
25-34     20.0%   22.1%   22.1%   1.0000
35-44     20.0%   16.0%   15.8%   1.0000
45-54     20.0%   24.4%   24.4%   1.0000
55+       20.0%   30.2%   30.2%   1.0000
Reg 1     83.3%   86.0%   86.0%   1.0000
Reg 2     16.7%   14.0%   14.0%   1.0000
ITERATION 1, Margin 2

          Target  Sample  I1-1    I1-2    Factor
Female    50.0%   56.0%   50.0%   53.4%   0.8929
Male      50.0%   44.0%   50.0%   46.7%   1.1364
18-24     20.0%    8.0%    7.6%   20.0%   2.6213
25-34     20.0%   22.1%   22.1%   20.0%   0.9059
35-44     20.0%   16.0%   15.8%   20.0%   1.2701
45-54     20.0%   24.4%   24.4%   20.0%   0.8213
55+       20.0%   30.2%   30.2%   20.0%   0.6624
Reg 1     83.3%   86.0%   86.0%   88.0%   1.0000
Reg 2     16.7%   14.0%   14.0%   12.0%   1.0000
ITERATION 1, Margin 3

          Target  Sample  I1-1    I1-2    I1-3    Factor
Female    50.0%   56.0%   50.0%   53.4%   53.4%   0.8929
Male      50.0%   44.0%   50.0%   46.7%   46.6%   1.1364
18-24     20.0%    8.0%    7.6%   20.0%   18.9%   2.6213
25-34     20.0%   22.1%   22.1%   20.0%   20.6%   0.9059
35-44     20.0%   16.0%   15.8%   20.0%   20.0%   1.2701
45-54     20.0%   24.4%   24.4%   20.0%   20.4%   0.8213
55+       20.0%   30.2%   30.2%   20.0%   20.1%   0.6624
Reg 1     83.3%   86.0%   86.0%   88.0%   83.3%   0.9466
Reg 2     16.7%   14.0%   14.0%   12.0%   16.7%   1.3928
Weight Trimming
• Increasing the value of extremely low weights and decreasing the value of extremely high weights to reduce the impact of the low and high weights on the variance of the estimates.
• All weights that are less than X are increased to X, and all weights that are greater than Y are reduced to Y
• Adds a small amount of bias
Raking
• The raking procedure continues until all margins are adjusted.
• The procedure is repeated until all of the margins are within specified tolerance.
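The loop described on the raking slides (adjust one margin at a time, then repeat until every margin is within tolerance) can be sketched as follows. This is a toy illustration, not the BRFSS raking macro: the function names, the ten-respondent sample, and the two margins with their targets are all invented for the example, and it assumes every target category has at least one respondent (which is why sparse categories are collapsed first).

```python
def weighted_shares(rows, weights, var):
    """Weighted share of each category of `var` in the sample."""
    total = sum(weights)
    shares = {}
    for row, w in zip(rows, weights):
        shares[row[var]] = shares.get(row[var], 0.0) + w / total
    return shares


def rake(rows, targets, start_weights=None, tol=1e-6, max_iter=100):
    """Iterative proportional fitting over the margins in `targets`.

    targets maps a variable name to {category: target share}.
    """
    weights = list(start_weights) if start_weights else [1.0] * len(rows)
    for _ in range(max_iter):
        # Adjust each margin in turn: factor = target share / current share.
        for var, target in targets.items():
            current = weighted_shares(rows, weights, var)
            weights = [w * target[row[var]] / current[row[var]]
                       for row, w in zip(rows, weights)]
        # Stop once every margin is within tolerance of its target.
        worst = max(abs(weighted_shares(rows, weights, v)[cat] - share)
                    for v, t in targets.items() for cat, share in t.items())
        if worst < tol:
            break
    return weights


# Invented sample: 60% female and 30% aged 18-44 before weighting.
rows = ([{"sex": "F", "age": "18-44"}] * 2 + [{"sex": "F", "age": "45+"}] * 4
        + [{"sex": "M", "age": "18-44"}] * 1 + [{"sex": "M", "age": "45+"}] * 3)
targets = {"sex": {"F": 0.5, "M": 0.5}, "age": {"18-44": 0.4, "45+": 0.6}}

raked = rake(rows, targets)
print(weighted_shares(rows, raked, "sex"), weighted_shares(rows, raked, "age"))
```

Adjusting the age margin nudges the sex margin off target again, which is why the pass over all margins has to be repeated rather than done once.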
Which Weight to Use?
[Table; columns: Questionnaire Version 11, 21, 12, 22, 13, 23, 20; rows: Weight _llcpwt, _lcpwtv1, _lcpwtv2, _lcpwtv3, _lcpwtv23; the cell marks showing which weight applies to which version were lost]
Any crosstab must be done with the weight that is non-zero for all variables in the table
Suppression (aggregation) rules
• Based on the Relative Standard Error (RSE)
  › Sometimes called the Coefficient of Variation
• RSE = SE/rate or SE/(100 - rate)
• RSE > .30: interpret with caution
  › Footnote
• RSE > .50: estimate unreliable
  › Suppress or aggregate
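A small sketch of these rules applied to estimates expressed in percent. The function names are made up here, and the choice of denominator (the rate itself when it is at most 50%, otherwise 100 minus the rate) is an assumption about how the two RSE formulas are meant to be used. The first two calls use the San Juan and Bear River radon estimates from the 2013 table; the third is a hypothetical high-rate estimate.

```python
def relative_standard_error(rate_pct: float, se_pct: float) -> float:
    """RSE = SE/rate, or SE/(100 - rate) when the rate is above 50%."""
    denominator = rate_pct if rate_pct <= 50.0 else 100.0 - rate_pct
    return se_pct / denominator


def reliability_flag(rse: float) -> str:
    """Map an RSE to the reporting action listed on the slide."""
    if rse > 0.50:
        return "suppress or aggregate"
    if rse > 0.30:
        return "interpret with caution"
    return "report"


san_juan = relative_standard_error(11.994, 4.9009)
bear_river = relative_standard_error(15.1882, 2.4164)
high_rate = relative_standard_error(95.0, 3.0)  # hypothetical estimate

for name, rse in [("San Juan", san_juan), ("Bear River", bear_river),
                  ("hypothetical 95%", high_rate)]:
    print(name, round(rse, 4), reliability_flag(rse))
```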
[Flowchart: suppression rules for survey data, by kind of population]
• General (minimum criteria)
  › Rate < 50%: CV = SE/rate; rate > 50%: CV = SE/(1 - rate)
  › CV < 30%: report
  › 30% < CV < 50%: report with warning
  › CV > 50%: suppress or aggregate
• Vulnerable (strict criteria)
  › Rate < 50%: CV = SE/rate; rate > 50%: CV = SE/(1 - rate)
  › CV > 30%: suppress or aggregate
  › CV < 30% and numerator ≥ 10: report
  › CV < 30% and numerator < 10: suppress or aggregate
BRFSS Radon by LHD (2013)

                 n      Percent   SE      RSE
Total            5629   18.1343   0.6798  0.037487
Bear River        379   15.1882   2.4164  0.159097
Central*          260   11.0087   2.5642  0.232925
Davis             543   20.8945   2.2884  0.109522
Salt Lake        1708   19.4586   1.1937  0.061346
San Juan           36   11.994    4.9009  0.408613
Southeast         212   14.8307   4.5272  0.305259
Southwest         327   11.8373   2.1549  0.182043
Summit**          208   25.2708   3.7182  0.147134
Tooele            212   16.4037   3.0652  0.18686
Tri-County        256   14.7075   3.1935  0.217134
Utah County       762   19.0678   1.7507  0.091814
Wasatch           225   13.8947   2.699   0.194247
Weber - Morgan    501   18.0811   2.0789  0.114976

• * Statistically lower rate than the state rate
• ** Statistically higher rate than the state rate
Survey Analysis Software
• Nested surveys require software that can account for the survey design
  › SUDAAN (Research Triangle)
  › SAS survey procs (surveyfreq, surveymeans, surveylogistic)
  › Stata
  › SPSS?
    • Design, nest, strata, cluster statements
    • Finite population correction
Survey Analysis Software
• SAS code
  › proc surveyfreq;
  › strata _ststr / list;
  › cluster psu; *not required in most cases;
Survey Analysis Software
• SUDAAN code
  › proc descript data=y11 filetype=sas design=strwr;
  › nest _ststr / missunit;
  › weight _llcpwt;
Design Effect
• 2013 BRFSS Radon 1
  › Have you ever had your home tested for radon gas?
    • Yes
    • No
    • Never Heard of Radon
    • Don’t own home/renting

Sample Size  Percent  Variance  Lower CI  Upper CI
5,739        18.22    .4553     16.93     19.58
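As a rough illustration, the design effect for this radon estimate can be worked out from the numbers on the slide. The sketch assumes the reported variance (.4553) is in squared percentage points, which fits the width of the confidence interval, and it compares against the binomial variance of a simple random sample of the same size with no finite population correction. The function name is invented for the example.

```python
def design_effect(pct: float, variance_pct2: float, n: int) -> float:
    """Ratio of the complex-design variance to the SRS variance.

    pct and variance_pct2 are in percent and squared percentage points.
    """
    srs_variance = pct * (100.0 - pct) / n  # binomial variance under SRS
    return variance_pct2 / srs_variance


# Values from the slide: n = 5,739, 18.22% tested, variance .4553.
deff = design_effect(18.22, 0.4553, 5739)

# Effective sample size: the SRS size that would give the same precision.
effective_n = 5739 / deff
print(round(deff, 2), round(effective_n))
```

Under these assumptions the design effect comes out at roughly 1.75, so the 5,739 interviews carry about as much information on this question as a simple random sample of around 3,300.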
Summary
• Response rate is not everything
  › Survey design is important
• Tradeoff between bias and precision
• Survey weighting is undergoing a renaissance: seriously complicated
• Special software for nested surveys
Questions?