Sample Size Estimation


Presentation Plan
• Definitions
• Why do we use samples?
• Concept of representativeness
• Main methods of sampling
• Sampling error
• Sample size calculation

EPIDEMIOLOGIC STUDY TYPES
OBSERVATIONAL STUDIES
  DESCRIPTIVE
  • Case series
  • Cross-sectional
  ANALYTICAL
  • Case-control studies
  • Cohort studies
EXPERIMENTAL STUDIES
  • Randomised controlled trials
  • Field trials

Randomised Controlled Trials (RCT)
[Diagram: diseased patients are randomised to the new treatment or the standard treatment and followed over time (direction of the study); outcomes in each arm are cure or disease.]

Cohort Studies
[Diagram: groups with the cause (Cause +) and without it (Cause -) are followed over time and classified as Disease + or Disease -.]

Case-Control Studies
[Diagram: 2×2 table with cells a, b, c, d; the direction of inquiry is from outcome to cause.]
Example: smoking and lung cancer (Doll & Hill, 1950)

Cross-sectional Studies
[Diagram: Population → Sampling → Sample → Data collection, with the sample classified as:]
• Cause + / Outcome +: a
• Cause + / Outcome -: b
• Cause - / Outcome +: c
• Cause - / Outcome -: d

DEFINING THE STUDY GROUP FOR DIFFERENT STUDY TYPES
• Cross-sectional studies: target population / sample
• Case-control studies: diseased and healthy
• Cohort studies: people exposed and unexposed
• Clinical controlled trials: defining the intervention and control groups among suitable patients who volunteer (randomisation)

Which risk measure can be estimated from which study type?


Sample size estimation for a descriptive survey

Simple random / systematic sampling:
$$n = \frac{z^2 \, p \, q}{d^2} = \frac{1.96^2 \times 0.15 \times 0.85}{0.03^2} \approx 544$$

Cluster sampling:
$$n = g \, \frac{z^2 \, p \, q}{d^2} = \frac{2 \times 1.96^2 \times 0.15 \times 0.85}{0.03^2} \approx 1088$$

where
• z: alpha risk expressed as a z-score
• p: expected prevalence
• q: 1 - p
• d: absolute precision
• g: design effect
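The two worked figures above can be checked with a short Python sketch. The function name `survey_sample_size` and its parameters are illustrative choices, not part of the original slides; the z-score is taken from the standard normal distribution instead of being typed in as 1.96.

```python
from statistics import NormalDist


def survey_sample_size(p, d, confidence=0.95, design_effect=1.0):
    """n = g * z^2 * p * q / d^2 for estimating a proportion (unrounded)."""
    # Two-sided z-score for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    q = 1 - p
    return design_effect * z ** 2 * p * q / d ** 2


# Expected prevalence 15%, absolute precision 3 percentage points.
srs = survey_sample_size(p=0.15, d=0.03)
cluster = survey_sample_size(p=0.15, d=0.03, design_effect=2)

print(round(srs), round(cluster))  # 544 1088
```

The slide rounds to the nearest whole number; in practice the result is often rounded up so that the target precision is not undershot.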

• In studies where we test hypotheses or compare groups, statistical power should be taken into account
• Sample size formulas include α and β (Type I and Type II error)

POWER
H0 = null hypothesis, H1 = alternative hypothesis

Decision      Reality: H0 true                        Reality: H0 false
Accept H0     True negative (1 - α)                   Type II error (β)
Reject H0     Type I error (α, significance level)    True positive (1 - β, POWER)

Power: ability of the study to detect a significant difference

SAMPLE SIZE CALCULATION
Using OpenEpi: http://www.openepi.com/Menu/OE_Menu.htm

Question 1
A researcher plans a survey to determine the prevalence of hypertension among women aged 15 and over in Çeşme. What information does she need to estimate the number of women to include in the study?
• Population of women aged 15 and over: 10 000
• Prevalence of hypertension reported in other studies: 30%
• Absolute precision: 5%

Steps
• OpenEpi
• Sample size & proportion
• Enter new data
  - Size of the population
  - Expected frequency
  - Worst acceptable result
• Calculate
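As a rough cross-check on Question 1, the calculation can be sketched in Python. This assumes the usual formula for a proportion with a finite population correction, n = g·N·p·q / ((d²/z²)(N − 1) + p·q); the function name is hypothetical, and the result of a given calculator may differ slightly depending on its rounding and correction rules.

```python
import math
from statistics import NormalDist


def proportion_sample_size_fpc(N, p, d, confidence=0.95, design_effect=1.0):
    """Sample size for a proportion with finite population correction."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    pq = p * (1 - p)
    n = design_effect * N * pq / ((d ** 2 / z ** 2) * (N - 1) + pq)
    return math.ceil(n)  # round up to a whole person


# Question 1: N = 10 000 women, expected prevalence 30%, absolute precision 5%.
n_q1 = proportion_sample_size_fpc(N=10_000, p=0.30, d=0.05)
print(n_q1)  # 313
```

Without the population correction the same inputs give about 323, so the correction matters little when the population is much larger than the sample.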

Question 2
In a cohort study, the association between oral contraceptives and breast cancer risk is evaluated. What information do we need to calculate the sample size?
• Breast cancer incidence is 1% in women who did not use oral contraceptives
• Significance level (α): 5%
• Statistical power (1 - β): 90%
• RR = 2.0

Steps
• OpenEpi
• Sample size & Cohort/RCT
• Enter new data
• Calculate
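For Question 2, a minimal Python sketch of the calculation follows. It assumes equal group sizes, a two-sided test, and the classic normal-approximation formula for comparing two proportions without continuity correction; the function name is hypothetical, and a calculator that applies a continuity correction will report a somewhat larger number.

```python
import math
from statistics import NormalDist


def two_proportion_sample_size(p_unexposed, p_exposed, alpha=0.05, power=0.90):
    """Per-group sample size for comparing two proportions (equal groups)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    p_bar = (p_unexposed + p_exposed) / 2
    term_null = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_beta * math.sqrt(
        p_exposed * (1 - p_exposed) + p_unexposed * (1 - p_unexposed)
    )
    return math.ceil((term_null + term_alt) ** 2 / (p_exposed - p_unexposed) ** 2)


# Question 2: incidence 1% in the unexposed, RR = 2.0, so 2% in the exposed.
p0 = 0.01
rr = 2.0
n_per_group = two_proportion_sample_size(p0, rr * p0, alpha=0.05, power=0.90)
print(n_per_group)
```

Under these assumptions the answer is on the order of 3,100 women per group, which shows how quickly the required size grows when the outcome is rare.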

Question 3
A study aims to determine the effect of smoking during pregnancy on the risk of congenital urinary abnormality. What information is needed to estimate the number of cases and controls for the study?
• OR of congenital urinary abnormality for smoking during pregnancy: 2.3 (from previous studies)
• Smoking during pregnancy: 30%
• Confidence level: 95%
• Statistical power: 80%

Steps
• OpenEpi
• Sample size & Unmatched CC
• Enter new data
• Calculate
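Question 3 can be sketched the same way. The sketch assumes that the 30% smoking prevalence applies to the controls, one control per case, a two-sided test, and no continuity correction; the function name is hypothetical. The exposure prevalence among cases is derived from the odds ratio.

```python
import math
from statistics import NormalDist


def case_control_sample_size(odds_ratio, p_controls_exposed, alpha=0.05, power=0.80):
    """Number of cases (and of controls) for an unmatched 1:1 case-control study."""
    p0 = p_controls_exposed
    # Exposure prevalence among cases implied by the odds ratio.
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    term_null = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_beta * math.sqrt(p1 * (1 - p1) + p0 * (1 - p0))
    return math.ceil((term_null + term_alt) ** 2 / (p1 - p0) ** 2)


# Question 3: OR = 2.3, 30% of controls exposed, 95% confidence, 80% power.
n_cases = case_control_sample_size(odds_ratio=2.3, p_controls_exposed=0.30)
print(n_cases)
```

With these inputs the sketch gives a little under 100 cases and the same number of controls; a continuity-corrected method will ask for somewhat more.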

Sampling Methods
Belgin Ünal
Dokuz Eylül University Faculty of Medicine, Department of Public Health, İzmir

Target Population and Sample
[Figure: the target population and the sample drawn from it]

TARGET POPULATION - STUDY POPULATION
• Geographically (Narlıdere Health District)
• Time (birth date between 1 Jan and 31 Dec 2011)
• Personal characteristics (age, sex, etc.) (children under 5 years of age)

What is sampling?
Procedure by which some members of a given population are selected as representatives of the entire population

Sampling terms
• Sampling unit: subject under observation on which information is collected
  - Example: children <5 years, hospital discharges, health events…
• Sampling fraction: ratio between the sample size and the population size
  - Example: 100 out of 2000 (5%)

Sampling terms
• Sampling frame: any list of all the sampling units in the population
  - List of households, health care units…
• Sampling scheme: method of selecting sampling units from the sampling frame
  - Randomly, convenience sample…

Why do we use samples?
Get information from large populations
• At minimal cost
• At maximum speed
• At increased accuracy
• Using enhanced tools

Sampling
[Figure: precision and cost]

What we need to know
• Concepts
  - Representativeness
  - Sampling methods
  - Choice of the right study design
• Calculations
  - Sampling error
  - Design effect
  - Sample size

Sampling and representativeness
Study on prevalence of chlamydial infection in women in Berlin
• Target population: female population of Berlin
• Sampling population: female population of 4 city wards
• Sample: women

Representativeness
• Person
  - Demographic characteristics (age, sex…)
  - Exposure/susceptibility
• Place (urban vs. rural…)
• Time
  - Seasonality
  - Day of the week
  - Time of the day
Ensure representativeness before starting; confirm once completed!

Types of samples
• Non-probability samples
• Probability samples

Non-probability samples
• Quotas
  - Sample reflects population structure
  - Time/resources constraints
• Convenience samples (purposive units)
  - Biased
  - Best or worst scenario
Probability of being chosen: unknown

Probability samples
• Random sampling
  - Each subject has an equal probability of being chosen
  - Reduces possibility of selection bias
  - Allows application of statistical theory to results

Sampling error
• No sample is the exact mirror image of the population
• Magnitude of error can be measured in probability samples
• Expressed by standard error
  - of mean, proportion, differences, etc.
• Function of
  - amount of variability in measuring factor of interest
  - sample size
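The dependence of the standard error on variability and sample size can be illustrated for a proportion, where SE = sqrt(p·q/n). The sketch below is only an illustration; the 15% prevalence and the sample sizes are arbitrary example values.

```python
import math


def standard_error_proportion(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


# Quadrupling the sample size halves the standard error.
for n in (100, 400, 1600):
    print(n, round(standard_error_proportion(0.15, n), 4))
```

The variability term p·q is largest at p = 0.5, which is why 50% is often used as the most conservative guess when no prior prevalence is available.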

Quality of an estimate
• Precise & valid
• No precision: random error!
• Precise but not valid: systematic error (bias)!

Example: measuring height
• Measuring tape held differently by different investigators → loss of precision
  - standard error
• Tape shrunk/wrong → systematic error
  - bias (cannot be corrected afterwards!)
[Figure: height scale from 173 to 179]

Methods used in probability samples
• Simple random sampling
• Systematic sampling
• Stratified sampling
• Multistage sampling
• Cluster sampling

Simple random sampling
• Principle
  - Equal chance of drawing each unit
• Procedure
  - Number all units
  - Randomly draw units

Simple random sampling
• Advantages
  - Simple
  - Sampling error easily measured
• Disadvantages
  - Need complete list of units
  - Does not always achieve best representativeness
  - Units may be scattered

Simple random sampling
Example: evaluate the prevalence of tooth decay among the 1200 children attending a school
• List of children attending the school
• Children numbered from 1 to 1200
• Sample size = 100 children
• Random sampling of 100 numbers between 1 and 1200
How to select randomly?
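One way to answer that question today is a pseudo-random number generator in place of a printed table. The sketch below draws 100 distinct numbers between 1 and 1200 with Python's standard library; the seed is an arbitrary value, fixed only so the draw can be reproduced.

```python
import random

# Children are numbered from 1 to 1200 on the school list.
children = range(1, 1201)

# Fixed seed (arbitrary) so that the same sample can be drawn again.
rng = random.Random(2011)

# Draw 100 distinct numbers; every child has the same chance of selection.
selected = sorted(rng.sample(children, k=100))

print(selected[:10])
```

`random.sample` draws without replacement, so no child can be selected twice.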

Simple random sampling


Table of random numbers


Systematic sampling
• N = 1200 and n = 60, so the sampling interval is k = N/n = 1200/60 = 20 (sampling fraction 60/1200 = 5%)
• List persons from 1 to 1200
• Randomly select a number between 1 and 20 (e.g. 8)
  - 1st person selected = the 8th on the list
  - 2nd person = 8 + 20 = the 28th
  - etc.
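The procedure above can be sketched in a few lines of Python. The function name and the optional `start` argument are illustrative; the sketch assumes that N is an exact multiple of n, as in the example.

```python
import random


def systematic_sample(N, n, start=None, rng=random):
    """Select every k-th unit from a list numbered 1..N, with k = N // n."""
    k = N // n                      # sampling interval (20 in the example)
    if start is None:
        start = rng.randint(1, k)   # random start between 1 and k inclusive
    return [start + i * k for i in range(n)]


# Reproduce the slide's example with the random start fixed at 8.
sample = systematic_sample(1200, 60, start=8)
print(sample[:3])  # [8, 28, 48]
```

Only the starting point is random; once it is drawn, the rest of the sample is fully determined by the interval.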

Systematic sampling
[Figure: grid of numbered units (1 to 55, …)]

Stratified sampling
• Principle:
  - Classify population into internally homogeneous subgroups (strata)
  - Draw sample in each stratum
  - Combine results of all strata

Example: Stratified sampling
• Determine vaccination coverage in a country
• One sample drawn in each region
• Estimates calculated for each stratum
• Each stratum weighted to obtain estimate for country (average)
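The weighting step can be sketched as a population-weighted average of the stratum estimates. The region names, population sizes, and coverage figures below are invented purely for illustration and do not come from the presentation.

```python
def weighted_coverage(strata):
    """Combine stratum estimates, weighting each by its population size."""
    total_population = sum(pop for pop, _ in strata.values())
    return sum(pop * cov for pop, cov in strata.values()) / total_population


# Hypothetical regions: (population, estimated vaccination coverage).
regions = {
    "North": (500_000, 0.90),
    "Centre": (1_500_000, 0.80),
    "South": (1_000_000, 0.70),
}

national = weighted_coverage(regions)
print(round(national, 3))  # 0.783
```

An unweighted mean of the three estimates would give 0.80, which overstates coverage here because the smallest region has the highest coverage.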

Multiple stage sampling
• Principle: consecutive samplings
• Example: sampling unit = household
  - 1st stage: drawing areas or blocks
  - 2nd stage: drawing buildings, houses
  - 3rd stage: drawing households

Cluster sampling
• Principle
  - Random sample of groups (“clusters”) of units
  - In selected clusters, all units or a proportion (sample) of units included
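A minimal Python sketch of this principle follows. The section names, household identifiers, and the numbers of clusters and units are hypothetical; the function either keeps all units of each selected cluster or draws a random subsample within it.

```python
import random


def cluster_sample(clusters, n_clusters, units_per_cluster=None, rng=random):
    """Randomly select clusters, then take all or a random part of their units."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen:
        units = clusters[name]
        if units_per_cluster is None or units_per_cluster >= len(units):
            sample[name] = list(units)  # include every unit in the cluster
        else:
            sample[name] = rng.sample(units, units_per_cluster)
    return sample


# Hypothetical frame: 5 sections with 30 households each.
sections = {
    f"Section {i}": [f"S{i}-H{j}" for j in range(1, 31)] for i in range(1, 6)
}

picked = cluster_sample(sections, n_clusters=2, units_per_cluster=10,
                        rng=random.Random(1))
print({name: len(units) for name, units in picked.items()})
```

Only a list of clusters is needed up front; lists of units are required just for the clusters that end up selected.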

Example: Cluster sampling
[Figure: area divided into Sections 1 to 5]

Cluster sampling
• Advantages
  - Simple, as a complete list of sampling units within the population is not required
  - Less travel/resources required
• Disadvantages
  - Imprecise if clusters are homogeneous and therefore sample variation is greater than population variation (large design effect)
  - Sampling error difficult to measure

Selecting a sampling method
• Population to be studied
  - Size/geographical distribution
  - Heterogeneity with respect to variable
• Level of precision required
• Resources available
• Importance of having a precise estimate of the sampling error

Steps in estimating sample size
• Identify major study variable
• Determine type of estimate (%, mean, ratio, ...)
• Indicate expected frequency of factor of interest
• Decide on desired precision of the estimate
• Decide on acceptable risk that estimate will fall outside its real population value
• Adjust for estimated design effect
• Adjust for expected response rate
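The last two steps can be sketched as a simple adjustment of a base sample size: multiply by the design effect, then divide by the expected response rate. The function name and the 80% response rate are assumptions for illustration; the base size of 544 and the design effect of 2 are the values used in the descriptive survey example.

```python
import math


def adjusted_sample_size(n_base, design_effect=1.0, response_rate=1.0):
    """Inflate a base sample size for design effect and expected non-response."""
    return math.ceil(n_base * design_effect / response_rate)


# Base size 544, design effect 2, and a hypothetical 80% response rate.
n_final = adjusted_sample_size(544, design_effect=2, response_rate=0.80)
print(n_final)  # 1360
```

Dividing by the response rate gives the number of people to approach so that the number who actually respond still reaches the required size.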

Sample size formula in descriptive survey

Simple random / systematic sampling:
$$n = \frac{z^2 \, p \, q}{d^2} = \frac{1.96^2 \times 0.15 \times 0.85}{0.03^2} \approx 544$$

Cluster sampling:
$$n = g \, \frac{z^2 \, p \, q}{d^2} = \frac{2 \times 1.96^2 \times 0.15 \times 0.85}{0.03^2} \approx 1088$$

where
• z: alpha risk expressed as a z-score
• p: expected prevalence
• q: 1 - p
• d: absolute precision
• g: design effect
