# SAMPLING TECHNIQUES

Mrs. S. Valarmathi, M.Sc., M.Phil., Research Officer, Department of Epidemiology, The Tamil Nadu Dr. MGR Medical University

Is it possible to taste the whole sambar and then add salt? NO. Is it possible to work out what 50 million people think by asking only 1000? YES.

What exactly IS a population? The entire group under study, as defined by the research objectives. Sometimes called the “universe.” The totality or aggregate of all individuals with the specified characteristics is a population.

TYPES OF POPULATION • Finite • Infinite • Hypothetical

What exactly IS a “sample”? A subset of the population.

What exactly IS “sampling”? Selecting and studying a small number of subjects from a specified population in order to draw inferences about the whole population.

Sampling Terminology
• Who do you want to generalize to? THEORETICAL POPULATION
• What population can you get access to? STUDY POPULATION
• How can you get access to them? SAMPLING FRAME
• Who is in your study? THE SAMPLE

Sampling and representativeness [Figure: Theoretical Population, Study Population, Sample]

Sampling Fraction = n / N (sample size n divided by population size N)

Sample
• Representativeness expresses the degree to which the sample data precisely characterize the population.
• The sample should reflect the characteristics of the population under study.
• The strength of statistical inference also depends on representativeness.
• Confidence level: 95% or 99% for the population.

Survey Errors
• Random / Sampling Errors
• Systematic / Non-sampling Errors

Why Sampling Errors? When you take a sample from a population, you only have a subset of the population, a piece of what you’re trying to understand. Sampling error can be reduced by increasing the sample size.

Standard Error [Figure: means of samples I to V scattered around the population mean]

The sampling distribution • The distribution of a statistic (such as the mean) over an infinite number of samples of the same size as the sample in your study is known as the sampling distribution.

Standard Error: the standard deviation of the sampling distribution.
• It tells us something about how different samples would be distributed.
• A measure of sampling variability.
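The idea of a sampling distribution can be checked with a small simulation. The sketch below is only illustrative: the population of 10,000 values, its mean and spread, and the number of repeated samples are all assumptions, not figures from the slides. It draws many samples of the same size, takes the standard deviation of their means, and compares it with the usual formula sigma / sqrt(n).

```python
import random

def mean(values):
    return sum(values) / len(values)

def sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(values)
    return (sum((v - m) ** 2 for v in values) / (len(values) - 1)) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 measurements (an assumption for illustration).
population = [rng.gauss(120, 15) for _ in range(10_000)]
population_sd = sd(population)

se_results = {}
for n in (25, 100, 400):
    # Draw many samples of the same size and keep each sample's mean.
    sample_means = [mean(rng.sample(population, n)) for _ in range(1000)]
    empirical_se = sd(sample_means)        # SD of the simulated sampling distribution
    formula_se = population_sd / n ** 0.5  # sigma / sqrt(n)
    se_results[n] = (empirical_se, formula_se)
    print(f"n={n:4d}  empirical SE={empirical_se:.2f}  sigma/sqrt(n)={formula_se:.2f}")
```

The empirical value shrinks as n grows, which is the sense in which a larger sample reduces sampling error. It sits slightly below the formula for large n because the samples are drawn without replacement from a finite population.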

Systematic / Non-sampling Errors
• Occur whether a census or a sample is being used.
• Result solely from the manner in which the observations are made.
• Cannot be measured.

Types of Non-sampling Errors
• Coverage error: units excluded from the frame.
• Non-response error: follow up on non-responses.
• Measurement error: bad question!

TYPES OF SAMPLING • Non-Probability Sampling • Probability Sampling

TYPES OF SAMPLING
• Probability Sampling: Simple Random, Systematic, Stratified, Cluster
• Non-Probability Sampling: Convenience, Judgement, Quota, Snowball

Convenience Sampling (HAPHAZARD SAMPLE): the sample is identified primarily by convenience.
Examples:
• “Man on the street”
• Medical student in the library
• Volunteer samples
• Patient coming to OP
Problem: no evidence for representativeness.

Judgment Sampling: the sampling procedure in which an experienced researcher selects the sample based on some appropriate characteristic of the sample members, to serve a purpose (also called purposive sampling or deliberate sampling).

Quota Sampling Attempt to be representative by selecting sample elements in proportion to their known incidence in the population

Snowball sampling
• Typically used in qualitative research.
• Used when members of a population are difficult to locate: hidden activity groups, non-cooperative groups.
• Recruit one respondent, who identifies others, who identify others, and so on.
• Primarily used for exploratory purposes.

Respondent Driven Sampling
• Applicable for hidden, hard-to-reach populations, such as men who have sex with men (MSM) and injecting drug users (IDU).
• A systematic form of snowball sampling with a unique identification procedure.
• Depends on the social network of the target population.
• Under certain assumptions, may be treated as a random sample.

Steps involved in RDS
• Begin with a small set of identified seeds.
• Seeds recruit peers, who recruit their peers, etc., continued till the required sample size is achieved.
• Recruits are linked by coupons with unique identifying numbers.
• Incentives are provided for participation and for each successful recruit.
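The recruitment steps above can be sketched as a small simulation. Everything concrete here is an assumption made for illustration: the randomly generated network of 200 people, the choice of 3 seeds, the limit of 3 coupons per person, and the convention of numbering the seeds as wave 0. It is a sketch of the coupon-linked chain only, not of the RDS weighting and estimation procedures.

```python
import random
from collections import deque

def rds_recruit(network, seeds, target_size, coupons_per_person, rng):
    """Simulate coupon-based peer recruitment starting from the seeds."""
    recruited = {}
    queue = deque()
    next_coupon = 1
    for seed in seeds:
        recruited[seed] = {"wave": 0, "coupon": None, "recruiter": None}
        queue.append(seed)
    while queue and len(recruited) < target_size:
        recruiter = queue.popleft()
        # Peers of the recruiter who are not yet in the study.
        peers = [p for p in sorted(network[recruiter]) if p not in recruited]
        rng.shuffle(peers)
        for peer in peers[:coupons_per_person]:
            if len(recruited) >= target_size:
                break
            recruited[peer] = {
                "wave": recruited[recruiter]["wave"] + 1,
                "coupon": next_coupon,  # unique number linking recruit to recruiter
                "recruiter": recruiter,
            }
            next_coupon += 1
            queue.append(peer)
    return recruited

rng = random.Random(1)

# Hypothetical social network: each person names 4 random contacts (ties are mutual).
people = list(range(200))
network = {p: set() for p in people}
for p in people:
    for q in rng.sample(people, 4):
        if q != p:
            network[p].add(q)
            network[q].add(p)

seeds = rng.sample(people, 3)
recruits = rds_recruit(network, seeds, target_size=50, coupons_per_person=3, rng=rng)
last_wave = max(r["wave"] for r in recruits.values())
print(len(recruits), "participants, last wave:", last_wave)
```

Keeping the recruiter and coupon number for every participant is what lets the chain be reconstructed afterwards.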

[Figure: RDS recruitment chain, built up step by step from the seed through waves 1 to 5]

RDS: Advantages
• No need of a sampling frame / mapping.
• Ease of field operations: target members recruit samples for you.
• Reaches less visible segments of the population.

[Figure: CCPUR IDU network, with members marked HIV -ve and HIV +ve]

Non-Probability Sampling Methods
• Convenience sampling relies upon convenience and access.
• Judgment sampling relies upon the belief that participants fit the characteristics.
• Quota sampling emphasizes representation of specific characteristics.
• Snowball sampling relies upon respondent referrals of others with like characteristics.

Probability samples: sampling that selects subjects with a known, non-zero probability.
• Removes the possibility of bias in the selection of subjects.
• Allows application of statistical theory to the results.
• Important when one wishes to generalize the findings of the sample to the larger population from which the samples are selected.

Simple random sampling
• Applicable when the population is small, homogeneous and readily available.
• The required number of units is selected randomly.
• Each unit of the frame has an equal, non-zero probability of selection.

Simple random sampling
Merits
• Easy to implement if a list frame is available or the population is small.
• Approximately satisfies the sampling model on which conventional statistics is based, so we can carry out complex analyses.
Demerits
• Needs a complete list of units.
• Units may be scattered.

SRS METHODS 1. LOTTERY METHOD 2. RANDOM NUMBERS TABLE 3. Computer Generated Random numbers

Simple random sampling. Example: evaluate the prevalence of hypertension among the 1200 children attending school in the age group 14 to 17 years.
• List of children attending the school
• Children numbered from 1 to 1200
• Sample size = 100 children
• Random sampling of 100 numbers between 1 and 1200
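The school example can be sketched in a few lines using Python's standard `random` module. The function name and the fixed seed are assumptions added for illustration; the seed is there only so the same draw can be repeated.

```python
import random

def simple_random_sample(population_size, sample_size, rng):
    """Draw sample_size distinct unit numbers from 1..population_size."""
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

rng = random.Random(2024)  # fixed seed only so the draw is repeatable
selected_children = simple_random_sample(1200, 100, rng)

sampling_fraction = len(selected_children) / 1200  # n / N
print(selected_children[:10])
print(f"sampling fraction = {sampling_fraction:.3f}")
```

Because `random.sample` draws without replacement, no child number can appear twice, and every child on the list has the same chance of selection.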

[Figure: table of random numbers]

Generating Random Numbers
• This is a better and perhaps more efficient way of selecting a simple random sample.
• Computers and even calculators can be used to generate random digits. The randomly produced digits can be used to pick your samples.
• However, a complete listing of the members of the population is needed in this type of random selection.

Excel: enter the function =RANDBETWEEN(bottom, top) in any blank cell, for example =RANDBETWEEN(1, 1200). F9 refreshes the random numbers.

Through calculator: press SHIFT, then · (RAN#), then =

Systematic random sampling The defined target population is ordered and the sample is selected according to position using a skip interval

Systematic random sampling (INTERVAL SAMPLING)
• Systematically spreads the sample through a list of population members.
• In nearly all practical examples, the procedure results in a sample equivalent to SRS.

Systematic random sampling
• N = 1200 and n = 60, so sampling interval = 1200/60 = 20
• List persons from 1 to 1200
• Randomly select a number between 1 and 20 (ex: 8)
• The 1st person selected = the 8th on the list
• 2nd person = 8 + 20 = the 28th, etc.
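The skip-interval procedure can be written out as a short sketch. The function name and its `start` parameter are assumptions for illustration, and the code assumes N is an exact multiple of n, as in the slide's example.

```python
import random

def systematic_sample(population_size, sample_size, rng=None, start=None):
    """Select every k-th unit after a random start, with k = N // n."""
    interval = population_size // sample_size
    if start is None:
        start = rng.randint(1, interval)  # random start between 1 and k
    return [start + i * interval for i in range(sample_size)]

# Reproduce the slide's example: N = 1200, n = 60, random start 8.
example = systematic_sample(1200, 60, start=8)
print(example[:3], "...", example[-1])

# In practice the start is drawn at random.
random_start_sample = systematic_sample(1200, 60, rng=random.Random(5))
```

With a start of 8 the list begins 8, 28, 48 and ends at 1188, so the sample is spread evenly across the whole list.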

Systematic sampling

1. Take care that there is no systematic rhythm to the flow or list of people. 2. If every Kth person on the list is, say, “rich” or “senior”, or follows some other consistent pattern, avoid this method.

Stratified Random Sampling A method of probability sampling in which the population is divided into different subgroups and samples are selected from each

Stratified Random Sampling Methods 1. Proportional Allocation Method 2. Equal Allocation Method

Proportional Allocation Method. Epidemiological profile of tuberculosis under 12 years of age. Sample size is 120.
• Centre 1: 56% of cases, sample 67
• Centre 2: 24% of cases, sample 29
• Centre 3: 20% of cases, sample 24

Equal Allocation Method. Epidemiological profile of tuberculosis under 12 years of age. Sample size is 120.
• Centre 1: 40
• Centre 2: 40
• Centre 3: 40
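The two allocation methods reduce to simple arithmetic, sketched below with the centre shares from the tuberculosis example. The function names are hypothetical, and the largest-remainder rounding rule is an assumption; the slides do not say how fractional allocations were rounded, but this rule gives the same 67, 29 and 24.

```python
def proportional_allocation(percent_shares, total_n):
    """Allocate total_n across strata in proportion to their percentage shares.

    Uses largest-remainder rounding so the allocations add up to total_n.
    """
    allocation = {s: total_n * pct // 100 for s, pct in percent_shares.items()}
    remainders = {s: total_n * pct % 100 for s, pct in percent_shares.items()}
    leftover = total_n - sum(allocation.values())
    # Hand the remaining units to the strata with the largest fractional parts.
    for stratum in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        allocation[stratum] += 1
    return allocation

def equal_allocation(strata, total_n):
    """Give every stratum the same sample size (assumes total_n divides evenly)."""
    return {s: total_n // len(strata) for s in strata}

shares = {"Centre 1": 56, "Centre 2": 24, "Centre 3": 20}
proportional = proportional_allocation(shares, 120)
equal = equal_allocation(shares, 120)
print(proportional)
print(equal)
```

Proportional allocation keeps the sample's composition in line with the population; equal allocation gives each centre the same precision regardless of its size.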

[Figure: stratified design for CHENNAI. Zones: NORTH, SOUTH, CENTRAL, EAST. Each zone is split into GOVT. and PRIVATE schools, 5 schools each; 10 students per school, 5 girls and 5 boys.]

Cluster Sampling
• The population itself is divided into a number of natural groups known as clusters (geographic or organizational).
• The units are heterogeneous within a cluster but homogeneous between clusters.
• A cluster sample is obtained by selecting the clusters by simple random sampling; all the units in the sampled clusters are included in the sample.

Cluster Sampling
• Advantages
– Sampling frame is not required
– Simple and easy
– Fewer resources required
• Disadvantages
– Imprecise if units within clusters are homogeneous

Cluster Sampling
• Randomly select clusters and select all subjects, or
• Randomly select clusters and select subjects randomly within them.
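The two variants can be sketched side by side. The village and household labels, the number of clusters and the per-cluster sample size are all hypothetical values chosen for illustration.

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, rng):
    """Randomly select clusters and keep every unit in each selected cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {c: list(clusters[c]) for c in chosen}

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly select clusters, then randomly select units within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        c: rng.sample(clusters[c], min(n_per_cluster, len(clusters[c])))
        for c in chosen
    }

rng = random.Random(7)

# Hypothetical frame: 10 villages (clusters) with 20 to 40 households each.
clusters = {
    f"village_{i}": [f"v{i}_hh{j}" for j in range(rng.randint(20, 40))]
    for i in range(1, 11)
}

one_stage = one_stage_cluster_sample(clusters, 3, rng)
two_stage = two_stage_cluster_sample(clusters, 3, 7, rng)
print({c: len(units) for c, units in one_stage.items()})
print({c: len(units) for c, units in two_stage.items()})
```

Only the list of clusters is needed up front; units are listed only inside the clusters that were actually drawn.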

Cluster Sampling Especially useful for door-to-door personal surveys (significantly reduces costs) However, clustering increases sampling errors (people who live close together tend to be more similar)

Drawing the clusters. You need:
• Map of the region
• Distribution of population (by Taluks or area)
• Age distribution (population 5-12: 3%)

| Taluk | Pop. | Pop. 5-12 | Clusters |
|---|---|---|---|
| Mettur | 53000 | 1600 | 5 |
| Sankari | 7300 | 220 | 1 |
| Salem | 106000 | 3200 | 10 |
| Edapadi | 13000 | 400 | 1 |
| Omalur | 26500 | 800 | 2 |
| Yercaud | 6600 | 200 | 1 |
| Vazhappadi | 40000 | 1200 | 4 |
| Attur | 6600 | 200 | 1 |
| Gangavalli | 53000 | 1600 | 5 |
| Total | | 9420 | 30 |

Then compute the sampling interval: K = 9420/30 = 314
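One common way to turn an interval like K = 314 into cluster counts is systematic selection with probability proportional to size: walk down the cumulative 5-12 population in steps of K from a random start. The slides do not state that this was the method used, so treat the sketch as an assumption. The Attur figure of 200 is inferred from the stated total of 9420, and the function name is hypothetical.

```python
import random
from itertools import accumulate

# Children aged 5-12 per taluk (Attur's 200 is inferred from the total of 9420).
taluks = [
    ("Mettur", 1600), ("Sankari", 220), ("Salem", 3200),
    ("Edapadi", 400), ("Omalur", 800), ("Yercaud", 200),
    ("Vazhappadi", 1200), ("Attur", 200), ("Gangavalli", 1600),
]

def pps_systematic_clusters(sizes, n_clusters, rng):
    """Assign clusters to areas by stepping through the cumulative population."""
    total = sum(size for _, size in sizes)
    interval = total // n_clusters
    start = rng.randint(1, interval)
    points = [start + i * interval for i in range(n_clusters)]
    cumulative = list(accumulate(size for _, size in sizes))
    counts = {name: 0 for name, _ in sizes}
    for point in points:
        # The cluster goes to the first area whose cumulative total reaches the point.
        for (name, _), upper in zip(sizes, cumulative):
            if point <= upper:
                counts[name] += 1
                break
    return interval, start, counts

interval, start, cluster_counts = pps_systematic_clusters(taluks, 30, random.Random(3))
print("interval:", interval, "start:", start)
print(cluster_counts)
```

The counts depend on the random start and can differ by one from the allocation shown on the slide, which looks closer to each taluk's 5-12 population divided by 314 and rounded. In every case the total is 30 clusters and larger taluks receive more of them.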

Drawing households and children (on the spot)
• Go to the center of the Taluk and choose a direction at random.
• Number the houses in this direction (ex: 21).
• Draw a random number (between 1 and 21) to identify the first house to visit.
• From this house, progress until 7 children are found (itinerary rules fixed beforehand).

Multistage Sampling
• Selection of subjects is done stage by stage.
• Any one of the sampling schemes can be applied at each stage.
• Multistage sampling generally ends with unequal selection probabilities for the sampling units.
• The analysis procedure becomes more complex.

Multistage Sampling • Advantages – No sampling frame of population required – Most feasible approach for large populations • Disadvantages – Several sampling lists – Needs more man power

Multistage Sampling
• District Level Household & Facility Survey
– Stage 1: Selection of Districts
– Stage 2: Selection of Villages
– Stage 3: Selection of Households
• Immunization coverage in a state
– Stage 1: Selection of Districts
– Stage 2: Selection of PHCs
– Stage 3: Selection of Subcenters

Probability Sampling Methods
• Simple random sampling relies upon simple randomization.
• Systematic sampling relies upon the sampling interval.
• Stratified sampling emphasizes dividing into groups and subgroups.
• Cluster sampling relies upon geographical or organizational groups.

Factors Affecting Choice of Sampling Design
• Sampling frame: existence and size
• Costs
• Precision desired
• Sub-population comparisons

TO SUMMARIZE • Population • Sample • Standard Error • Types Of Sampling • Non Probability sampling • Probability Sampling

Social actors are not predictable like objects. Randomized events are irrelevant to social life. Probability sampling is expensive and inefficient. Therefore… Non-probability sampling is the best approach.

We want to generalize to the population. Random events are predictable. We can compare random events to our results. Therefore… Probability sampling is the best approach.

Conclusions: probability samples are the best.
Beware of:
• refusals
• absentees
• “do not know”
Ensure, within available constraints:
• Validity
• Precision

Conclusions If in doubt… Call an Experienced Person !!!! Or Call a statistician !!!!

Professor: Hope you understand the sampling techniques.
Student: It is impossible to draw conclusions about the whole population by drawing samples.
Professor explained the whole thing again.
Student: I am not convinced.

Professor: Well, so next time when you go for a blood test ask them to extract all the blood from your body

THANK YOU