Sampling Techniques: Ph.D. Course Work Lectures


Sampling Terms
• Population: the entire group of people or objects of interest from whom the researcher needs to obtain information.
• A population can be defined as the set or collection of all people, items, or objects with the characteristic one wishes to study or understand.
• It depends on the objective of the research.
• It should be identified properly.
• Appropriate inclusion and exclusion criteria should be identified.

• An element (or unit) is an object on which a measurement is taken.
• A population is a collection of elements about which we wish to make an inference.
• Sampling units are non-overlapping collections of elements from the population that together cover the entire population.

• A sampling frame is a list of sampling units.
• A sample is a collection of sampling units drawn from a sampling frame.
• Parameter: a numerical characteristic of a population, such as the population mean, variance, or correlation.
• Statistic: a numerical characteristic of a sample.
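The distinction between a parameter and a statistic can be illustrated with a minimal Python sketch. The population below is made-up data (hypothetical ages), and the names `population`, `sample`, `population_mean`, and `sample_mean` are assumptions introduced only for this illustration.

```python
import random
import statistics

# Hypothetical population: ages of 1,000 people (made-up values).
rng = random.Random(42)
population = [rng.randint(18, 80) for _ in range(1000)]

# Parameter: computed from every unit in the population.
population_mean = statistics.mean(population)

# Statistic: computed only from a sample drawn from the population.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

In practice the parameter is usually unknown, and the statistic is used to estimate it.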

• A census study occurs when the entire population is included in the study. In a census, information is collected from every unit or element of the population.
• Because there is rarely enough time or money to gather information from everyone or everything in a population, the goal becomes finding a representative sample (or subset) of the population.

Contd.
• Note that the population from which the sample is drawn may not be the same as the population about which we actually want information.
• Sometimes they may be entirely separate. For instance, we might study rats in order to get a better understanding of human health, or we might study records from people born in 2008 in order to make predictions about people born in 2009.
• The population should be defined in connection with the objectives of the study.

Contd.
• Define the target population: the population to which we generalize our research findings.
• The target population could be much larger than the study population.
• It is critical to the success of the research project to clearly define the target population.
• Sampling frame: the complete list of the population units (in the finite population case).
• Sampling units/study units: the elements or units considered for inclusion in the sample.

Sampling process
The sampling process comprises several stages:
• Defining the target population and study population
• Specifying a sampling frame: a set of people, items, or events possible to measure
• Specifying a sampling method for selecting items or events from the frame
• Determining the sample size
• Implementing the sampling plan
• Sampling and data collection
• Reviewing the sampling process

Sampling
• The process of obtaining a subset (sample) of a larger group (population) and using the results from the sample to make decisions or estimates about the population.
• Faster and cheaper than investigating the entire population (census).

Two keys
• Selecting the right people/units
  • They have to be selected scientifically so that they are representative of the population.
• Selecting the right number of the right people
  • To minimize sampling errors, i.e. choosing the wrong people by chance.

Characteristics of a good sample
• Representative of the population
• Accessible
• Cost effective
• Of the right size
• Obtained with minimum sampling error
• Suitable for analysis as per the study design

Errors of nonobservation
• The difference between an estimate from an ideal sample and the true population value is the sampling error.
• Almost always, the sampling frame does not match up perfectly with the target population, leading to errors of coverage.
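A rough Python sketch can separate the two errors of nonobservation described above. All data are made up, and the setup is an assumption for illustration only: 400 unlisted units differ systematically from the 1,600 units on the frame. The names `coverage_gap` and `sampling_gap` are hypothetical labels, not standard terminology from the lectures.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical target population: 1,600 listed units plus 400 unlisted units
# whose values are systematically higher (made-up numbers).
listed = [rng.gauss(50, 10) for _ in range(1600)]
unlisted = [rng.gauss(60, 10) for _ in range(400)]
target_population = listed + unlisted

# The sampling frame only covers the listed units.
sampling_frame = listed

true_value = statistics.mean(target_population)
frame_value = statistics.mean(sampling_frame)

# Estimate from a simple random sample drawn from the frame.
sample = rng.sample(sampling_frame, 100)
estimate = statistics.mean(sample)

coverage_gap = frame_value - true_value   # frame does not match the target population
sampling_gap = estimate - frame_value     # chance variation from observing only a sample
total_gap = estimate - true_value         # what the researcher actually faces

print(f"coverage gap: {coverage_gap:+.2f}")
print(f"sampling gap: {sampling_gap:+.2f}")
print(f"total gap:    {total_gap:+.2f}")
```

Increasing the sample size shrinks only the sampling gap. The coverage gap stays the same unless the frame itself is improved.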

• Non-response error is probably the most serious of the errors of nonobservation. It arises in three ways:
  • Inability of the person responding to come up with the answer
  • Refusal to answer
  • Inability to contact the sampled elements

Errors of observation
• These errors can be classified as due to the interviewer, respondent, instrument, or method of data collection.
• Interviewers have a direct and dramatic effect on the way a person responds to a question.
  • Most people tend to side with the view apparently favored by the interviewer, especially if they are neutral.
  • Friendly interviewers are more successful.
  • In general, interviewers of the same gender, racial, and ethnic groups as those being interviewed are slightly more successful.

• Respondents differ greatly in motivation to answer correctly and in ability to do so.
• Obtaining an honest response to sensitive questions is difficult.
• Basic errors
  • Recall bias: simply does not remember
  • Prestige bias: exaggerates to 'look' better
  • Intentional deception: lying
  • Incorrect measurement: does not understand the units or definition

Types of samples/Sampling procedures
• Probability sampling:
  • A scientific approach to selecting a representative part of the population.
  • Every possible sample has a probability of selection, which could be equal or unequal but is predetermined.
  • The inclusion probabilities of the sampling units are defined.
  • Prejudiced or biased selection of units is avoided.

• A probability sampling scheme is one in which every unit in the population has a chance (greater than zero) of being selected in the sample, and this probability can be accurately determined.
• When every element in the population has the same probability of selection, this is known as an 'equal probability of selection' (EPS) design. Such designs are also referred to as 'self-weighting' because all sampled units are given the same weight.
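As a small worked illustration (the notation is an assumption introduced here, not taken from the slides): for an EPS design with a fixed sample size of n units from a population of N units, writing \pi_i for the inclusion probability of unit i and w_i for its design weight gives

```latex
\pi_i = \frac{n}{N}, \qquad w_i = \frac{1}{\pi_i} = \frac{N}{n}, \qquad i = 1, \dots, N .
```

For example, with N = 1000 and n = 100, every unit has inclusion probability 0.1 and each sampled unit carries the same weight of 10, which is why such a design is called self-weighting.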

Contd.
• Simple random sampling
• Systematic sampling
• Stratified sampling
• Cluster sampling
• Multistage and multi-phase sampling (not discussed)

Simple Random Sampling (SRS)
• The most elementary type of sampling, but it requires a complete list (knowledge of all population units).
• Units are independently and randomly selected one at a time until the desired sample size is achieved.
• If each unit can be chosen only once, without being replaced in the population, it is called simple random sampling without replacement (SRSWOR).

• A simple procedure to select a simple random sample is the lottery method. List all the units and prepare slips containing the serial numbers of the study units. Mix the slips thoroughly, select the slips one by one, and note down the numbers.
• The population units bearing the noted numbers are the selected sample units.
• Continue the selection until the required number of sample units is obtained.
• This procedure is difficult to use when the population is large.
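The lottery method can be mimicked in a few lines of Python. This is a minimal sketch under stated assumptions: `lottery_sample` is a hypothetical helper name, and the 20-unit frame is made up for illustration.

```python
import random

def lottery_sample(units, sample_size, seed=None):
    """Draw a simple random sample without replacement (SRSWOR)."""
    if sample_size > len(units):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    # One slip per unit, carrying its serial number (1-based).
    slips = list(range(1, len(units) + 1))
    rng.shuffle(slips)              # mix the slips thoroughly
    drawn = slips[:sample_size]     # take slips one by one until n are drawn
    # The units bearing the drawn serial numbers form the sample.
    return [units[number - 1] for number in drawn]

# Hypothetical frame of 20 study units.
frame = [f"unit-{i:02d}" for i in range(1, 21)]
sample = lottery_sample(frame, 5, seed=1)
print(sample)
```

The standard library call `random.sample(frame, 5)` performs the same kind of draw in one step. The explicit version is shown only because it follows the slips procedure more literally.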

Use of random numbers
• We can use random number tables to obtain random samples from a given population.
• Random number tables are composed of the digits 0 through 9, occurring with approximately equal frequency.
• On every page, digits are printed in blocks of five rows and five columns.
• When selecting random numbers, we can start with a random page, random row, and random column.

• Software random number generators are now available for producing lists of random numbers.
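A software generator can stand in for a printed random number table. The sketch below is an illustration under stated assumptions: it reads as many digits as the population size has, and discards numbers that are out of range or already drawn, which is the usual way a table is read. The function name `sample_from_random_digits` and the population size of 250 are hypothetical.

```python
import random

def sample_from_random_digits(population_size, sample_size, seed=None):
    """Select serial numbers 1..population_size by reading random digits."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample size must be between 1 and the population size")
    rng = random.Random(seed)
    width = len(str(population_size))   # digits to read per number
    selected = []
    while len(selected) < sample_size:
        # Read `width` consecutive digits, as one would from a table.
        digits = [rng.randrange(10) for _ in range(width)]
        number = int("".join(map(str, digits)))
        # Skip numbers outside the frame and numbers already selected.
        if 1 <= number <= population_size and number not in selected:
            selected.append(number)
    return selected

serials = sample_from_random_digits(population_size=250, sample_size=10, seed=2024)
print(serials)
```

Fixing the seed makes the selection reproducible, which plays a role similar to recording the starting page, row, and column in a printed table.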

• Advantages of simple random sampling
  • Estimates are easy to calculate.
  • Simple random sampling is always an EPS design.
• Disadvantages
  • If the sampling frame is large, this method is not practical (although using software simplifies the task).
  • Minority subgroups of interest in the population may not be present in the sample in sufficient numbers for study.

SYSTEMATIC SAMPLING
• Systematic sampling relies on arranging the target population according to some ordering scheme and then selecting elements at regular intervals through that ordered list.
• Systematic sampling involves a random start and then proceeds with the selection of every kth element from then onwards. In this case, k = (population size / sample size).

• It is important that the starting point is not automatically the first in the list, but is instead randomly chosen from within the first to the kth element in the list.
• A simple example would be to select every 10th name from the telephone directory (an 'every 10th' sample, also referred to as 'sampling with a skip of 10').
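The random start and fixed skip can be sketched in Python as follows. This is a minimal illustration that assumes the population size is an exact multiple of the sample size, so that k is a whole number. The helper name `systematic_sample` and the 1,000-entry directory are hypothetical.

```python
import random

def systematic_sample(ordered_units, sample_size, seed=None):
    """Systematic sample: random start in the first k units, then every kth unit."""
    population_size = len(ordered_units)
    if sample_size <= 0 or population_size % sample_size != 0:
        # Simplifying assumption of this sketch: k must be a whole number.
        raise ValueError("population size must be a multiple of the sample size")
    k = population_size // sample_size   # sampling interval (the skip)
    rng = random.Random(seed)
    start = rng.randint(1, k)            # random start among positions 1..k
    return ordered_units[start - 1::k]

# Hypothetical directory of 1,000 entries, identified by position.
directory = list(range(1, 1001))
sample = systematic_sample(directory, 100, seed=3)
print(sample[:5])
```

With 1,000 entries and a sample of 100, k is 10, so this is an 'every 10th' sample whose first member is chosen at random from the first ten entries.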

• As described above, systematic sampling is an EPS method, because all elements have the same probability of selection (in the example given, one in ten). It is not 'simple random sampling' because different subsets of the same size have different selection probabilities: e.g. the set {4, 14, 24, ..., 994} has a one-in-ten probability of selection, but the set {4, 13, 24, 34, ...} has zero probability of selection.
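The claim can be checked by enumerating every sample that a skip-of-10 design can produce. The sketch below assumes a population of 1,000 units numbered 1 to 1,000. The names `possible_samples` and `selection_probability` are introduced only for this illustration.

```python
N, k = 1000, 10

# With a random start in 1..k there are only k possible samples.
possible_samples = [frozenset(range(start, N + 1, k)) for start in range(1, k + 1)]
sample_probability = 1 / k   # each start is equally likely

# Inclusion probability of a unit: total probability of the samples containing it.
inclusion = {
    unit: sum(sample_probability for s in possible_samples if unit in s)
    for unit in range(1, N + 1)
}

def selection_probability(candidate):
    """Probability that the design selects exactly this set of units."""
    return sample_probability if candidate in possible_samples else 0.0

set_a = frozenset(range(4, N + 1, k))   # {4, 14, 24, ..., 994}
set_b = (set_a - {14}) | {13}           # {4, 13, 24, 34, ...}

print(set(inclusion.values()))          # one common inclusion probability
print(selection_probability(set_a))     # one in ten
print(selection_probability(set_b))     # zero
```

Under simple random sampling, by contrast, every subset of 100 units would have the same (very small) probability of selection.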


• ADVANTAGES:
  • Sample is easy to select.
  • Suitable sampling frame can be identified easily.
  • Sample units are evenly spread over the entire reference population.
• DISADVANTAGES:
  • Sample may be biased if hidden periodicity in the population coincides with that of the selection.
  • Difficult to assess the precision of the estimate from one survey.

STRATIFIED SAMPLING
Where the population embraces a number of distinct categories, the frame can be organized into separate "strata." Each stratum is then sampled as an independent sub-population, out of which individual elements can be randomly selected.
• Every unit in a stratum has the same chance of being selected.
• Using the same sampling fraction for all strata ensures proportionate representation in the sample.
• Adequate representation of minority subgroups of interest can be ensured by stratification and by varying the sampling fraction between strata as required.
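The effect of the sampling fraction can be sketched in Python. This is an illustration under stated assumptions: the three strata and their sizes are made up, simple random sampling is used within each stratum, and `stratified_sample` is a hypothetical helper name.

```python
import random

def stratified_sample(strata, sampling_fractions, seed=None):
    """Draw an independent simple random sample from each stratum.

    strata: dict mapping stratum name -> list of units
    sampling_fractions: dict mapping stratum name -> fraction to sample
    """
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n_h = round(len(units) * sampling_fractions[name])  # stratum sample size
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical population of 1,000 units in three strata of unequal size.
strata = {
    "large": [f"large-{i}" for i in range(600)],
    "medium": [f"medium-{i}" for i in range(350)],
    "small": [f"small-{i}" for i in range(50)],
}

# Same fraction everywhere: proportionate representation.
proportionate = stratified_sample(strata, {name: 0.10 for name in strata}, seed=5)

# Larger fraction in the small stratum: more units from the minority subgroup.
boosted = stratified_sample(
    strata, {"large": 0.10, "medium": 0.10, "small": 0.40}, seed=5
)

print({name: len(units) for name, units in proportionate.items()})
print({name: len(units) for name, units in boosted.items()})
```

When sampling fractions differ between strata, the design is no longer self-weighting, so estimates for the whole population need to weight each stratum accordingly.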

• Finally, since each stratum is treated as an independent population, different sampling approaches can be applied to different strata.
• Drawbacks:
  • First, the sampling frame of the entire population has to be prepared separately for each stratum.
  • Second, when examining multiple criteria, stratifying variables may be related to some, but not to others, further complicating the design and potentially reducing the utility of the strata.
  • Finally, in some cases (such as designs with a large number of strata, or those with a specified minimum sample size per group), stratified sampling can potentially require a larger sample than would other methods.

Draw a sample from each stratum

CLUSTER SAMPLING
• Cluster sampling is an example of 'two-stage sampling'.
  • First stage: a sample of areas is chosen.
  • Second stage: a sample of respondents within those areas is selected.
• The population is divided into clusters of homogeneous units, usually based on geographical contiguity.
• Sampling units are groups rather than individuals.
• A sample of such clusters is then selected.
• In the one-stage form, all units from the selected clusters are studied.

• Advantages:
  • Cuts down on the cost of preparing a sampling frame.
  • Can reduce travel and other administrative costs.
• Disadvantages:
  • Sampling error is higher than for a simple random sample of the same size.
• Often used to evaluate health schemes in epidemiological studies.

Types of cluster sampling methods
• One-stage sampling: all of the elements within selected clusters are included in the sample.
• Two-stage sampling: a subset of elements within selected clusters is randomly selected for inclusion in the sample.
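The two variants differ only in what happens inside the selected clusters, which a short Python sketch can make concrete. The village and household data are made up, `cluster_sample` is a hypothetical helper name, and simple random sampling is assumed at both stages.

```python
import random

def cluster_sample(clusters, n_clusters, within_size=None, seed=None):
    """Select clusters at random, then take all units or a subsample of each.

    clusters: dict mapping cluster name -> list of units
    within_size: None for one-stage sampling, or the number of units
                 to draw per selected cluster for two-stage sampling
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # first stage: pick clusters
    sample = {}
    for name in chosen:
        units = clusters[name]
        if within_size is None:
            sample[name] = list(units)                  # one-stage: keep every unit
        else:
            size = min(within_size, len(units))
            sample[name] = rng.sample(units, size)      # second stage: subsample
    return sample

# Hypothetical population: 12 villages (clusters) of 30 households each.
villages = {
    f"village-{v}": [f"v{v}-household-{h}" for h in range(30)]
    for v in range(1, 13)
}

one_stage = cluster_sample(villages, 3, seed=11)
two_stage = cluster_sample(villages, 3, within_size=10, seed=11)

print({name: len(units) for name, units in one_stage.items()})
print({name: len(units) for name, units in two_stage.items()})
```

Only a list of clusters is needed at the first stage. A list of households is needed only for the selected villages, which is where the saving in frame preparation comes from.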

Difference Between Strata and Clusters
• Although strata and clusters are both non-overlapping subsets of the population, they differ in several ways.
• All strata are represented in the sample, but only a subset of clusters is represented in the sample.
• With stratified sampling, the best survey results occur when elements within strata are internally homogeneous. However, with cluster sampling, the best results occur when elements within clusters are internally heterogeneous.

Self study topic
Non-probability sampling:
• Convenience sampling
• Consecutive sampling
• Quota sampling
• Judgement/purposive sampling
• Snowball sampling
When to use? Drawbacks
