Conducting Research (2) Dr. Rasha Salama, Ph.D. Community Medicine

Research • Research is the systematic collection, analysis and interpretation of data to answer a certain question or solve a problem • It is crucial to follow cascading scientific steps when conducting one’s research

Steps of Scientific Research • Selection of area • Selection of topic • Crude research question • Literature review (answers found: no need for study; no answer: continue) • Refined research question • Research hypothesis, goals and objectives • Study design • Ethical issues • Population & sampling • Variables • Research tools • Pilot study • Work plan • Collection of data • Data management • Interpretation • Reporting • Chart annotations: confounding, bias

1. Study Design • Descriptive studies: – Case report – Case serial reports – Cross-sectional studies – Ecological studies • Analytical studies: – Observational studies: case-control studies; cohort studies (prospective, or retrospective (historical)) – Experimental studies: randomized controlled clinical trials; randomized controlled field trials; non-randomized experiments

How could we select the best study design? • Purpose of the study • State of existing knowledge (in relation to study question) • Characteristics of the study variables • Latency • Feasibility

Purpose of the study • Study of etiology: – Ecologic – Cross-sectional – Case-control – Cohort – Intervention • Study of therapy: – Lab experiments – Clinical trials – Community intervention

State of existing knowledge (in relation to study question) • New idea: – Ecologic – Cross-sectional • New hypothesis: – Cross-sectional – Case-control • Newly claimed association: – Case-control: replication, confirmation – Cohort: stronger evidence towards causation • Confirmed association: – Experiment/intervention: to prove causation

Characteristics of the study variables • Very rare exposures: case-control design is NOT suitable since it looks for exposure. A very large number of subjects is required. • Very rare disease: cohort design is NOT suitable since it looks for outcome. Follow-up of a huge number is required. • Acute disease: prevalence studies are not suitable • Risky exposures: clinical trials are unethical • Unavailable data: record-based studies are not suitable.

Latency • For diseases with very long latency, the costs of concurrent cohort studies or clinical trials are prohibitively high.

Feasibility • Time • Manpower • Equipment • Money

2. Population and Sampling • Sampling is the process of selection of a number of units from a defined study population. The process of sampling involves: 1. Identification of study population 2. Determination of sampling population 3. Definition of the sampling unit 4. Choice of sampling method 5. Estimation of the sample size

Identification of study population • The study or target population is the one to which the results of the study will be generalized. • It is crucial that the study population is clearly defined, since it is the most important determinant of the sampling population.

Determination of sampling population • The sampling population is the one from which the sample is drawn. • The definition of the sampling population by the investigator is governed by two factors: – Feasibility: reachable sampling population – External validity: the ability to generalize from the study results to the target population.

Definition of the sampling unit • The definition of the sampling unit is done by setting: – Inclusion criteria – Exclusion criteria (exclusion criteria are not the opposite of inclusion criteria)

Choice of sampling method • Non probability sampling • Probability sampling

Non probability sampling: • Types of non probability sampling: – Convenience sampling – Quota sampling • Not recommended in medical research: it is by far the most biased sampling procedure, as it is not random (not everyone in the population has an equal chance of being selected to participate in the study).

Probability sampling • “There is a known non-zero probability of selection for each sampling unit” • Types: – Simple random sampling – Systematic random sampling – Stratified random sampling – Multi-stage random sampling – Cluster sampling – Multi-phase sampling

Simple random sample • In this method, all subjects or elements have an equal probability of being selected. There are two major ways of drawing a random sample. • The first is to consult a random number table, and the second is to have the computer select a random sample.
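The second approach, letting the computer select the sample, can be sketched in a few lines of Python. This is only an illustration: the sampling frame of 500 subject IDs, the sample size of 20, and the function name `simple_random_sample` are assumptions made for the example, not part of the lecture.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: subject IDs 1..500
frame = list(range(1, 501))
sample = simple_random_sample(frame, n=20, seed=42)
print(sorted(sample))
```

Recording the seed alongside the sample lets someone else reproduce the same draw later.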

Systematic random sample • A systematic sample is drawn by randomly selecting a first case on a list of the population and then taking every Nth case until the sample is complete. This is particularly useful if your list of the population is long. • For example, if your list was the phone book, it would be easiest to start at perhaps the 17th person, and then select every 50th person from that point on.
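A minimal sketch of the same procedure in Python, assuming a hypothetical list of 1,000 entries and a sample of 20, so that the sampling interval works out to 50 as in the phone book example. The function name and the data are illustrative only.

```python
import random

def systematic_sample(sampling_frame, n, seed=None):
    """Pick a random start within the first interval, then take every k-th unit."""
    k = len(sampling_frame) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds the sampling frame")
    start = random.Random(seed).randrange(k)  # random start in [0, k)
    return [sampling_frame[start + i * k] for i in range(n)]

# Hypothetical list of 1000 entries; the interval is 1000 // 20 = 50
frame = list(range(1, 1001))
sample = systematic_sample(frame, n=20, seed=1)
print(sample[:3])
```

Only the starting point is random. If the list has a periodic pattern that matches the interval, the sample can be unrepresentative.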

Stratified sample • In a stratified sample, we sample either proportionately or equally to represent various strata or subpopulations. • For example, if our strata were cities in a country, we would make sure to sample from each of the cities. If our strata were gender, we would sample both men and women.
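Proportionate allocation can be sketched as follows. The strata (600 men, 400 women), the total sample of 50, and the function name are assumptions chosen so that the arithmetic is easy to follow.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Proportionate allocation: each stratum contributes in proportion to its size."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Rounding may shift the overall total by a unit when shares are not whole numbers
        n_stratum = round(n * len(units) / total)
        sample[name] = rng.sample(units, n_stratum)
    return sample

# Hypothetical strata: 600 men and 400 women
strata = {
    "men": [f"M{i}" for i in range(600)],
    "women": [f"W{i}" for i in range(400)],
}
sample = stratified_sample(strata, n=50, seed=7)
print({name: len(units) for name, units in sample.items()})  # {'men': 30, 'women': 20}
```

For equal allocation, one would instead draw the same fixed number from every stratum and weight the results at the analysis stage.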

Multistage sampling • Country → Provinces → Cities → Districts → Households → Person

Cluster sampling • In cluster sampling we take a random sample of clusters (naturally occurring groups) and then survey every member of each selected cluster. • For example, if our clusters were individual schools in a city, we would randomly select a number of schools and then test all of the students within those schools.
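The school example might be sketched like this. The eight schools of 30 students each and the function name are hypothetical. The point is that randomness enters only when the groups are chosen, and every member of a chosen group is then included.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole groups, then keep every member of each chosen group."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sorted() returns the group names as a list
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical city with 8 schools of 30 students each
schools = {f"School {c}": [f"{c}-{i}" for i in range(30)] for c in "ABCDEFGH"}
selected = cluster_sample(schools, n_clusters=3, seed=3)
print({name: len(students) for name, students in selected.items()})
```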

Multi-phase sample • Population → Sample (Test 1) → Sub-sample (Test 2)

Estimation of the sample size “how many subjects should be studied?” • The sample size depends on the following factors: I. Effect size II. Variability of the measurement III. Level of significance IV. Power of the study

I. Effect size “magnitude of the difference to be detected” – A large sample size is needed for detection of a minute difference. Thus, the sample size is inversely related to the effect size.

II. Variability of the measurement: – The variability of measurements is reflected by the standard deviation or the variance. – The higher the standard deviation, the larger the sample size required. Thus, sample size is directly related to the SD.

III. Level of significance: • Relies on α error or type I error. The maximum level of α has been arbitrarily set to 5% or 0.05. • Alpha error can be lowered to 0.01 or even 0.001, but this increases the required sample size. Thus, sample size is inversely related to the level of α error.

IV. Power of the study: • The power of the study is the probability that it will yield a statistically significant result when a true difference exists. It is related to β error or type II error. • Power is equal to (1 - β); consequently, the power of the study is increased by decreasing the beta error. Thus, sample size is inversely related to the level of β error, or directly related to the power of the study.
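The slides do not give a formula, but the four factors can be seen working together in one common textbook case: comparing the means of two equal-sized groups with a two-sided test, where n per group = 2 (z(1-α/2) + z(1-β))² SD² / Δ², with Δ the difference to be detected. The sketch below assumes that formula; the function name and the example numbers (a difference of 5 units with an SD of 10) are illustrative only.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Subjects per group for a two-sided comparison of two means (normal approximation)."""
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the type I error
    z_beta = z.inv_cdf(power)             # power = 1 - beta
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)

# Hypothetical study: detect a 5-unit difference when the SD is 10
print(n_per_group(delta=5, sd=10))               # 63
print(n_per_group(delta=2.5, sd=10))             # smaller effect size -> larger n
print(n_per_group(delta=5, sd=20))               # larger SD -> larger n
print(n_per_group(delta=5, sd=10, alpha=0.01))   # smaller alpha -> larger n
print(n_per_group(delta=5, sd=10, power=0.90))   # higher power -> larger n
```

Other designs (proportions, paired data, unequal groups) use different formulas, so this should be read as one instance of the general relationships rather than a universal rule.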

3. Collection of Data • Data collected are “variables” • Variables are classified according to their: – Type: • Quantitative (QT): continuous, discrete • Qualitative (QL): ordinal, nominal – Role in the study: • Dependent • Independent – Relationship with other study factors: • Main study variables • Confounding variables • Effect modifiers • Intermediate factors

Methods of collection of data (research tools) • Selection of the suitable technique depends on: – The availability of information – The type of data – The resources available – The characteristics of the tool

Research tools • Most important techniques: – Using available information (records) – Observation (checklist) – Self-administered questionnaire – Interviewing (individual/group) – Measuring (all lab tests and other investigations)

Choosing the Format of Your Questionnaire Questions • Fixed alternative – Yes/No: reliable, but not powerful – Likert • Open-ended – May not be properly answered – May be difficult to score

Choosing the Format of Your Interview • Unstructured – Interviewer bias is a serious problem – Data may be hard to analyze • Semi-structured – Follow-up questions allowed – Probably best for pilot studies • Structured – Standardized, reducing interviewer bias

Editing Questions: Nine Mistakes to Avoid 1. Avoid leading questions 2. Avoid questions that invite the social desirability bias 3. Avoid double-barreled questions 4. Avoid long questions 5. Avoid negations 6. Avoid irrelevant questions 7. Avoid poorly worded response options 8. Avoid big words 9. Avoid ambiguous words & phrases

Measurement Errors • Definition of “error”: “A false or mistaken result obtained in a study or an experiment” (John Last, 2001). • Types of errors: – Systematic error (bias): “an error having a certain magnitude and direction repeated with every measurement” – Random error: “error with no fixed pattern of magnitude or direction”

• Sources of errors: – Subject – Observer – Instrument
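The difference between the two types of error can be illustrated with a small simulation. Everything here is assumed for the sake of the example: a true blood pressure of 120 mmHg, an instrument that reads 4 mmHg too high, and random noise with an SD of 5 mmHg. Averaging many readings shrinks the random error but leaves the systematic error untouched.

```python
import random
from statistics import mean

def simulate_readings(true_value, n, bias=0.0, noise_sd=0.0, seed=None):
    """Each reading = true value + fixed bias (systematic) + random noise."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]

TRUE_VALUE = 120.0  # hypothetical true systolic blood pressure, mmHg

random_only = simulate_readings(TRUE_VALUE, 1000, bias=0.0, noise_sd=5.0, seed=0)
biased = simulate_readings(TRUE_VALUE, 1000, bias=4.0, noise_sd=5.0, seed=0)

# Mean error: close to 0 with random error alone, close to 4 with the biased instrument
print(round(mean(random_only) - TRUE_VALUE, 2))
print(round(mean(biased) - TRUE_VALUE, 2))
```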

Bias • Design bias • Sample bias • Study selection bias • Information bias (observer bias) • Interviewer bias • Measurement bias (intra- and inter-observer bias) • Reporting bias • Response bias • Recall bias • Technical bias

Design bias Selection bias • Selection bias is a distortion of the estimate of effect resulting from the manner in which the study population is selected. • This is probably the most common type of bias in health research, and occurs in observational, as well as analytical studies (including experiments).

a. Prevalence-incidence bias • This type of bias can be introduced into a case-control study as a result of selective survival among the prevalent cases. • In selecting cases, we are having a late look at the disease; if the exposure occurred years before, mild cases that improved, or severe cases that died would have been missed and not counted among the cases.

b. Admission rate (Berkson’s) bias • This type of bias is due to selective factors of admission to hospitals, and occurs in hospital-based studies. • The diseased individuals with a second disorder, or a complication of the original disease, are more likely to be represented in a hospital-based sample than other members of the general population. • Differential rates of admission will be reflected in biased estimates of the relative risks.

Non-response bias • This type of bias is due to refusals to participate in a study. • The individuals who do not participate are likely to be different from individuals who do participate. Non-respondents must be compared with respondents with regard to key exposure and outcome variables in order to ascertain the relative degree of non-response bias.

Ascertainment or information bias • Information bias is a distortion in the estimate of effect due to measurement error or misclassification of subjects according to one or more variables.

Measurement bias • Observer variation bias – Intra-observer variation – Inter-observer variation • Subject (biological) variation • Technical method error variation

Recall bias • An error of categorization may occur if information on the exposure variable is unknown or inaccurate. • The recall by cases and controls may differ in both amount and accuracy. Cases are more likely to recall exposures, especially if there has been recent media coverage of the potential causes of the disease. • Example: In questioning mothers whose recent pregnancies had ended in fetal death or malformation (cases), and a matched group of mothers whose pregnancies had ended normally (controls), it was found that 48% of the former, but only 20% of the latter reported exposure to drugs.
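To see what such a difference in reporting does to a measure of association, the two percentages in the example can be turned into an odds ratio. The helper names below are made up for this sketch. Note that the calculation uses reported exposure, so it cannot say how much of the result reflects a real effect and how much reflects differential recall.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1 - p)

def odds_ratio(p_cases, p_controls):
    """Odds of reported exposure among cases relative to controls."""
    return odds(p_cases) / odds(p_controls)

# Reported drug exposure from the example: 48% of cases, 20% of controls
reported_or = odds_ratio(0.48, 0.20)
print(round(reported_or, 2))  # 3.69
```

If cases over-report or controls under-report, the true odds ratio would be closer to 1 than this apparent value.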

4. Work plan “State in specific steps what exactly will be done” • Method: – Listing the activities related to the study (planning, implementation, results) – Identification of the responsibility for each activity – Setting time and date for achievement of each activity – Putting all these elements together in a legible form which could be a chart (Gantt chart) or a table – Budget and any funding agencies

Administering the Research • Informed consent • Clear instructions • Debriefing • Confidentiality

5. Data management • Data management is the whole process of dealing with data from the very beginning of the study. Data analysis is just the last part of it. • It can be divided into the following phases: – Preparation for data entry – Data entry – Data analysis

• Preparation for data entry: – Review of questionnaire forms – Unique identifier (ID) – Coding – Preparation of master-sheets (manual) or spread-sheets (computer) – Dummy tables – Quality control • Data entry

• Data analysis: – Descriptive: • Tabular presentation: – Frequency distribution tables – Cross tabulations • Graphic presentation: – Bar charts – Pie charts – Line graphs – Others • Numeric presentation: – Percentages and percentiles – Measures of central tendency – Measures of dispersion
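As a small illustration of the tabular and numeric presentations listed above, the following sketch builds a frequency distribution table for a qualitative variable and computes measures of central tendency and dispersion for a quantitative one. The ten records are invented for the example.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical data for 10 subjects
ages = [23, 35, 35, 41, 29, 35, 52, 41, 29, 60]
smoking = ["never", "current", "former", "never", "never",
           "current", "former", "never", "current", "never"]

# Frequency distribution table with percentages (qualitative variable)
freq = Counter(smoking)
total = sum(freq.values())
for category, count in freq.most_common():
    print(f"{category:<8} {count:>3} {100 * count / total:5.1f}%")

# Central tendency and dispersion (quantitative variable)
print("mean:", mean(ages))            # 38
print("median:", median(ages))        # 35.0
print("SD:", round(stdev(ages), 1))   # sample standard deviation
```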

• Analytic: The researcher uses principles of biostatistics to test the hypothesis. Selection of the proper statistical test depends on: – The objective of the study: • Descriptive • Looking for a difference • Looking for an association – Type of variable: • Quantitative (QT) • Qualitative (QL) – Distribution of the variable: • Normal • Binomial • Poisson • Others

6. Interpretation • Discussion of the results in a way that relates the data obtained to each other, clarifying the associations and other findings.

7. Reporting comes next.

Thank you