Research planning

Planning v. evaluating research
• To a large extent, the same thing
• Plan a study so that it is capable of yielding data from which you could draw a relevant conclusion
• Evaluate other studies to check that the conclusions they claim really do follow from their data

Summary
• Quality of the research question
  - link to previous theory (theories)
  - precision
• Design and ‘causal’ research questions
• Power
• Sample size
• Effect size
• Confidence intervals

Imaginary study
Research question: Do second year students have a ‘sweeter tooth’ than third year students?
• Give WSS to a sample of current Year 2 and Year 3 psychology students.
• Predict: M_Y2 > M_Y3
Any good as a research question?

Not a terribly good research question
• Theoretically vacuous
  - why would we expect third years to lose their taste for sweet things?
  - what psychological theories are supposed to be relevant?

Could be made into a better question
• Link the research question, in a specific and precise way, to previous research
• The sugar-experience theory claims that as people acquire more memories, they develop a denser neural network. This density requires more sugar as fuel.
• The sugar-young theory claims that as people get older, they lose bits of brain stuff, so the fuel requirements of the brain fall. Consequently, sugar becomes less desirable.
• Of course, it doesn’t have to be a neuropsychological theory

Causal conclusion?
• Can’t make a causal conclusion because:
  - quasi-experimental design
  - there may be other differences between second and third year students than just year of study

… so if the result is M_Y2 > M_Y3
• Could be because loss of brain stuff due to ageing reduces the need for sugar
• Or, it could be that:
  - larger class size drives you to sugar
  - living on campus puts you off sugar
  - …
• Or, we were unlucky, and it’s just one of the 5% of samples that show a significant difference by chance when there is no real effect

Design of study limits conclusions
• Experiment, with random allocation of participants to conditions: could allow a causal conclusion
• Quasi-experiment, or correlational study: no causal conclusion yet

Result: Y2 sweetness > Y3?
• Could be because loss of brain stuff due to ageing reduces the need for sugar
• Or, it could be that:
  - larger class size drives you to sugar
  - living on campus again puts you off sugar
  - …
• Or, we were unlucky, and it’s just one of the 5% of samples that show a significant difference by chance when there is no real effect

Directness of measures
• Year of study (2 versus 3) is our IV
• However, “Year” is standing in for the amount of neural material (one hypothesis says it is lost, the other says it is gained)
• Ideally, we would measure that directly
• Aim for the most direct measures you can get

What if there is no significant difference?
What can you conclude?
• There really is no effect
• There really is an effect, but we did not detect it because…
  1. We were unlucky (again!)
  2. Measures lack validity
  3. Measures lack reliability
  4. Sample size too small

Power
• Probability that any particular (random) sample will produce a statistically significant effect, given that the effect really exists
• E.g. power = 0.9: a 90% chance of detecting an effect if there really is an effect
• Researchers usually aim for power of 80-90%
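The definition of power can be checked by brute force: draw many random samples from populations that really do differ, and count how often the test comes out significant. The sketch below is only an illustration under stated assumptions: two independent groups, normally distributed scores with a known SD of 1, and a two-sided z-test at the 5% level. The function name simulate_power and all the numbers are hypothetical, not taken from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_power(d, n_per_group, alpha=0.05, reps=2000, seed=1):
    """Estimate power as the share of simulated samples that reach significance.

    Assumption: both groups are normal with SD = 1, so d is the true
    difference in means expressed in SD units.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    se = sqrt(2 / n_per_group)  # SE of the mean difference when SD = 1
    hits = 0
    for _ in range(reps):
        group1 = [rng.gauss(d, 1) for _ in range(n_per_group)]
        group2 = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diff = sum(group1) / n_per_group - sum(group2) / n_per_group
        if abs(diff / se) > z_crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # A medium-sized true effect with 64 participants per group.
    print("estimated power:", simulate_power(0.5, 64))
    # No true effect: the "hit" rate is now the false-positive rate.
    print("false-positive rate:", simulate_power(0.0, 64))
```

With no true effect the same procedure estimates how often we would be "unlucky", which should sit close to the 5% significance level.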

Making it easier to detect an effect
• Test of the F-ratio for an ANOVA effect
• F = (variance due to the effect we are interested in) / (error variance)

Making it easier to detect an effect
• F = (variance due to the effect we are interested in) / (error variance)
• To make the numerator larger: effect size ↑
• To make the denominator smaller: reliability of measures ↑, other sources of error ↓
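The F-ratio can be worked through by hand for a one-way, between-groups design. The sketch below assumes that design and uses made-up sweetness scores; f_ratio and the data are hypothetical. It shows that the same 2-point difference between group means gives a much smaller F once the error variance grows.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA F: effect (between-groups) variance over error variance."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    k = len(groups)
    n_total = len(scores)
    # Variance attributable to the effect we are interested in.
    ss_effect = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error variance: spread of scores around their own group mean.
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_effect / (k - 1)) / (ss_error / (n_total - k))


# Hypothetical scores: group means are 42 and 40 in both data sets.
year2 = [44, 41, 43, 40, 42]
year3 = [41, 39, 40, 38, 42]
year2_noisy = [48, 37, 45, 38, 42]
year3_noisy = [45, 35, 40, 36, 44]

if __name__ == "__main__":
    print("F with reliable measures:", f_ratio([year2, year3]))
    print("F with noisy measures:   ", f_ratio([year2_noisy, year3_noisy]))
```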

Tip: power & ANOVA
• Each effect in the ANOVA has its own power
• E.g. 2 x 3 ANOVA
  - Main effect A
  - Main effect B
  - Interaction effect A * B
• Tip: power is usually lower for interactions than for main effects

Power and sample size
• All else being equal, to get more power you need more participants
• Where “all else” means:
  - reliability of measures
  - other sources of error variance
  - p-value
  - the true size of the effect
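To see how power climbs with the number of participants while everything else is held fixed, a normal approximation is enough. This is a rough sketch, assuming a two-sided comparison of two equal-sized independent groups and a standardised effect d; approx_power is a hypothetical helper, and for small samples a t-based calculation would give somewhat lower values.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-group comparison."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true effect is d.
    shift = d * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Same assumed effect (d = 0.5), increasing sample size.
    for n in (10, 20, 50, 100):
        print(n, "per group -> power", round(approx_power(0.5, n), 2))
```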

Small samples
• Fewer repetitions of measurement – less reliability
• Anomalies can have more influence
• More likely to be quirky

Sample size – ethical issues
• Too small a sample
  - can’t detect significant effects
  - waste all participants’ time
• Too large a sample
  - waste resources
  - waste the extra participants’ time

Sample size – practical issues
• Resources
  - Time
  - Cost of running each participant
• Availability
  - Clinical populations are often small
  - Access can take time & require permission

Choosing an appropriate sample size
• Shortcut: base sample size on previous research (but make sure the previous research is of high quality!)

If you know these…
• effect size
• variance of measures
… you can work out what the sample size should be
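One common way to do this calculation is the normal-approximation formula for comparing two independent groups: n per group = 2 × ((z for alpha + z for power) × SD / difference)². The sketch below assumes that design, a two-sided test, and a chosen power and significance level; the function n_per_group and the SD of 4 are hypothetical, used only to make the arithmetic concrete.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(mean_diff, sd, power=0.8, alpha=0.05):
    """Participants needed per group to detect mean_diff (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance criterion
    z_power = z.inv_cdf(power)          # desired power
    return ceil(2 * ((z_alpha + z_power) * sd / mean_diff) ** 2)


if __name__ == "__main__":
    # Assumed: a 2-point difference in means, SD of 4 (so d = 0.5).
    print("80% power:", n_per_group(2, 4))
    print("90% power:", n_per_group(2, 4, power=0.9))
```

Dedicated power software uses the t distribution and will typically suggest one or two more participants per group than this approximation.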

Effect size
• Do Year 2 like sweet things better than Year 3? Should we order more sugar for the café?
• M_Y2 = 42, M_Y3 = 40
• Effect size = 42 – 40 = 2
• Statistical significance: p < .05
• Practical (‘clinical’) significance: is there an effect that matters?

Significance level (p-value) & sample size
• A very large sample can detect tiny effects
  - n = 3000, a difference in mean WSS score of 0.1: p < .0001
• A small sample can miss even a large effect
  - n = 3, a difference in mean WSS score of 3: p > .10
• A very small p (like p = .001) does not mean a strong effect
• Significance and effect size are different things
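The dependence of p on sample size can be illustrated with a simple two-group z-test. This sketch is not an attempt to reproduce the exact p-values on the slide: the SDs below are assumptions chosen for illustration, n is taken as the size of each group, and two_sided_p is a hypothetical helper. With n = 3 a t-test would be the appropriate tool; the z-test is used only to keep the example self-contained.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference between two group means (z-test)."""
    se = sd * sqrt(2 / n_per_group)  # standard error of the difference
    z = abs(mean_diff) / se
    return 2 * (1 - NormalDist().cdf(z))


if __name__ == "__main__":
    # Tiny difference, very large sample (assumed SD = 1).
    print("diff 0.1, n = 3000:", two_sided_p(0.1, 1.0, 3000))
    # Large difference, very small sample (assumed SD = 3).
    print("diff 3,   n = 3:   ", two_sided_p(3, 3.0, 3))
```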

Standardised effect size
• d = (M1 – M2) / s
• M1 and M2 are the means of the two groups (estimates of the respective population means)
• s is an estimate of the population SD
• Values typically range from 0 to 3
• 0.2 is a "small" effect; 0.8 is a "large" effect (Cohen, 1977)
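As a worked example of d = (M1 – M2) / s, the sketch below takes s to be the pooled standard deviation of the two groups, which is one common choice and an assumption here, since the slide does not say how s is estimated. The scores and the name cohens_d are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group1, group2):
    """Standardised mean difference, using the pooled SD as the estimate s."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (
        n1 + n2 - 2
    )
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)


# Hypothetical sweetness scores with means 42 and 40.
year2 = [44, 41, 43, 40, 42]
year3 = [41, 39, 40, 38, 42]

if __name__ == "__main__":
    print("raw difference:", mean(year2) - mean(year3))
    print("standardised d:", round(cohens_d(year2, year3), 2))
```

The raw difference of 2 points only becomes interpretable as "small" or "large" once it is divided by the spread of the scores.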

Confidence intervals (CI)
• p-value
  - Is the difference significant?
• CI
  - Is the difference significant?
  - What is the effect size?
  - How well have we estimated the difference?

Confidence interval
• A range of effect sizes, with the most likely effect size in the middle
• The 95% confidence interval: CI95 = 2.37 (1.5 – 3.24)
• The data are consistent with any value in this range
• A 95% CI corresponds to testing at the 5% significance level
• If the interval includes 0, the difference is not statistically significant

Confidence interval
• A range of effect sizes, with the most likely effect size in the middle
• The 95% confidence interval: CI95 = 2.37 (1.5 – 3.24)
• The wider the interval, the less precisely we have measured the effect
• CI95 = 2.37 (0.5 – 4.24)
• …and the more uncertainty remains about the true effect size
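A confidence interval of this kind can be built as estimate ± z × standard error. The sketch below assumes a normal approximation; the standard errors 0.444 and 0.954 are back-calculated guesses that happen to give intervals like those quoted above, not values from the slides, and confidence_interval and is_significant are hypothetical helpers.

```python
from statistics import NormalDist


def confidence_interval(estimate, se, level=0.95):
    """Normal-approximation CI: estimate plus or minus z * standard error."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    return estimate - z * se, estimate + z * se


def is_significant(interval):
    """The difference is significant only if the interval excludes 0."""
    lower, upper = interval
    return not (lower <= 0 <= upper)


if __name__ == "__main__":
    narrow = confidence_interval(2.37, 0.444)  # assumed SE
    wide = confidence_interval(2.37, 0.954)    # larger assumed SE
    for label, ci in (("narrow", narrow), ("wide", wide)):
        print(label, [round(x, 2) for x in ci], "significant:", is_significant(ci))
```

Both intervals exclude 0, so both differences are significant, but the wider one pins down the size of the effect far less precisely.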

Summary
• Quality of the research question
  - link to previous theory (theories)
  - precision
• Design and ‘causal’ research questions
• Power
• Sample size
• Effect size
• Confidence intervals

These concepts are inter-related
Each of the following calls for a larger sample (N ↑):
• Desired power ↑
• Acceptable p-value ↓
• Effect size to detect ↓
• Reliability of measures ↓
• Other error variance ↑