Experiment Basics: Control Psych 231: Research Methods in Psychology
Announcements
• Quiz 7 due Friday, Mar. 4
• Exam 2 three weeks from today
Variables
• Independent variables
• Dependent variables
  • Measurement
    • Scales of measurement
    • Errors in measurement
• Extraneous variables
  • Control variables
  • Random variables
  • Confound variables
Extraneous Variables
• Control variables
  • Holding things constant: controls for excessive random variability
• Random variables
  • May freely vary, to spread variability equally across all experimental conditions
  • Randomization: a procedure that assures that each level of an extraneous variable has an equal chance of occurring in all conditions of observation
Experimental Control
• Mythbusters examine: Yawning (4 mins); earlier version of exp 1 (6.5 mins)
  • What sort of sampling method?
  • Why the control group?
  • Should they have confirmed?
    • Probably not: if you do the stats, with this sample size the 4% difference isn’t big enough to reject the null hypothesis
    • What the stats do: quantify how much random variability (error) there is compared to the observed variability, and help you decide whether the observed variability is likely due to the error or to the manipulated variability
• ReggieNet: Provine (2005). Yawning. American Scientist, 93(6), 532-539.
Extraneous Variables
• Control variables
  • Holding things constant: controls for excessive random variability
• Random variables
  • May freely vary, to spread variability equally across all experimental conditions
  • Randomization: a procedure that assures that each level of an extraneous variable has an equal chance of occurring in all conditions of observation
• Confound variables
  • Variables that haven’t been accounted for (manipulated, measured, randomized, controlled) that can impact changes in the dependent variable(s)
  • Co-vary with both the dependent AND an independent variable
Colors and words
• Divide into two groups:
  • men
  • women
• Instructions: Read aloud the COLOR that the words are presented in. When done, raise your hand.
• Women first. Men, please close your eyes.
• Okay, ready?
Blue Green Red Purple Yellow Green Purple Blue Red Yellow Blue Red Green List 1
n n n Okay, now it is the men’s turn. Remember the instructions: Read aloud the COLOR that the words are presented in. When done raise your hand. Okay ready?
Blue Green Red Purple Yellow Green Purple Blue Red Yellow Blue Red Green List 2
Our results
• So why the difference between the results for men versus women?
• Is this support for a theory that proposes “Women are good color identifiers, men are not”?
  • Why or why not? Let’s look at the two lists.
List 1 (Women): Matched
List 2 (Men): Mis-Matched
Blue Green Red Purple Yellow Green Purple Blue Red Yellow Blue Red Green
• What resulted in the performance difference?
  • Our manipulated independent variable (men vs. women)? This is our question of interest.
  • The other variable (match/mis-match)?
• Because the two variables are perfectly correlated, we can’t tell. This is the problem with confounds.
• Diagram: IV → DV (?). The IV and the confound co-vary together, so the confound → DV path is one that we can’t rule out.
• What DIDN’T result in the performance difference?
  • Extraneous variables
    • Control
      • # of words on the list
      • The actual words that were printed
    • Random
      • Age of the men and women in the groups
      • Majors, class level, seating in classroom, …
  • These are not confounds, because they don’t co-vary with the IV
Experimental Control
• Our goal: to test the possibility of a systematic relationship between the variability in our IV and how that affects the variability of our DV (IV variability → DV variability).
• Control is used to:
  • Minimize excessive variability
  • Reduce the potential of confounds (systematic variability not part of the research design)
Experimental Control
• Our goal: to test the possibility of a systematic relationship between the variability in our IV and how that affects the variability of our DV.
• T = NRexp + NRother + R
• Nonrandom (NR) variability
  • NRexp: manipulated independent variables (IV)
    • Our hypothesis: the IV will result in changes in the DV
  • NRother: extraneous variables (EV) which covary with the IV
    • Confounds
• Random (R) variability
  • Imprecision in measurement (DV)
  • Randomly varying extraneous variables (EV)
Experimental Control: Weight analogy
• Variability in a simple experiment: T = NRexp + NRother + R
• Treatment group: NRexp + NRother + R
  • “Perfect experiment”: no confounds (NRother = 0)
• Control group: R
  • Absence of the treatment (NRexp = 0)
• Bigger the weight = more variability from a source
Experimental Control: Weight analogy
• Variability in a simple experiment: T = NRexp + NRother + R
• Control group: R. Treatment group: NRexp + R.
• Our experiment is a “difference detector”
  • We can’t “see” what’s on the scales
  • The difference detector is the only part we “see”
Experimental Control: Weight analogy
• If there is an effect of the treatment, then NRexp ≠ 0
• Control group: R. Treatment group: NRexp + R.
• Our experiment can detect the effect of the treatment
n Variability in a simple experiment: Try it out at home: using coins as weights, with your eyes closed, can you tell different combinations apart? Treatment group Difference Detector Control group Bigger the weight = more variability from a source Experimental Control: Weight analogy
n Potential Problems n n Excessive random variability Confounding Treatment group Control group Difference Detector Things making detection difficult
Potential Problems
• Excessive random variability
  • If experimental control procedures are not applied, then the R component of the data will be excessively large, and may make NRexp undetectable
Excessive random variability
• If R is large relative to NRexp, then detecting a difference may be difficult
• Control group: R. Treatment group: NRexp + R (with a large R on both sides).
• The experiment can’t detect the effect of the treatment
Reduced random variability
• But if we reduce the size of NRother and R relative to NRexp, then detecting gets easier
  • So try to minimize them by using good measures of the DV, good manipulations of the IV, etc.
• Our experiment can detect the effect of the treatment
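A minimal simulation sketch of the weight analogy, assuming each score is built as NRexp + NRother + R with R drawn from a normal distribution. The function names, group size, and effect size are illustrative assumptions, not part of the lecture.

```python
import random
from statistics import mean


def simulate_group(n, nr_exp=0.0, nr_other=0.0, r_sd=1.0, rng=None):
    """Return n scores, each built as NRexp + NRother + R (R = random noise)."""
    rng = rng or random.Random()
    return [nr_exp + nr_other + rng.gauss(0.0, r_sd) for _ in range(n)]


def observed_difference(n, nr_exp, r_sd, seed=0):
    """Treatment mean minus control mean, with no confound (NRother = 0)."""
    rng = random.Random(seed)
    treatment = simulate_group(n, nr_exp=nr_exp, r_sd=r_sd, rng=rng)
    control = simulate_group(n, r_sd=r_sd, rng=rng)  # NRexp = 0: no treatment
    return mean(treatment) - mean(control)


if __name__ == "__main__":
    # Same treatment effect, increasing random variability (hypothetical values)
    for r_sd in (0.0, 1.0, 10.0):
        print(r_sd, round(observed_difference(20, nr_exp=5.0, r_sd=r_sd), 2))
```

With r_sd = 0 the "difference detector" reads exactly NRexp. As r_sd grows, the observed difference wanders further from the true effect, which is the detection problem described above.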
Potential Problems
• Confound
  • If an EV co-varies with the IV, then the NRother component of the data will be present, and may lead to misattribution of the effect to the IV
  • Diagram: the IV → DV relationship may or may not exist; the EV and the IV co-vary together
  • IV = independent var, DV = dependent var, EV = extraneous var
Confounding
• Confound
  • Hard to detect the effect of NRexp because the effect looks like it could be from NRexp but could be due to NRother
• Control group: R. Treatment group: NRexp + NRother + R.
• The experiment can detect an effect, but can’t tell where it is from
Confounding
• Confound
  • Hard to detect the effect of NRexp because the effect looks like it could be from NRexp but could be due to NRother
• These two situations look the same:
  • There is an effect of the IV: control group R; treatment group NRexp + NRother + R
  • There is not an effect of the IV: control group R; treatment group NRother + R
Removing Confounding
• Confound
  • Hard to detect the effect of NRexp because the effect looks like it could be from NRexp but could be due to NRother
• Use experimental control to eliminate the variability from the confound
• Use experimental control to spread the variability equally across conditions
Controlling Variability
• How do we introduce control?
  • Methods of Experimental Control
    • Constancy/Randomization
    • Comparison
    • Production
Methods of Controlling Variability
• Constancy/Randomization
  • If there is a variable that may be related to the DV that you can’t (or don’t want to) manipulate:
    • Control variable: hold it constant (so there isn’t any variability from that variable, no R weight from that variable)
    • Random variable: let it vary randomly across all of the experimental conditions (so the R weight from that variable is the same for all conditions)
Methods of Controlling Variability
• Comparison
  • An experiment always makes a comparison, so it must have at least two groups (2 sides of our scale in the weight analogy)
    • Sometimes there are control groups
      • This is often the absence of the treatment (Training group vs. No training (Control) group)
    • Without control groups it is harder to see what is really happening in the experiment
      • It is easier to be swayed by plausibility or inappropriate comparisons (see diet crystal example)
    • Useful for eliminating potential confounds (think about our list of threats to internal validity)
Methods of Controlling Variability
• Comparison
  • An experiment always makes a comparison, so it must have at least two groups
    • Sometimes there are control groups
      • This is often the absence of the treatment
    • Sometimes there is a range of values of the IV (1 week of Training group, 2 weeks of Training group, 3 weeks of Training group)
Methods of Controlling Variability
• Production
  • The experimenter selects the specific values of the Independent Variables (1 week of Training group, 2 weeks of Training group, 3 weeks of Training group)
  • Figure: the selected values along the range of variability in the IV, duration of taking the training program (1 week, 2 weeks, 3 weeks)
Methods of Controlling Variability
• Production
  • The experimenter selects the specific values of the Independent Variables (1 week of Training group, 2 weeks of Training group, 3 weeks of Training group)
    • Need to do this carefully
    • Suppose that you don’t find a difference in the DV across your different groups
      • Is this because the IV and DV aren’t related?
      • Or is it because your levels of the IV weren’t different enough?
Experimental designs
• So far we’ve covered a lot of the general details of experiments
• Now let’s consider some specific experimental designs
  • Some bad (but not uncommon) designs (and potential fixes)
  • Some good designs
    • 1 Factor, two levels
    • 1 Factor, multi-levels
    • Factorial (more than 1 factor)
    • Between & within factors
Poorly designed experiments
• Bad design example 1: Does standing close to somebody cause them to move? (theory of personal space)
  • “Hmm… that’s an empirical question. Let’s see what happens if …”
  • So you stand close to people and see how long it takes before they move
  • Problem: no control group to establish the comparison (this design is sometimes called a “one-shot case study design”)
  • Fix: introduce a (or some) comparison group(s): Very Close (0.2 m), Close (0.5 m), Not Close (1.0 m)
Poorly designed experiments
• Bad design example 2: Does a relaxation program decrease the urge to smoke?
  • 2 groups
    • Relaxation training group
    • No relaxation training (Control) group
  • The participants choose which group to be in
Poorly designed experiments
• Bad design example 2: Non-equivalent control groups
  • Design: participants → self-assignment → Independent Variable (Training group or No training (Control) group) → Dependent Variable (Measure)
  • Problem: selection bias for the two groups
  • Fix: need to do random assignment to groups
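A small sketch of the random-assignment fix, assuming a simple shuffle-and-deal procedure. The function name, group labels, and participant IDs are hypothetical.

```python
import random


def randomly_assign(participants, groups=("training", "control"), seed=None):
    """Shuffle the participants, then deal them into the groups in turn.

    The experimenter's procedure (not the participants' choice) decides
    who ends up in which group, which removes self-selection.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(person)
    return assignment


if __name__ == "__main__":
    people = [f"P{i:02d}" for i in range(1, 11)]  # hypothetical participant IDs
    for group, members in randomly_assign(people, seed=231).items():
        print(group, members)
```

Dealing from a shuffled list keeps the group sizes equal (or within one of each other), which a coin flip per person would not guarantee.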
Poorly designed experiments
• Bad design example 3: Does a relaxation program decrease the urge to smoke?
  • Pre-test desire level
  • Give relaxation training program
  • Post-test desire to smoke
Poorly designed experiments
• Bad design example 3: One group pretest-posttest design
  • Design: participants → Pre-test Measure (Dependent Variable) → Training group (Independent Variable) → Post-test Measure (Dependent Variable)
  • Problems include: history, maturation, testing, and more
  • Fix: add another factor, a No Training group that also gets the Pre-test Measure and Post-test Measure (Pre vs. Post becomes an Independent Variable)
1 factor - 2 levels
• Good design example: How does anxiety level affect test performance?
  • What are our IV and DV?
  • Two groups take the same test
    • Grp 1 (low anxiety group): 5 min lecture on how good grades don’t matter, just trying is good enough
    • Grp 2 (moderate anxiety group): 5 min lecture on the importance of good grades for success
• 1 Factor (independent variable), two levels
  • Basically you want to compare two treatments (conditions)
  • The statistics are pretty easy: a t-test
1 factor - 2 levels
• Good design example: How does anxiety level affect test performance?
  • Design: participants → Random Assignment → Anxiety (Low or Moderate) → Dependent Variable (Test)
1 factor - 2 levels
• Good design example: How does anxiety level affect test performance?
  • One factor (anxiety), two levels (low, moderate)
  • Test performance: low anxiety = 60, moderate anxiety = 80
  • Use a t-test to see if these points are statistically different
  • t-test = (observed difference between conditions) / (difference expected by chance)
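The ratio above can be sketched in code. This assumes the pooled-variance (Student's) t for two independent groups; the individual scores are made up so that the group means match the 60 and 80 on the slide.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """t = observed difference between means / difference expected by chance.

    The denominator is the standard error of the difference, computed
    from the pooled sample variance (assumes roughly equal variances).
    """
    n_a, n_b = len(group_a), len(group_b)
    observed_difference = mean(group_a) - mean(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return observed_difference / standard_error


if __name__ == "__main__":
    low_anxiety = [55, 60, 65]        # hypothetical scores, mean 60
    moderate_anxiety = [75, 80, 85]   # hypothetical scores, mean 80
    print(independent_t(moderate_anxiety, low_anxiety))
```

The resulting t still has to be compared against a critical value (or converted to a p-value) for the appropriate degrees of freedom, n_a + n_b - 2.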
1 factor - 2 levels
• Advantages:
  • Simple, relatively easy to interpret the results
  • Is the independent variable worth studying?
    • If no effect, then usually don’t bother with a more complex design
  • Sometimes two levels is all you need
    • One theory predicts one pattern and another predicts a different pattern
1 factor - 2 levels
• Disadvantages:
  • “True” shape of the function is hard to see
    • Interpolation and extrapolation are not a good idea
  • Interpolation: what happens within the range that you test (between low and moderate anxiety)?
1 factor - 2 levels
• Disadvantages:
  • “True” shape of the function is hard to see
    • Interpolation and extrapolation are not a good idea
  • Extrapolation: what happens outside of the range that you test (e.g., at high anxiety)?
1 Factor - multilevel experiments
• For more complex theories you will typically need more complex designs (more than two levels of one IV)
• 1 factor - more than two levels
  • Basically you want to compare more than two conditions
  • The statistics are a little more difficult: an ANOVA (Analysis of Variance)
1 Factor - multilevel experiments
• Good design example (similar to earlier ex.): How does anxiety level affect test performance?
  • Groups take the same test
    • Grp 1 (low anxiety group): 5 min lecture on how good grades don’t matter, just trying is good enough
    • Grp 2 (moderate anxiety group): 5 min lecture on the importance of good grades for success
    • Grp 3 (high anxiety group): 5 min lecture on how the students must pass this test to pass the course
1 factor - 3 levels
• Design: participants → Random Assignment → Anxiety (Low, Moderate, or High) → Dependent Variable (Test)
1 Factor - multilevel experiments
• Test performance by anxiety: low = 60, mod = 80, high = 60
1 Factor - multilevel experiments
• Advantages
  • Gives a better picture of the relationship (functions other than just straight lines)
    • Figure: test performance by anxiety with 2 levels (low, moderate) vs. 3 levels (low, mod, high)
  • Generally, the more levels you have, the less you have to worry about your range of the independent variable
1 Factor - multilevel experiments
• Disadvantages
  • Needs more resources (participants and/or stimuli)
  • Requires more complex statistical analysis (ANOVA [Analysis of Variance] & follow-up pair-wise comparisons)
Pair-wise comparisons
• The ANOVA just tells you that not all of the groups are equal.
• If this is your conclusion (you get a “significant ANOVA”), then you should do further tests to see where the differences are
  • High vs. Low
  • High vs. Moderate
  • Low vs. Moderate
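The list of follow-up comparisons can be generated rather than written by hand. A minimal sketch, with the level names taken from the anxiety example and the function name chosen for illustration:

```python
from itertools import combinations


def pairwise_comparisons(levels):
    """Return every unordered pair of levels to compare after a significant ANOVA."""
    return list(combinations(levels, 2))


if __name__ == "__main__":
    for first, second in pairwise_comparisons(["low", "moderate", "high"]):
        print(f"{first} vs. {second}")
```

With k levels there are k(k - 1)/2 pairs, so the number of follow-up tests grows quickly as levels are added.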
Factorial experiments
• Two or more factors
  • Some vocabulary
    • Factors: independent variables
    • Levels: the levels of your independent variables
    • A 2 x 4 design means two independent variables, one with 2 levels and one with 4 levels
    • The number of “conditions” or “groups” is calculated by multiplying the levels, so a 2 x 4 design has 8 different conditions (A1, A2 crossed with B1, B2, B3, B4)
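A short sketch of the "multiply the levels" rule, using the A and B level labels from the 2 x 4 example. The function names are illustrative.

```python
from itertools import product
from math import prod


def n_conditions(*level_counts):
    """Number of conditions = product of the number of levels of each factor."""
    return prod(level_counts)


def conditions(*factors):
    """List every condition: one level from each factor, fully crossed."""
    return list(product(*factors))


if __name__ == "__main__":
    factor_a = ["A1", "A2"]
    factor_b = ["B1", "B2", "B3", "B4"]
    print(n_conditions(len(factor_a), len(factor_b)))
    for condition in conditions(factor_a, factor_b):
        print(condition)
```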
Factorial experiments
• Two or more factors
  • Main effects: the effects of your independent variables ignoring (collapsed across) the other independent variables
  • Interaction effects: how your independent variables affect each other
    • Example: 2 x 2 design, factors A and B
    • Interaction:
      • At A1, B1 is bigger than B2
      • At A2, B1 and B2 don’t differ
  • Everyday interaction = “it depends on …”
Interaction effects
• Rate how much you would want to see a new movie (1 = no interest, 5 = high interest):
  • Hail, Caesar!, the new Coen Brothers movie in 2016 (Feb. 5)
  • Ask men and women, looking for an effect of gender
• Not much of a difference: no effect of gender
Interaction effects
• Maybe the gender effect depends on whether you know who is in the movie. So you add another factor:
  • Suppose that George Clooney or Scarlett Johansson might star. You rate the preference if he were to star and if he were not to star.
• The effect of gender depends on whether George or Scarlett stars in the movie
• This is an interaction
• A video lecture from the ThePsychFiles.com podcast
Results of a 2 x 2 factorial design
• The complexity & number of outcomes increases:
  • A = main effect of factor A
  • B = main effect of factor B
  • AB = interaction of A and B
• With 2 factors there are 8 basic possible patterns of results:
  1) No effects at all
  2) A only
  3) B only
  4) AB only
  5) A & B
  6) A & AB
  7) B & AB
  8) A & B & AB
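The count of 8 follows from each of the three effects (A, B, AB) being either present or absent, so 2 x 2 x 2 = 8. A minimal enumeration sketch, with names chosen for illustration:

```python
from itertools import product

EFFECTS = ("A", "B", "AB")


def outcome_patterns():
    """Enumerate every present/absent combination of the three effects."""
    patterns = []
    for present in product((False, True), repeat=len(EFFECTS)):
        patterns.append(tuple(e for e, on in zip(EFFECTS, present) if on))
    return patterns


if __name__ == "__main__":
    for pattern in outcome_patterns():
        print(" & ".join(pattern) if pattern else "No effects at all")
```

The enumeration order differs from the numbered list above, but the same eight patterns come out.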
2 x 2 factorial design

                  A1                     A2                     Marginal means
B1                Condition mean A1B1    Condition mean A2B1    B1 mean
B2                Condition mean A1B2    Condition mean A2B2    B2 mean
Marginal means    A1 mean                A2 mean

• Main effect of A: A1 mean vs. A2 mean
• Main effect of B: B1 mean vs. B2 mean
• Interaction of AB: What’s the effect of A at B1? What’s the effect of A at B2?
Examples of outcomes

                  A1    A2    Main effect of B (row means)
B1                30    60    45
B2                30    60    45
Main effect of A  30    60
(column means)

Main effect of A: ✓   Main effect of B: X   Interaction of A x B: X
Examples of outcomes

                  A1    A2    Main effect of B (row means)
B1                60    60    60
B2                30    30    30
Main effect of A  45    45
(column means)

Main effect of A: X   Main effect of B: ✓   Interaction of A x B: X
Examples of outcomes

                  A1    A2    Main effect of B (row means)
B1                60    30    45
B2                30    60    45
Main effect of A  45    45
(column means)

Main effect of A: X   Main effect of B: X   Interaction of A x B: ✓
Examples of outcomes

                  A1    A2    Main effect of B (row means)
B1                30    60    45
B2                30    30    30
Main effect of A  30    45
(column means)

Main effect of A: ✓   Main effect of B: ✓   Interaction of A x B: ✓
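The marginal means and effects in these example outcomes can be computed from the four cell means. A descriptive sketch only: the function name and the "difference of differences" definition of the interaction are assumptions for illustration, and a nonzero value here is not a significance test.

```python
def effects_2x2(a1b1, a2b1, a1b2, a2b2):
    """Describe a 2 x 2 table of cell means.

    main_A      = A2 marginal mean - A1 marginal mean
    main_B      = B2 marginal mean - B1 marginal mean
    interaction = (effect of A at B1) - (effect of A at B2)
    """
    a1_mean = (a1b1 + a1b2) / 2
    a2_mean = (a2b1 + a2b2) / 2
    b1_mean = (a1b1 + a2b1) / 2
    b2_mean = (a1b2 + a2b2) / 2
    return {
        "main_A": a2_mean - a1_mean,
        "main_B": b2_mean - b1_mean,
        "interaction": (a2b1 - a1b1) - (a2b2 - a1b2),
    }


if __name__ == "__main__":
    # Cell means from the four example outcomes, as (A1B1, A2B1, A1B2, A2B2)
    examples = [(30, 60, 30, 60), (60, 60, 30, 30), (60, 30, 30, 60), (30, 60, 30, 30)]
    for cells in examples:
        print(cells, effects_2x2(*cells))
```

A value of 0 corresponds to an X in the examples and a nonzero value to a ✓.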
Let’s add another variable: test difficulty. test performance easy medium hard low mod anxiety high Test difficulty anxiety hard medium easy low mod high 35 80 35 65 80 80 80 60 main effect of anxiety Interaction ? Yes: effect of anxiety depends on level of test difficulty Anxiety and Test Performance main effect of difficulty 50 70 80
Factorial Designs
• Advantages
  • Interaction effects
    • Consider the interaction effects before trying to interpret the main effects
  • Adding factors decreases the variability
    • Because you’re controlling more of the variables that influence the dependent variable
    • This increases the statistical power of the statistical tests
  • Increases generalizability of the results
    • Because you have a situation closer to the real world (where all sorts of variables are interacting)
Factorial Designs
• Disadvantages
  • Experiments become very large and unwieldy
  • The statistical analyses get much more complex
  • Interpretation of the results can get hard
    • In particular for higher-order interactions (interactions among more than two factors, e.g., ABC)
Factorial designs
• Consider the results of our class experiment
  • Main effect of word type
  • Main effect of depth of processing
  • No interaction between word type and depth of processing
• Dr. Kahn's reporting stats page
Example
• What is the effect of presenting words in color on memory for those words?
• So you present lists of words (Clock, Chair, Cab) for recall either in color or in black-and-white.
• Two different designs to examine this question
• Between-Groups Factor
  • 2 levels
  • Each of the participants is in only one level of the IV
  • Design: participants → Colored words (Clock, Chair, Cab) or BW words (Clock, Chair, Cab) → Test
• Within-Groups Factor
  • Sometimes called a “repeated measures” design
  • 2 levels; all of the participants are in both levels of the IV
  • Design: participants → Colored words (Clock, Chair, Cab) → Test → BW words (Clock, Chair, Cab) → Test
Between vs. Within Subjects Designs
• Between-subjects designs
  • Each participant participates in one and only one condition of the experiment.
  • Design: participants → Colored words or BW words → Test
• Within-subjects designs
  • All participants participate in all of the conditions of the experiment.
  • Design: participants → Colored words → Test → BW words → Test
Between subjects designs
• Advantages:
  • Independence of groups (levels of the IV)
    • Harder to guess what the experiment is about without experiencing the other levels of the IV
    • Exposure to different levels of the independent variable(s) cannot “contaminate” the dependent variable
    • Sometimes this is a “must,” because you can’t reverse the effects of prior exposure to other levels of the IV
    • No order effects to worry about
    • Counterbalancing is not required
Between subjects designs
• Disadvantages
  • Individual differences between the people in the groups
    • Excessive variability
    • Non-equivalent groups
n The groups are composed of different individuals participants Colored words BW words Individual differences Test
n The groups are composed of different individuals participants n Colored words BW words Excessive variability due to individual differences n Test Harder to detect the effect of the IV if there is one Individual differences NR R R
n The groups are composed of different individuals participants n Colored words Test BW words Non-Equivalent groups (possible confound) n The groups may differ not only because of the IV, but also because the groups are composed of different individuals Individual differences
Dealing with Individual Differences
• Strive for equivalent groups
  • Created equally: use the same process to create both groups
  • Treated equally: keep the experience as similar as possible for the two groups
  • Composed of equivalent individuals
    • Random assignment to groups: eliminates bias
    • Matching groups: match each individual in one group to an individual in the other group on relevant characteristics
Matching groups
• Matched groups
  • Trying to create equivalent groups
  • Also trying to reduce some of the overall variability
    • Eliminating variability from the variables that you matched people on

Group A                       Group B (matched)
Color   Height    Age         Color   Height    Age
Red     short     21 yrs      Red     short     21 yrs
Blue    tall      23 yrs      Blue    tall      23 yrs
Green   average   22 yrs      Green   average   22 yrs
Brown   tall      22 yrs      Brown   tall      22 yrs
n Between-subjects designs n Each participant participates in one and only one condition of the experiment. n Within-subjects designs n All participants participate in all of the conditions of the experiment. Colored words Test participants Colored words Test BW words Between vs. Within Subjects Designs
n Advantages: n Don’t have to worry about individual differences • Same people in all the conditions • Variability between conditions is smaller (statistical advantage) n Fewer participants are required Within subjects designs
n Disadvantages n n Range effects Order effects: • Carry-over effects • Progressive error • Counterbalancing is probably necessary to address these order effects Within subjects designs
n Range effects – (context effects) can cause a problem n n The range of values for your levels may impact performance (typically best performance in middle of range). Since all the participants get the full range of possible values, they may “adapt” their performance (the DV) to this range. Within subjects designs
Order effects
• Carry-over effects
  • Transfer between conditions is possible
  • Effects may persist from one condition into another
    • e.g., an alcohol vs. no alcohol experiment on hand-eye coordination: it is hard to know how long the effects of alcohol may persist
  • Design: Condition 1 → test → Condition 2 → test. How long do we wait for the effects to wear off?
n Progressive error n n Practice effects – improvement due to repeated practice Fatigue effects – performance deteriorates as participants get bored, tired, distracted Order effects
Dealing with order effects
• Counterbalancing is probably necessary
  • This is used to control for “order effects”
    • Ideally, use every possible order
    • (n!, e.g., AB = 2! = 2 orders; ABC = 3! = 6 orders; ABCD = 4! = 24 orders; etc.)
• All counterbalancing assumes symmetrical transfer
  • The assumption that AB and BA have reverse effects and thus cancel out in a counterbalanced design
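A quick sketch of complete counterbalancing: list every possible order of the conditions and confirm that the count is n!. The function name is illustrative.

```python
from itertools import permutations
from math import factorial


def all_orders(conditions):
    """Every possible presentation order of the given conditions."""
    return ["".join(order) for order in permutations(conditions)]


if __name__ == "__main__":
    for conditions in ("AB", "ABC", "ABCD"):
        orders = all_orders(conditions)
        print(conditions, len(orders), factorial(len(conditions)))
```

The factorial growth is the practical reason complete counterbalancing is rarely used beyond three or four conditions.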
n Simple case n n Two conditions A & B Two counterbalanced orders: • AB • BA Colored words Test BW words Test Colored words Test participants Counterbalancing
Counterbalancing
• Often it is not practical to use every possible ordering
  • Partial counterbalancing
    • Latin square designs: a form of partial counterbalancing, so that each group of trials occurs in each position an equal number of times
Partial counterbalancing
• Example: consider four conditions
  • Recall: ABCD = 4! = 24 possible orders
  1) Unbalanced Latin square: each condition appears in each position (4 orders)

     Order 1: A B C D
     Order 2: B C D A
     Order 3: C D A B
     Order 4: D A B C
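One simple way to build such a square is to rotate the list of conditions one step per row. This is a sketch of that cyclic construction (the function name is an assumption); it guarantees each condition appears once in each position, but not that every condition follows every other.

```python
def cyclic_latin_square(conditions):
    """Row r is the condition list rotated left by r positions."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


if __name__ == "__main__":
    for number, order in enumerate(cyclic_latin_square("ABCD"), start=1):
        print(f"Order {number}:", " ".join(order))
```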
Partial counterbalancing
• Example: consider four conditions
  • Recall: ABCD = 4! = 24 possible orders
  2) Balanced Latin square: each condition appears before and after all others (8 orders)

     A B C D     A B D C
     B C D A     B C A D
     C D A B     C D B A
     D A B C     D A C B
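The two properties claimed for the eight orders can be checked mechanically. In this sketch the orders are copied as read from the slide, and the helper names are hypothetical.

```python
from collections import Counter
from itertools import permutations

# The eight orders as read from the slide
ORDERS = ["ABCD", "ABDC", "BCDA", "BCAD", "CDAB", "CDBA", "DABC", "DACB"]


def position_counts(orders):
    """For each serial position, count how often each condition occupies it."""
    return [Counter(order[pos] for order in orders) for pos in range(len(orders[0]))]


def immediate_pairs(orders):
    """Set of (earlier, later) pairs that occur back-to-back in some order."""
    return {(o[i], o[i + 1]) for o in orders for i in range(len(o) - 1)}


if __name__ == "__main__":
    print(position_counts(ORDERS))
    missing = set(permutations("ABCD", 2)) - immediate_pairs(ORDERS)
    print("pairs never adjacent:", missing)
```

Each condition turns up twice in every position, and every condition immediately precedes every other condition in at least one order.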
Mixed factorial designs
• Treat some factors as within-subjects (participants get all levels of that factor) and others as between-subjects (each level of that factor gets a different group of participants).
• This only works with factorial (multi-factor) designs