
Jody Culham
Brain and Mind Institute, Department of Psychology, Western University
http://www.fmri4newbies.com/
Basics of Experimental Design for fMRI: Event-Related Designs
Last Update: October 27, 2014
Last Course: Psychology 9223, F 2014, Western University

Event-Related Averaging (can be used for block or event-related designs)

Event-Related Averaging
In this example an “event” is the start of a block. In single-trial designs, an event may be the start of a single trial.
First, we compute an event-related average for the blue condition:
• Define a time window before (2 volumes) and after (15 volumes) the event
• Extract the time course for every event (here there are four events in one run)
• Average the time courses across all the events
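The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not BrainVoyager's implementation: the function name `event_related_average`, the toy signal, and the choice to count the event volume itself in addition to the 15 post-event volumes are all assumptions made for the example.

```python
def event_related_average(timecourse, event_onsets, pre=2, post=15):
    """Average the signal in a window of volumes around each event onset."""
    epochs = []
    for onset in event_onsets:
        start, stop = onset - pre, onset + post + 1
        if start < 0 or stop > len(timecourse):
            continue  # skip events whose window runs off either end of the run
        epochs.append(timecourse[start:stop])
    if not epochs:
        raise ValueError("no event has a complete window")
    n = len(epochs)
    # zip(*epochs) groups the values at the same position in every epoch
    return [sum(values) / n for values in zip(*epochs)]


# Toy run (assumed data): flat signal of 100 with a bump of 2 units
# in the three volumes that follow each of four event onsets.
onsets = [5, 30, 55, 80]
run = [100.0] * 100
for onset in onsets:
    for volume in range(onset, onset + 3):
        run[volume] += 2.0

era = event_related_average(run, onsets)
print(len(era), era[:6])
```

Position 2 in the returned list corresponds to the event onset, because the window begins two volumes earlier.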

Event-Related Averaging Second, we compute an event related average for the gray condition

Event-Related Averaging Third, we can plot the average ERA for the blue and gray conditions on the same graph

Event-Related Averaging in BV
• Define which subjects/runs to include
• Set time window
• Define which conditions to average (usually exclude baseline)
• Determine how you want to define the y-axis values, including zero
We can tell BV where to put the y=0 baseline. Here it’s the average of the two circled data points at x=0.

But what if the curves don’t have the same starting point?
In the data shown, the curves started at the same level, as we expect they should, because both conditions were always preceded by a resting baseline period.
But what if the data looked like this? …or this?

Epoch-based averaging
In the latter two cases, we could simply shift the curves so they all start from the same (zero) baseline.
FILE-BASED AVERAGING: zero baseline determined across all conditions (for 0 to 0: points in red circles)
EPOCH-BASED AVERAGING: zero baselines are specific to each epoch

File-based vs. Epoch-based Averaging
Time courses may start at different points because of different event histories or noise.
File-based Averaging
• zero is based on the average starting point of all curves
• works best when low frequencies have been filtered out of your data
• similar to what your GLM stats are testing
Epoch-based Averaging
• each curve starts at zero (e.g., set EACH curve such that at time=0, %BSC=0)
• can be risky with noisy data
• only use it if trial histories are counterbalanced or the ITI is very long
• can yield very different conclusions than GLM stats
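To make the difference concrete, here is a small sketch of the two baseline conventions. The function names and the toy curves are hypothetical. As a simplification, the epoch-based variant zeroes each averaged curve at time 0 rather than baselining every single epoch before averaging.

```python
def file_based(curves, zero_index=0):
    """Subtract ONE shared baseline: the mean of all curves at zero_index."""
    baseline = sum(curve[zero_index] for curve in curves.values()) / len(curves)
    return {name: [v - baseline for v in curve] for name, curve in curves.items()}


def epoch_based(curves, zero_index=0):
    """Subtract each curve's OWN value at zero_index, so every curve starts at 0."""
    return {name: [v - curve[zero_index] for v in curve]
            for name, curve in curves.items()}


# Two assumed curves (% signal) that start at different levels.
curves = {"blue": [1.0, 2.0, 3.0], "gray": [3.0, 3.5, 4.0]}
fb = file_based(curves)
eb = epoch_based(curves)
print(fb)  # the starting offset between curves is preserved
print(eb)  # both curves are forced to start at zero
```

With file-based averaging the gray curve stays above the blue one throughout. With epoch-based averaging the blue curve ends up higher, which shows how the two conventions can suggest different conclusions.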

The Problem of Trial History: Cartoon Example
[Figure: hypothetical data; perfect HRF model for Event 1; perfect HRF model for Event 2]
What β weights would result?

But remember the HRF may not fit our data well
Handwerker et al., 2004, Neuroimage

The Problem of Trial History: Cartoon Example
[Figure: hypothetical data with perfect HRF models for Event 1 and Event 2; hypothetical data with imperfect HRF models for Event 1 and Event 2]
What β weights would result?

The Problem of Trial History
• If our events (epochs or single events) are packed closely together, employing an imperfect HRF can lead to misestimates of beta weights
Possible solutions
• widely spaced events to allow activation to return to baseline between events
• perfectly counterbalanced orders of events to avoid systematic differences in trial history between conditions
– remember conditions need to precede themselves too
• subject-specific HRF models
– may still be imperfect
• solutions that don’t assume an HRF: deconvolution

Basics of Event-Related Designs

Block Designs
[Figure legend: one symbol = trial of one type (e.g., face image); another symbol = trial of another type (e.g., place image). Block Design]
Early Assumption: Because the hemodynamic response delays and blurs the response to activation, the temporal resolution of fMRI is limited. WRONG!!!!!
[Figure: BOLD response (% signal change) vs. time relative to the stimulus, with labels: Initial Dip, Positive BOLD response, Overshoot, Post-stimulus Undershoot]

What are the temporal limits?
What is the briefest stimulus that fMRI can detect?
• Blamire et al. (1992): 2 sec
• Bandettini (1993): 0.5 sec
• Savoy et al. (1995): 34 msec
[Figures: 2 s stimuli (Data: Blamire et al., 1992, PNAS; Figure: Huettel, Song & McCarthy, 2004); single events (Data: Robert Savoy & Kathy O’Craven; Figure: Rosen et al., 1998, PNAS)]
Although the HRF is delayed and blurred, its shape is predictable.
Event-related potentials (ERPs) are based on averaging small responses over many trials. Can we do the same thing with fMRI?

Predictor Height Depends on Stimulus Duration

Design Types
• Block Design
• Slow ER Design
• Rapid Jittered ER Design
• Mixed Design
[Figure legend: one symbol = trial of one type (e.g., face image); another symbol = trial of another type (e.g., place image)]

Detection vs. Estimation
• detection: determination of whether activity of a given voxel (or region) changes in response to the experimental manipulation
• estimation: measurement of the time course within an active voxel in response to the experimental manipulation
[Figure: % signal change vs. time (sec)]
Definitions modified from: Huettel, Song & McCarthy, 2004, Functional Magnetic Resonance Imaging

Block Designs: Poor Estimation
Huettel, Song & McCarthy, 2004, Functional Magnetic Resonance Imaging

Pros & Cons of Block Designs
Pros
• high detection power
• has been the most widely used approach for fMRI studies
• accurate estimation of hemodynamic response function is not as critical as with event-related designs
Cons
• poor estimation power
• subjects get into a mental set for a block
• very predictable for subject
• can’t look at effects of single events (e.g., correct vs. incorrect trials, remembered vs. forgotten items)
• becomes unmanageable with too many conditions (e.g., more than 4 conditions + baseline)

Slow Event-Related Designs Slow ER Design

Convolution of Single Trials
[Figure: Neuronal Activity, Haemodynamic Function, and BOLD Signal over Time. Slide from Matt Brown]

BOLD Summates Neuronal Activity
[Figure: BOLD Signal. Slide adapted from Matt Brown]
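A rough sketch of the convolution idea, assuming a linear system. The single-gamma `gamma_hrf` below is only a convenient stand-in for a real HRF model (its shape and scale values are assumptions, not the two-gamma function used by analysis packages), and all function names are hypothetical.

```python
import math


def gamma_hrf(t, shape=6.0, scale=1.0):
    """Gamma-shaped impulse response; an assumed stand-in for a real HRF."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)
            / (math.gamma(shape) * scale ** shape))


def convolve(neural, kernel):
    """Discrete convolution, truncated to the length of the input."""
    out = [0.0] * len(neural)
    for i, amplitude in enumerate(neural):
        if amplitude == 0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += amplitude * k
    return out


def boxcar(onset, duration, n=40):
    """Neural activity modelled as 1 during the stimulus and 0 otherwise."""
    return [1.0 if onset <= t < onset + duration else 0.0 for t in range(n)]


kernel = [gamma_hrf(t) for t in range(25)]  # sampled once per second

short = convolve(boxcar(0, 1), kernel)      # 1 s stimulus
long_ = convolve(boxcar(0, 4), kernel)      # 4 s stimulus
second = convolve(boxcar(6, 1), kernel)     # a later 1 s stimulus
both = convolve([a + b for a, b in zip(boxcar(0, 1), boxcar(6, 1))], kernel)

print(max(short) < max(long_))  # longer stimulus gives a taller predictor
```

Under this linear model the response to two events (`both`) equals the sum of the responses to each event alone (`short` plus `second`), and a longer stimulus produces a taller predictor.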

Slow Event-Related Design: Constant ITI
Bandettini et al. (2000): What is the optimal trial spacing (duration + intertrial interval, ITI) for a Spaced Mixed Trial design with constant stimulus duration?
[Figure: block design vs. 2 s stimuli with varying ISI; event-related averages. Source: Bandettini et al., 2000]

Optimal Constant ITI
• Brief (< 2 sec) stimuli: optimal trial spacing = 12 sec
• For longer stimuli: optimal trial spacing = 8 + 2*stimulus duration
• Effective loss in power of slow event-related design: -35%, i.e., for 6 minutes of block design, run ~9 min slow ER design
Source: Bandettini et al., 2000
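The spacing rule of thumb is easy to encode. The sketch below simply restates the two cases from the slide; the function names and the 6-minute run length in the usage lines are illustrative assumptions.

```python
def optimal_trial_spacing(stim_duration_s):
    """Rule of thumb from the slide: 12 s for brief stimuli, else 8 + 2 * duration."""
    if stim_duration_s < 2:
        return 12.0
    return 8.0 + 2.0 * stim_duration_s


def trials_per_run(run_duration_s, stim_duration_s):
    """How many trials fit in a run at the rule-of-thumb spacing."""
    return int(run_duration_s // optimal_trial_spacing(stim_duration_s))


print(optimal_trial_spacing(1.0))   # brief stimulus
print(optimal_trial_spacing(4.0))   # longer stimulus
print(trials_per_run(360, 1.0))     # trials in an assumed 6-minute run
```

The two cases agree at a 2 s stimulus (8 + 2*2 = 12), so the rule has no jump at the boundary.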

Trial to Trial Variability
Huettel, Song & McCarthy, 2004, Functional Magnetic Resonance Imaging

How Many Trials Do You Need?
• standard error of the mean decreases with the square root of the number of trials
• Number of trials needed will vary with effect size
• Function begins to asymptote around 15 trials
Huettel, Song & McCarthy, 2004, Functional Magnetic Resonance Imaging
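The square-root relationship explains the diminishing returns. A minimal sketch, assuming independent trials with a common standard deviation (the function name `sem` and the trial counts are illustrative):

```python
import math


def sem(sd, n_trials):
    """Standard error of the mean for n independent trials."""
    if n_trials < 1:
        raise ValueError("need at least one trial")
    return sd / math.sqrt(n_trials)


# Quadrupling the number of trials only halves the standard error.
ratio = sem(1.0, 15) / sem(1.0, 60)
print(ratio)
```

Going from 15 to 60 trials costs four times the scan time but only halves the noise in the averaged response.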

Effect of Adding Trials
Huettel, Song & McCarthy, 2004, Functional Magnetic Resonance Imaging

Pros & Cons of Slow ER Designs
Pros
• excellent estimation
• useful for studies with delay periods
• very useful for designs with motion artifacts (grasping, swallowing, speech) because you can tease out artifacts
• analysis is straightforward
Cons
• poor detection power because you get very few trials per condition by spending most of your sampling power on estimating the baseline
• subjects can get VERY bored and sleepy with long inter-trial intervals
Example: Delayed Hand Actions (Singhal et al., 2013)
[Figure: labels Visual, Delay, Action, Response, Execution; conditions Grasp Go (G), Reach Go (R), Grasp Stop (GS), Reach Stop (RS); action-related artifact; really long delay: 18 s]
[Figure: effect of this design on our subject]

Pros & Cons of Slow ER Designs (continued)
Example: Delayed Hand Actions (Culham, 2004 vs. Singhal et al., 2013)
[Figure: 10-s delay vs. 18-s delay]

“Do You Wanna Go Faster?”
Rapid Jittered ER Design
• Yes, but we have to test assumptions regarding linearity of BOLD signal first

Linearity of BOLD response
Linearity: “Do things add up?”
[Figure: red = 2 - 1; green = 3 - 2; each trial response synced to start of trial]
Not quite linear but good enough!
Source: Dale & Buckner, 1997

Linearity is okay for events every ~4+ s

Why isn’t BOLD totally linear?
In part because neurons aren’t totally linear either
• “Phasic” (or “transient”) neural responses
– May depend on factors like stimulus duration and stimulus intensity
• Adaptation or habituation… stay tuned
[Figure: spikes/ms vs. time (ms). Ganmor et al., 2010, Neuron]

Optimal Rapid ITI
Rapid Mixed Trial Designs: Short ITIs (~2 sec) are best for detection power
Do you know why?
Source: Dale & Buckner, 1997

Efficiency (Power)

Two Approaches
• Detection – find the blobs
– Business as usual
– Model predicted activation using square-wave predictor functions convolved with assumed HRF
– Extract beta weights for each condition; Contrast betas
– Drawback: Because trials are packed so closely together, any misestimates of the HRF will lead to imperfect GLM predictors and betas
• Estimation – find the time course
– make a model that can estimate the volume-by-volume time courses through a deconvolution of the signal

BOLD Overlap With Regular Trial Spacing
[Figure: neuronal activity from TWO event types with constant ITI; BOLD activity from two event types; label: Partial tetanus. Slide from Matt Brown]

BOLD Overlap with Jittering
[Figure: neuronal activity from closely-spaced, jittered events; BOLD activity from closely-spaced, jittered events. Slide from Matt Brown]

Fast fMRI Detection
A) BOLD Signal
B) Individual Haemodynamic Components
C) 2 Predictor Curves for use with GLM (summation of B)
Slide from Matt Brown

Why jitter?
• Yields larger fluctuations in signal
• Without jittering, predictors from different trial types are strongly anticorrelated
– As we know, the GLM doesn’t do so well when predictors are correlated (or anticorrelated)
[Figure: without jitter, when pink is on, yellow is off, so pink and yellow are anticorrelated; with jitter, the design includes cases when both pink and yellow are off, so there is less anticorrelation]
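The anticorrelation argument can be checked numerically. The sketch below uses unconvolved 0/1 predictors for simplicity, which is an assumption made to keep the arithmetic transparent; the sequences and the helper `pearson` are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# No gaps: whenever pink is on, yellow is off (and vice versa).
pink_tight = [1, 0] * 6
yellow_tight = [0, 1] * 6

# With gaps: some time points have neither pink nor yellow.
pink_gaps = [1, 0, 0] * 4
yellow_gaps = [0, 1, 0] * 4

r_tight = pearson(pink_tight, yellow_tight)
r_gaps = pearson(pink_gaps, yellow_gaps)
print(r_tight, r_gaps)
```

Adding time points where both predictors are off moves the correlation from about -1 to about -0.5 in this toy case.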

GLM: Tutorial data • Just as in the GLM for a block design, we have one predictor for each condition other than the baseline

GLM: Output Faces > Baseline

How to Jitter
[Figure legend: one symbol = trial of one type (e.g., face image); another symbol = trial of another type (e.g., place image)]
Vary Intertrial Interval (ITI)
• Stimulus Onset Asynchrony (SOA) = ITI + Trial Duration (TD), e.g., TD = 2 s, ITI = 0 s, SOA = 2 s; or TD = 2 s, ITI = 4 s, SOA = 6 s
• may want to make TD (e.g., 2 s) and ITI durations (e.g., 0, 2, 4, 6 s) an integer multiple of TR (e.g., 2 s) for ease of creating protocol files
[Figure: frequency of ITIs (0, 2, 4, 6 s) in each condition, for a flat distribution and for an exponential distribution]
Another way to think about it… Include “Null” Trials
[Figure legend: null trial = nothing happens]
• Can randomize or counterbalance distribution of three trial types
• Outcome may be similar to varying ISI
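A jittered schedule along these lines could be generated as follows. This is only a sketch: the function name `jittered_onsets`, the seed, and the particular weights used for the exponential-like distribution are assumptions, while the 2 s trial duration and the 0, 2, 4, 6 s ITIs come from the examples above.

```python
import random


def jittered_onsets(n_trials, trial_duration=2.0, itis=(0.0, 2.0, 4.0, 6.0),
                    weights=None, seed=0):
    """Return trial onsets (s) with ITIs drawn at random from `itis`.

    weights=None draws from a flat distribution; decreasing weights
    approximate an exponential distribution that favours short ITIs.
    """
    rng = random.Random(seed)
    onsets, t = [], 0.0
    for _ in range(n_trials):
        onsets.append(t)
        iti = rng.choices(itis, weights=weights, k=1)[0]
        t += trial_duration + iti  # SOA = trial duration + ITI
    return onsets


flat = jittered_onsets(20)
exponential = jittered_onsets(20, weights=(8, 4, 2, 1))
print(flat[:5])
print(exponential[:5])
```

Because the trial duration and every ITI are multiples of 2 s, every onset falls on a volume boundary for an assumed TR of 2 s.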

Assumption of HRF is More Problematic for Event-Related Designs
We know that the standard two-gamma HRF is a mediocre approximation for individual Ss’ HRFs (Handwerker et al., 2004, Neuroimage).
We know this isn’t such a big deal for block designs but it is a bigger issue for rapid event-related designs.

One Approach to Estimation: Counterbalanced Trial Orders
• Each condition must have the same history for preceding trials so that trial history subtracts out in comparisons
• For example, if you have a sequence of Face, Place and Object trials (e.g., FPFOPPOF…), with 30 trials for each condition, you could make sure that the breakdown of trials (second item) with respect to the preceding trial (first item) was as follows:
…Face Face x 10; …Place Face x 10; …Object Face x 10
…Face Place x 10; …Place Place x 10; …Object Place x 10
…Face Object x 10; …Place Object x 10; …Object Object x 10
• Most counterbalancing algorithms do not control for trial history beyond the preceding one or two items
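One way to check first-order counterbalancing is to count how often each condition follows each other condition. The helper names and the short balanced sequence below are made up for illustration; a real design would use many more trials.

```python
from collections import Counter


def transition_counts(sequence):
    """Count (preceding trial, current trial) pairs in a trial sequence."""
    return Counter(zip(sequence, sequence[1:]))


def is_first_order_balanced(sequence, conditions):
    """True if every condition follows every condition equally often."""
    counts = transition_counts(sequence)
    values = {counts[(a, b)] for a in conditions for b in conditions}
    return len(values) == 1


example = "FPFOPPOF"      # the sequence fragment from the slide
balanced = "FFPPOOFOPF"   # assumed sequence: each of the 9 pairs occurs once

print(transition_counts(example))
print(is_first_order_balanced(example, "FPO"))
print(is_first_order_balanced(balanced, "FPO"))
```

The check covers only the immediately preceding trial, which matches the limitation noted in the last bullet.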

Algorithms for Picking Efficient Designs: Optseq2

Algorithms for Picking Efficient Designs: Genetic Algorithms

You Can’t Always Counterbalance
You may be interested in variables for which you cannot control trial sequence
• e.g., subject errors can mess up your counterbalancing
• e.g., memory experiments: remembered vs. forgotten items
• e.g., decision-making: choice 1 vs. choice 2
• e.g., correlations with behavioral ratings

Post Hoc Trial Sorting Example Wagner et al. , 1998, Science

Pros & Cons of Applying Standard GLM to Rapid-ER Designs
Pros
• high detection power
• trials can be put in unpredictable order
• subjects don’t get so bored
Cons and Caveats
• reduced detection compared to block designs
• requires stronger assumptions about linearity
– BOLD is non-linear with inter-event intervals < 6 sec.
– Nonlinearity becomes severe under 2 sec.
• errors in HRF model can introduce errors in activation estimates

Design Types: Mixed Design
[Figure legend: one symbol = trial of one type (e.g., face image); another symbol = trial of another type (e.g., place image)]

Example of Mixed Design
Otten, Henson, & Rugg, 2002, Nature Neuroscience
• used short task blocks in which subjects encoded words into memory
• in some areas, mean level of activity for a block predicted retrieval success

Pros and Cons of Mixed Designs
Pros
• allow researchers to distinguish between state-related and item-related activation
Cons
• sensitive to errors in HRF modelling

Deconvolution of Event-Related Designs Using the GLM




DEconvolution of Single Trials
[Figure: Neuronal Activity, Haemodynamic Function, and BOLD Signal over Time. Slide from Matt Brown]

Deconvolution Example • time course from 4 trials of two types (pink, blue) in a “jittered” design

Summed Activation

Single Stick Predictor
(stick predictors are also called finite impulse response (FIR) functions)
• single predictor for the first volume of the pink trial type

Predictors for Pink Trial Type
• set of 12 predictors for subsequent volumes of pink trial type
• need enough predictors to cover unfolding of HRF (depends on TR)

Predictor Matrix • Diagonal filled with 1’s . . .
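As a rough illustration of the stick (FIR) predictors for one trial type, the sketch below builds the predictor matrix as a list of rows, one row per volume and one column per post-onset volume. The function name `fir_design_matrix`, the onsets, and the run length are assumptions chosen for the example.

```python
def fir_design_matrix(n_volumes, onsets, n_lags=12):
    """Stick predictors for one trial type.

    Column k is 1 at volume (onset + k) for every trial onset, so each
    trial contributes a diagonal of 1's starting at its onset row.
    """
    X = [[0] * n_lags for _ in range(n_volumes)]
    for onset in onsets:
        for lag in range(n_lags):
            row = onset + lag
            if row < n_volumes:  # drop sticks that fall past the end of the run
                X[row][lag] = 1
    return X


X = fir_design_matrix(n_volumes=20, onsets=[2])
for row in X[:6]:
    print(row)
```

With a single trial at volume 2, row 2 has its 1 in column 0, row 3 in column 1, and so on along the diagonal. A full model would place one such block of columns side by side for every trial type.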

Predictors for Pink Trial Type

Predictors for the Blue Trial Type

Predictor x Beta Weights for Pink Trial Type • sequence of beta weights for one trial type yields an estimate of the average activation (including HRF)

Predictor x Beta Weights for Blue Trial Type • height of beta weights indicates amplitude of response (higher betas = larger response)

Overview

A Little Math Problem x+y+z=9 What are x and y and z?

Another Little Math Problem
x + y = 6
x + z = 7
z + y = 5
What are x and y and z?

Solution to Another Little Math Problem
x + y = 6
x + z = 7
z + y = 5
What are x and y and z?
y = 6 - x
z = 7 - x
(7 - x) + (6 - x) = 5
13 - 2x = 5
2x = 13 - 5 = 8
x = 4
y = 6 - x = 6 - 4 = 2
z = 7 - x = 7 - 4 = 3

Comparisons of Two Problems
x + y + z = 9: three unknowns, one equation: unsolvable!
x + y = 6, x + z = 7, z + y = 5: three unknowns, three equations: solvable
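The three-equation problem can also be solved mechanically, which is essentially what a GLM does on a larger scale. Below is a small Gauss-Jordan elimination sketch using exact fractions. The function name `solve` is hypothetical, and the routine assumes a square system.

```python
from fractions import Fraction


def solve(A, b):
    """Solve a square linear system A x = b by Gauss-Jordan elimination."""
    n = len(A)
    # Augmented matrix [A | b] with exact rational arithmetic
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("system has no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]  # scale pivot row to 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [row[-1] for row in M]


# x + y = 6, x + z = 7, y + z = 5 (unknowns ordered x, y, z)
A = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]
b = [6, 7, 5]
x, y, z = solve(A, b)
print(x, y, z)
```

The single equation x + y + z = 9 cannot be passed to this routine at all, since one equation does not pin down three unknowns.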

Why Jitter? Solvable Deconvolution
Miezin et al., 2000

Decon GLM
• 14 predictors (time points) for Cues
• 14 predictors (time points) for Face trials
• 14 predictors (time points) for House trials
• 14 predictors (time points) for Object trials
To find areas that respond to all stims, we could fill the contrast column with +’s …but that would be kind of dumb because we don’t expect all time points to be highly positive, just the ones at the peak of the HRF

Contrasts on Peak Volumes
We can search for areas that show activation at the peak (e.g., 3-5 s after stimulus onset)
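A peak contrast amounts to a vector with +1 on the predictors that fall inside the peak window and 0 elsewhere. The sketch below assumes a TR of 1 s and that predictor k of each condition corresponds to k * TR seconds after onset; both assumptions, and the function names, are illustrative rather than taken from the slides.

```python
def peak_contrast(conditions, n_lags, tr_s, window_s=(3.0, 5.0)):
    """+1 for every FIR predictor whose post-onset time lies in the peak window."""
    contrast = []
    for _ in conditions:
        for lag in range(n_lags):
            t = lag * tr_s
            contrast.append(1 if window_s[0] <= t <= window_s[1] else 0)
    return contrast


def contrast_value(contrast, betas):
    """Weighted sum of beta weights picked out by the contrast."""
    return sum(c * b for c, b in zip(contrast, betas))


conditions = ["Cue", "Face", "House", "Object"]
c = peak_contrast(conditions, n_lags=14, tr_s=1.0)
print(c[:14])  # contrast weights for the first condition
```

With these assumptions, three of the 14 predictors per condition receive a weight, instead of filling the whole contrast column with +’s.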

Results: Peaks > Baseline

Graph beta weights for spike predictors
Get deconvolution time course
Why go to all this bother? Why not just generate an event-related average? …

Pros and Cons of Deconvolution
Pros:
• Produces time course that dissociates activation from trial history
• Does not assume specific shape for hemodynamic function
• Robust against trial history biases (though not immune to them)
• Compound trial types possible (e.g., stimulus-delay-response)
– may wish to include “half-trials” (stimulus without response)
Cons:
• Complicated
• Quite sensitive to noise
• Contrasts don’t take HRF fully into account, they just examine peaks

Not Mutually Exclusive
• Convolution and deconvolution GLMs are not mutually exclusive
• Example: use convolution GLM to detect blobs, use deconvolution to estimate time courses


Take-home message
• Block designs
– Great detection, poor estimation
• Slow ER designs
– Poor detection, great estimation
• Fast ER designs
– Good detection, very good estimation
– Excellent choice for designs where predictability is undesirable or where you want to factor in subject’s behavior

To Localize or Not to Localise?

Hypothetical Example
• The extrastriate body area responds more to human bodies than to other categories of visual stimuli (e.g., human faces, places, objects)
• You want to know if the extrastriate body area responds more to animal bodies vs. animal faces

Voxelwise Analysis
• Perform GLM for a particular contrast at every voxel in the brain
• If you do see activation in the lateral occipitotemporal cortex, is it really EBA?
• If you don’t see activation, maybe your statistical test was too conservative because of the correction for multiple comparisons (i.e., a Type II error)

Region of Interest Analysis
• One solution is to define your regions independently
• Then you can test your contrast in that region at good ol’ p < .05

ROIs can be defined by functional and/or anatomical criteria
[Figure: Functional ROI; Anatomical ROI; Functional-Anatomical ROI. Images from O’Reilly et al., 2012, SCAN]

Localizer can be built into same run as experimental conditions or can be done separately
Step 1: Localize ROI using voxelwise contrasts
• Human bodies > human faces
• Identify EBA
Step 2: Test EBA on contrast of interest
• Animal bodies > animal faces
• Can use simple p < .05

ROIs should be defined independently
• Maybe what we really want to know is whether the difference between human bodies and faces is greater than the difference between animal bodies and faces
[Figure: Face vs. Body responses for Human and Animal stimuli]

Ideally ROIs should be defined independently
One option
• put all four conditions into one run
• Step 1: Identify EBA by human body > human face
• Step 2: Test interaction
[Figure: Face vs. Body responses for Human and Animal stimuli]
However, this suffers from the non-independence error

Non-independence Error
Let’s say on average, this is what really happens in EBA as a whole (ground truth)
[Figure: Face vs. Body responses for Human and Animal stimuli]

Non-independence Error
• But we know that there is also noise in the measurement such that different voxels may have slight differences in effects
[Figure: Face vs. Body responses for Human and Animal stimuli in Voxel 1, Voxel 2, and Voxel 3]
• Based on our selection criteria, we’d be likely to include voxels 1 and 2 in our ROI but not voxel 3
• Thus we may erroneously see a significant interaction based on our selection bias
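The selection bias can be demonstrated with a toy simulation. Everything below is an assumed setup rather than real data: every voxel has the same true body-minus-face difference for human and animal stimuli (so the true interaction is zero), measurement noise is Gaussian, and voxels are selected when their measured human difference exceeds a threshold.

```python
import random


def simulate(n_voxels=2000, true_diff=1.0, threshold=2.0, seed=1):
    """Compare the interaction in selected voxels using same vs. independent data."""
    rng = random.Random(seed)
    same_run, independent_run = [], []
    for _ in range(n_voxels):
        # Body-minus-face differences measured in the run used for selection
        human = true_diff + rng.gauss(0, 1)
        animal = true_diff + rng.gauss(0, 1)
        # The same differences measured in a separate run with fresh noise
        human_new = true_diff + rng.gauss(0, 1)
        animal_new = true_diff + rng.gauss(0, 1)
        if human > threshold:  # ROI selection: human body > human face
            same_run.append(human - animal)
            independent_run.append(human_new - animal_new)
    mean_same = sum(same_run) / len(same_run)
    mean_independent = sum(independent_run) / len(independent_run)
    return mean_same, mean_independent, len(same_run)


biased, unbiased, n_selected = simulate()
print(n_selected, round(biased, 2), round(unbiased, 2))
```

Testing the interaction on the same data used for selection gives a clearly positive value even though the true interaction is zero, while the independent run gives a value near zero.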

Independent Runs
• Because of the non-independence error, we may want to have a separate independent run
[Figure: Localizer run; Experimental Run]
• Benefit: Localizer is now based on data independent from experimental run
• Cost: We have some redundancy between the localizer and experimental run

ROI Defined at Group or Individual Level
[Figure: Group Analysis vs. Individual Analysis (S1, S2, S3, …)]
• The ability to define subject-specific ROIs is one of the advantages of the ROI approach

To Localize or Not to Localise? Neuroimagers can’t even agree how to SPELL localiser/localizer!

Methodological Fundamentalism The latest review I received…

Pros and Cons: Voxelwise Approach
Benefits
• Requires no prior hypotheses about areas involved
• Includes entire brain
• May identify subregions of known areas that are implicated in a function
• Doesn’t require independent data set
Drawbacks
• Requires conservative corrections for multiple comparisons
• vulnerable to Type II errors
• Neglects individual differences in brain regions
• poor for some types of studies (e.g., topographic areas)
• Can lose spatial resolution with intersubject averaging
• Requires speculation about areas involved

Pros and Cons: ROI Approach
Benefits
• Extraction of ROI data can be subjected to simple stats
• Elimination of multiple comparisons problem greatly improves statistical power (e.g., p < .05)
• Hypothesis-driven
• Useful when hypotheses are motivated by other techniques (e.g., electrophysiology) in specific brain regions
• ROI is not smeared due to intersubject averaging
• Important for discriminating abutting areas (e.g., V1/V2)
• Can be useful for dissecting factorial design data in an unbiased manner
Drawbacks
• Neglects other areas that may play a fundamental role
• If multiple ROIs need to be considered, you can spend a lot of scan time collecting localizer data (thus limiting the time available for experimental runs)
• Works best for reliable and robust areas with unambiguous definitions
• Sometimes you can’t find an ROI in some subjects
• Selection of ROIs can be highly subjective and error-prone

ROI and Voxelwise Analyses are NOT mutually exclusive
• You can decide based on the situation/hypotheses
• You can do both ROI analyses and voxelwise analyses
– ROI analyses for well-defined key regions
– Voxelwise analyses to see if other regions are also involved
• Ideally, the conclusions will not differ
• If the conclusions do differ, there may be sensible reasons
– Effect in ROI but not voxelwise
   • perhaps region is highly variable in stereotaxic location between subjects
   • perhaps voxelwise approach is not statistically powerful enough
– Effect in voxelwise but not ROI
   • perhaps ROI is not homogeneous or is context-specific