Time Series Analysis of fMRI
Douglas N. Greve, greve@nmr.mgh.harvard.edu

Overview
• Hemodynamic Response Function (HRF) Model
• Modeling the Entire BOLD Signal
• Contrasts
• Noise Propagation
• Inference
• Design Efficiency
• HRF Model Errors

Modeling of the Hemodynamic Response Function (HRF) and the Entire BOLD Signal

Neuro-vascular Coupling
(Figure: stimulus and resulting HRF, annotated with delay, dispersion, and amplitude.)
• Delay
• Dispersion (spreading out)
• Amplitude

Fundamental Concept
The amplitude of neural firing is monotonically related to the HRF amplitude. Interpretation: if the HRF amplitude for one stimulus is greater than that of another, then the neural firing to that stimulus was greater. The emphasis is on estimating the HRF amplitude by constructing and fitting models of the HRF and the entire BOLD signal.

Parametric Models of the HRF
Gamma Model: converts a neural "impulse" into a BOLD signal (convolution).
Parameters:
• D: delay
• τ: dispersion
• b: amplitude
Dale and Buckner, 1997, HBM 5:329-340. D = 2.25 s, τ = 1.25 s.
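To make the gamma model concrete, here is a minimal Python sketch assuming one common parameterization of the Dale and Buckner gamma HRF, h(t) = ((t-D)/τ)²·exp(-(t-D)/τ) for t ≥ D; the exact functional form and the unit-peak normalization are assumptions, and only the delay and dispersion values come from the slide.

```python
import numpy as np

def gamma_hrf(t, delay=2.25, tau=1.25):
    """Gamma-shaped HRF: ((t-D)/tau)^2 * exp(-(t-D)/tau) for t >= D, else 0.
    The amplitude (b) is applied separately by the GLM fit."""
    s = (t - delay) / tau
    return np.where(t >= delay, s**2 * np.exp(-s), 0.0)

# Sample on a TR grid and normalize to unit peak so a fitted beta can be
# read directly as the response amplitude (the normalization is a choice).
t = np.arange(0, 32, 2.0)   # 0-30 s, TR = 2 s
h = gamma_hrf(t)
h = h / h.max()
```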

Linear System Theory
• Scaling: the output amplitude changes linearly with the input amplitude. E.g., doubling the input doubles the output.
• Superposition: the response to two inputs is the sum of the responses to the individual inputs presented alone.

Linear System Theory: Scaling
Same shape, ½ amplitude: if the amplitude of neural firing drops by a factor of two, the HRF amplitude drops by a factor of two but keeps the same shape.

Linear System Theory: Superposition
Each stimulus creates a response. When the responses overlap, the observable is the sum of the individual waveforms.
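A minimal sketch of superposition in code: the expected signal is the convolution of a stimulus impulse train with the HRF, so overlapping responses simply add. The onset times, run length, and HRF parameters here are illustrative assumptions.

```python
import numpy as np

TR, n_tp = 2.0, 48
onsets = [0, 8, 12]                               # onset times in seconds (assumed)

# Gamma HRF sampled on the TR grid (same parameterization as sketched above).
t = np.arange(0, 32, TR)
s = np.clip((t - 2.25) / 1.25, 0, None)
h = s**2 * np.exp(-s)
h = h / h.max()

stim = np.zeros(n_tp)
stim[(np.array(onsets) / TR).astype(int)] = 1.0   # one impulse per stimulus

# Superposition: overlapping responses add in the convolved signal.
signal = np.convolve(stim, h)[:n_tp]
```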

Estimating the Amplitude
• Assume the shape (i.e., D and τ are known)
• Amplitude is unknown
• Stimulus presented at known times (e.g., t = 0)
• No noise (for now)
(Figure: response with unknown amplitude vs. reference HRF with amplitude = 1.)

Estimating the Amplitude
Forward model: y = xh·bm
• y: observed signal (16×1)
• xh = [h(t1; D, τ), h(t2; D, τ), h(t3; D, τ), ...]ᵀ: assumed HRF shape sampled at the acquisition times (16×1)
• bm = slope = amplitude
16 equations, 1 unknown.

Estimating the Amplitude with Noise
Forward model: y = xh·bm, as above, but y (16×1) now contains noise. xh = [h(t1; D, τ), h(t2; D, τ), h(t3; D, τ), ...]ᵀ (16×1); bm = slope = amplitude. 16 equations, 1 unknown.
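A minimal sketch of the one-regressor fit: with the shape assumed known, the amplitude estimate is the least-squares slope of y against xh. The simulated data, noise level, and true amplitude are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

t = np.arange(0, 32, 2.0)                        # 16 time points, TR = 2 s
s = np.clip((t - 2.25) / 1.25, 0, None)
xh = s**2 * np.exp(-s)
xh = xh / xh.max()                               # assumed HRF shape, unit peak

true_amp = 2.0
y = true_amp * xh + rng.normal(0, 0.5, t.size)   # noisy observation

# One regressor, one unknown: bm = (xh'xh)^-1 xh'y, i.e. the best-fit slope.
bm = (xh @ y) / (xh @ xh)
print(bm)                                        # close to 2.0, up to noise
```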

Multiple Stimulus Presentations
Forward model: y = xh·bm, with all presentations assumed to have the same amplitude. xh (48×1) is the assumed HRF shifted to each onset: xh = [h(t1 - ts1; D, τ), h(t2 - ts1; D, τ), ..., h(t3 - ts2; D, τ), ...]ᵀ, where tsi is the onset time of stimulus i. 48 equations (all data), 1 unknown.

Estimating Amplitude and Offset
Forward model: y = X·b, with X = [xh 1] (48×2: the HRF regressor and a column of ones) and b = [bm bo]ᵀ. bm = slope = amplitude; bo = unknown intercept. 48 equations, 2 unknowns. bm is the deviation from baseline (cf. "demeaning").

Estimating Amplitude, Offset, and Drift
Forward model: y = X·b, with X = [xh 1 d] (48×3: the HRF regressor, a column of ones, and a linear drift column d = [1 2 3 4 ...]ᵀ) and b = [bm bo bd]ᵀ. 48 equations, 3 unknowns. bm is the deviation from the drift.
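A minimal sketch of this three-column GLM solved by ordinary least squares; the onset times, noise level, and signal values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
TR, n_tp = 2.0, 48
t = np.arange(n_tp) * TR

def hrf(t, delay=2.25, tau=1.25):
    s = np.clip((t - delay) / tau, 0, None)
    return s**2 * np.exp(-s)

onsets = [0, 30, 60]                              # seconds (assumed)
xh = sum(hrf(t - on) for on in onsets)            # shifted copies superpose
xh = xh / hrf(np.arange(0, 32, TR)).max()         # unit-peak single response

X = np.column_stack([xh,                          # task regressor -> bm
                     np.ones(n_tp),               # offset         -> bo
                     np.arange(1, n_tp + 1)])     # linear drift   -> bd

y = 2.0 * xh + 100.0 + 0.05 * np.arange(1, n_tp + 1) + rng.normal(0, 0.5, n_tp)
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)                                          # approximately [2.0, 100.0, 0.05]
```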

Multiple Conditions
Forward model: y = X·b, with X = [xh1 xh2] (32×2: one HRF regressor per condition) and b = [bm1 bm2]ᵀ. 32 equations, 2 unknowns.

Putting It All Together
Forward model: y = X·b, with y (Ntp×1) and X (Ntp×10). The columns of X are the two task regressors (xh1, xh2), the offset (column of ones), the linear drift (1, 2, 3, ...), and the six motion-correction regressors (Txi, Tyi, Tzi, Rxi, Ryi, Rzi: translation and rotation at time point i). b = [bm1 bm2 bo bd bTx bTy bTz bRx bRy bRz]ᵀ. Ntp equations, 10 unknowns, DOF = Ntp - 10.

BOLD Signal Modeling Summary
• Raw time course (the observable, y) is modeled as a weighted linear sum of components
• Weights (the b's) are unknown
• Components (regressors) are known
• Components form the columns of the design matrix
• Each column means something
• Components may be task HRF waveforms
• Components may be "nuisance" regressors

Contrasts
The embodiment of a hypothesis.

Contrast Matrix (Univariate)
• Two conditions (E1 and E2)
• X = [xhE1 xhE2], b = [bE1 bE2]ᵀ
• Hypothesis: the responses to E1 and E2 are different
• Null hypothesis (H0): bE1 - bE2 = 0
Statistic: g = C·b = [c1 c2]·[bE1 bE2]ᵀ = c1·bE1 + c2·bE2 = 0 under H0, with c1 = +1, c2 = -1, so C = [+1 -1] (1×2).
g > 0 means E1 > E2; g < 0 means E2 > E1.
The contrast matrix must have the same number of columns as X (= the number of parameters in b).
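A tiny numeric sketch of applying this contrast; the beta values are made up for illustration.

```python
import numpy as np

b = np.array([1.8, 1.2])       # [bE1, bE2] from a GLM fit (illustrative)
C = np.array([[+1.0, -1.0]])   # tests H0: bE1 - bE2 = 0

g = C @ b                      # contrast value
print(g)                       # g > 0 suggests E1 > E2
```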

Four Conditions
X = [xhE1 xhE2 xhE3 xhE4], b = [bE1 bE2 bE3 bE4]ᵀ
• H0: E1 = E2, ignore E3 and E4: C = [+1 -1 0 0], S = 0
• H0: E1 = E3, ignore E2 and E4: C = [+1 0 -1 0], S = 0
• H0: E1 = 0 (main effect of E1), ignore E2, E3, E4: C = [+1 0 0 0], S ≠ 0
(S: the sum of the contrast weights)

Four Conditions
X = [xhE1 xhE2 xhE3 xhE4], b = [bE1 bE2 bE3 bE4]ᵀ
• H0: E1 + E2 = E3 + E4: C = [+1 +1 -1 -1], S = 0
• H0: E1 + E2 = E3, ignore E4: C = [+1 +1 -1 0], S = 1
Example (multimodal): E1 is auditory presented by itself, E2 is visual presented by itself, E3 is simultaneous auditory and visual.

Multivariate Contrasts
X = [xhE1 xhE2 xhE3 xhE4], b = [bE1 bE2 bE3 bE4]ᵀ
H0: E1 = E2 = 0 (E1 = 0 AND E2 = 0), ignore E3 and E4. Reject H0 if E1 ≠ 0 OR E2 ≠ 0.
C10 = [+1 0 0 0] tests E1 = 0; C20 = [0 +1 0 0] tests E2 = 0.
Test both simultaneously with the 2×4 contrast matrix C = [+1 0 0 0; 0 +1 0 0].
What's wrong with C = [+1 +1 0 0]? (It only tests whether bE1 + bE2 = 0, so effects of opposite sign can cancel and be missed.)
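A small numeric sketch (with made-up betas) of why the single summed row is not equivalent to the 2×4 contrast: effects of opposite sign cancel in the sum but not in the joint test.

```python
import numpy as np

b = np.array([1.5, -1.5, 0.0, 0.0])   # E1 and E2 active with opposite sign

C_sum = np.array([[+1, +1, 0, 0]])    # single row: tests bE1 + bE2 = 0
C_mv  = np.array([[+1, 0, 0, 0],
                  [ 0, +1, 0, 0]])    # 2x4: tests bE1 = 0 AND bE2 = 0 jointly

print(C_sum @ b)                      # [0.]        -> the summed contrast misses both effects
print(C_mv @ b)                       # [ 1.5 -1.5] -> the joint (multivariate) test can detect them
```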

Multivariate vs Multi-Univariate
X = [xhE1 xhE2 xhE3 xhE4], b = [bE1 bE2 bE3 bE4]ᵀ
Reject H0 if E1 ≠ 0 OR E2 ≠ 0. Two separate univariate tests: C10 = [+1 0 0 0] tests E1 = 0; C20 = [0 +1 0 0] tests E2 = 0.
Problem: neither may individually exceed threshold, but both in combination may be unlikely under the null.

Multivariate Contrasts
X = [xhE1 xhE2 xhE3 xhE4], b = [bE1 bE2 bE3 bE4]ᵀ
H0: E1 = E2 = E3, ignore E4.
C12 = [+1 -1 0 0] tests E1 = E2; C13 = [+1 0 -1 0] tests E1 = E3. Note: no explicit test for E2 = E3 is needed.
Combined 2×4 contrast: C = [+1 -1 0 0; +1 0 -1 0].

Factorial/ANOVA Interaction Contrast
• Two discrete factors: Gender with two levels (M and F), Handedness with two levels (L and R)
• Four conditions/four regressors: MR (b1), ML (b2), FR (b3), FL (b4)
(Figure: interaction plot of the four betas, with D1 = b3 - b1 and D2 = b4 - b2.)
Interaction: g = D1 - D2 = 0 under H0, i.e. g = (b3 - b1) - (b4 - b2) = -b1 + b2 + b3 - b4, so C = [-1 +1 +1 -1] (univariate).
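A tiny sketch of the interaction contrast applied to illustrative betas ordered [MR, ML, FR, FL].

```python
import numpy as np

b = np.array([1.0, 0.8, 1.1, 1.4])   # bMR, bML, bFR, bFL (illustrative)

C = np.array([[-1, +1, +1, -1]])     # g = (b3 - b1) - (b4 - b2)
g = C @ b
print(g)                             # nonzero -> the factors interact
```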

Noise

Noise Propagation
(Figure: Experiment 1 vs. Experiment 2.)
• The assumed shape is the same
• The actual shape is the same
• The noise is different
• The amplitude estimates (slopes) are different
• Measurement noise creates uncertainty in the estimates

Noise Propagation
(Figure: distribution of the amplitude estimate, with standard deviation σb, next to the null-hypothesis distribution; btrue marked.)
• Centered at 2 (the true amplitude)
• Variance depends on the noise variance, the number of samples, and a few other things
• Uncertainty
• Type II errors (false negatives)
• Type I errors (false positives)

"Student's" t-Test
t-value: tDOF, with DOF = Rows(X) - Cols(X) = TimePoints - Parameters.
Significance (p-value): the area under the curve to the right; the probability "under the null" (bTrue = 0); the false positive rate (Type I error).
(Figure: null-hypothesis distribution with σb, btrue, and the estimate b̂.)
How do we get σb?
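The t formula itself appears as a figure on the slide; the sketch below uses the standard GLM t-statistic, t = C·b̂ / sqrt(σ̂²·C(XᵀX)⁻¹Cᵀ), with DOF = Rows(X) - Cols(X). The design, data, noise level, and use of scipy for the p-value are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_tp = 48
t_ax = np.arange(n_tp) * 2.0
s = np.clip((t_ax % 30 - 2.25) / 1.25, 0, None)   # crude periodic gamma regressor
xh = s**2 * np.exp(-s)
X = np.column_stack([xh, np.ones(n_tp)])          # task + offset
y = 2.0 * xh + 100.0 + rng.normal(0, 0.5, n_tp)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
dof = X.shape[0] - X.shape[1]                     # Rows(X) - Cols(X)
sigma2 = resid @ resid / dof                      # noise variance estimate

C = np.array([[1.0, 0.0]])                        # test the task amplitude
var_g = sigma2 * (C @ np.linalg.inv(X.T @ X) @ C.T)
t_val = (C @ b) / np.sqrt(var_g)
p_val = stats.t.sf(t_val.item(), dof)             # area under the curve to the right
print(t_val.item(), p_val)
```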

Full Model with Noise
y = s + n
• y: observable
• s: signal
• n: noise, modeled as Gaussian, zero-mean, white, with standard deviation σn

Full Model with Noise

Contrasts and the Full Model

Propagation of Noise
Noise in the observable gets transferred to the contrast through (XᵀX)⁻¹. The properties of XᵀX are important: singularity, invertibility, efficiency, condition.

Review: Matrix Inverse
• If M·A = I, then A = M⁻¹
• Complicated in general
• Simple for a 2×2:
M = [m11 m12; m21 m22], D = m11·m22 - m12·m21
M⁻¹ = (1/D)·[m22 -m12; -m21 m11]
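A quick sketch checking the 2×2 formula against numpy; the matrix values are arbitrary.

```python
import numpy as np

def inv2x2(M):
    """Inverse of a 2x2 matrix via the determinant formula on the slide."""
    (m11, m12), (m21, m22) = M
    D = m11 * m22 - m12 * m21
    if D == 0:
        raise ValueError("singular matrix: D = 0, not invertible")
    return np.array([[ m22, -m12],
                     [-m21,  m11]]) / D

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
print(inv2x2(M))
print(np.linalg.inv(M))   # same result
```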

Review: Invertibility (IMPORTANT!!!)
• Not all matrices are invertible
• D = 0: "singular"
M = [1.0 2.0; 0.5 1.0], D = 1.0·1.0 - 2.0·0.5 = 1 - 1 = 0
M⁻¹ = (1/0)·[1.0 -2.0; -0.5 1.0] is undefined.

Review: Singularity and "Ill-Conditioned"
M = [1.0 2.0; 0.5 1.0], D = 1.0·1.0 - 2.0·0.5 = 1 - 1 = 0
• Column 2 = twice column 1: linear dependence
• Ill-conditioned: D is "close" to 0
• Relates to the efficiency of a GLM
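A short numeric sketch of the same point: exactly collinear columns give determinant 0, and nearly collinear columns give a determinant near 0 and a large condition number. The matrices are illustrative.

```python
import numpy as np

M_sing = np.array([[1.0, 2.0],
                   [0.5, 1.0]])        # column 2 = 2 * column 1
print(np.linalg.det(M_sing))           # 0 -> singular, no inverse

M_ill = np.array([[1.0, 2.0],
                  [0.5, 1.01]])        # nearly dependent columns
print(np.linalg.det(M_ill))            # 0.01, close to 0
print(np.linalg.cond(M_ill))           # large condition number -> noise-sensitive estimates
```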

Review: GLM Solution
Fit y vs. Age with intercept b and slope m from two points (x1, y1) and (x2, y2):
[y1; y2] = X·[b; m], with X = [1 x1; 1 x2]
[b; m] = X⁻¹·y, with X⁻¹ = (1/D)·[x2 -x1; -1 1] and D = x2 - x1
• Non-invertible if x1 = x2
• Ill-conditioned if x1 is near x2: sensitive to noise

Noise Propagation through XᵀX

XᵀX for an Orthogonal Design
(Figure: alternating A/B design with regressors xhA and xhB, and the resulting XᵀX.)

Orthogonal Design (Twice as Long)
(Figure: the same A/B design at twice the length, with regressors xhA and xhB, and the resulting XᵀX.)

XᵀX for a Fully Co-linear Design
Auditory and visual stimuli presented simultaneously (A+V): xhA is the auditory regressor, xhB is the visual regressor, and the two columns are identical. XᵀX is singular! Does not work!

XᵀX for a Partially Co-linear Design
Working memory: Encode and Probe. xhA is the encode regressor, xhB is the probe regressor. This design requires a lot of overlap, which reduces the determinant, but this does not mean that it is a bad design. Note: the textbook (HSM) suggests orthogonalizing; this is not a good idea.

Co-linearity and Invertibility
What causes co-linearity?
• Synchronized presentations/responses
• Many regressors: derivatives, basis sets (e.g., FIR), lots of nuisance regressors
• Noise tends to be low-frequency
• Task tends to be low-frequency
• Co-linearity depends only on X
Note: more stimulus presentations generally improve things.

Optimal Rapid Event-Related Design
(Figure: regressors xhA and xhB for a periodic design (N = 3) and a jittered design (N = 21).)
• Lots of overlap in the jittered design
• Should the stimuli be pushed apart so there is no overlap? No overlap means fewer stimuli (for a fixed scanning time)
• How to balance? How to schedule?

Optimal Rapid Event-Related Design
• Efficiency ξ depends only on X and C
• X depends on the stimulus onset times and the number of presentations
• Choose the stimulus onset times to maximize ξ
• Can be done before collecting data!
• ξ can be interpreted as a variance reduction factor
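The slide does not give a formula for ξ; the sketch below assumes one common definition, ξ = 1 / trace(C·(XᵀX)⁻¹·Cᵀ), and compares two onset schedules before any data are collected. The gamma HRF, run length, and schedules are illustrative assumptions.

```python
import numpy as np

def efficiency(X, C):
    """Design efficiency under the assumed definition 1/trace(C (X'X)^-1 C')."""
    return 1.0 / np.trace(C @ np.linalg.inv(X.T @ X) @ C.T)

def build_design(onsets, n_tp=120, TR=2.0):
    """Convolve an onset train with a gamma HRF and add an offset column."""
    t = np.arange(0, 32, TR)
    s = np.clip((t - 2.25) / 1.25, 0, None)
    h = s**2 * np.exp(-s)
    h = h / h.max()
    stim = np.zeros(n_tp)
    stim[(np.asarray(onsets) / TR).astype(int)] = 1.0
    xh = np.convolve(stim, h)[:n_tp]
    return np.column_stack([xh, np.ones(n_tp)])

C = np.array([[1.0, 0.0]])
periodic = build_design(np.arange(0, 240, 40))     # 6 widely spaced onsets
rng = np.random.default_rng(3)
jittered = build_design(np.sort(rng.choice(np.arange(0, 236, 4), 21, replace=False)))
print(efficiency(periodic, C), efficiency(jittered, C))   # compare schedules before scanning
```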

Hemodynamic Response Function Model Error

HRF Model Error
• Systematic deviation between the assumed shape and the actual shape
• Sources: delay, dispersion; form (undershoot); duration (stimulus vs. neural); non-linearity

HRF Delay Error
• Delay error of 1 sec
• Loss of amplitude/slope (bias), smaller t-values
• Larger "noise" (i.e., residual error), not fixed by more acquisitions
• Scaling still preserved

HRF Bias vs. Duration and Delay Error
• As the stimulus gets longer, the bias gets smaller

HRF Bias vs. Duration and Shape Error
• HSM Fig. 9.10
• As the stimulus gets longer, the overall shapes become similar even if the individual shapes are very different

Does Model Error Matter?
• More false negatives (Type II errors): loss of amplitude; more noise (usually trivial compared to other sources)
• Scaling still preserved
• Systematic errors across subjects, brain areas, time, …

When Does Model Error Matter?
(Figure: actual vs. estimated responses and their difference for Sample 1 and Sample 2.)
• The samples have the same true amplitude but different delays
• The estimated amplitudes are systematically different
• False positives (or false negatives)
• The samples could be from different populations (normals vs. schizophrenics), times (longitudinal), or brain regions

Adding Derivatives

Still Have a Problem
• The fit to the observed waveform is better (the residual variance is less)
• But now there are two betas
• xh and its derivative are orthogonal, so bm does not change (the bias is not removed)
• No good solution to the problem

Finite Impulse Response (FIR) Model

Finite Impulse Response (FIR) Model
y = X·b. Each column of X is a train of 1s marking a fixed delay after each stimulus onset (first stimulus onset, onset + 1 TR, onset + 2 TR, ..., second stimulus onset, onset + 1 TR, ...); rows look like [1 0 0 0 0], [0 1 0 0 0], [0 0 1 0 0], [0 0 0 1 0], [0 0 0 0 1], [0 0 0 0 0]. The parameters b = [b1 b2 b3 b4 b5]ᵀ are the average response at stimulus onset and at delays of 1, 2, 3, and 4 TRs.
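A minimal sketch of building an FIR design matrix; the run length, TR, number of delays, and onsets are illustrative assumptions.

```python
import numpy as np

n_tp, TR, n_delays = 40, 2.0, 5
onsets_s = [0, 16, 40]                   # stimulus onsets in seconds (assumed)

X = np.zeros((n_tp, n_delays))
for on in onsets_s:
    idx = int(on / TR)
    for d in range(n_delays):            # one column per post-stimulus delay
        if idx + d < n_tp:
            X[idx + d, d] = 1.0

# Fitting y = X @ b by least squares then gives b[d] = the average signal
# d TRs after stimulus onset, i.e. an HRF estimate with no assumed shape.
```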