Temporal Basis Functions & Correlated Regressors
Gary Price & Patti Adank
fMRI for Dummies, 29-03-06
Correlated Regressors (or: the trouble with multicollinearity…)
by (a slightly puzzled) Patti Adank
Sources:
• Will Penny
• Rik Henson's slides: www.mrccbu.cam.ac.uk/Imaging/Common/rik.SPM-GLM.ppt
• previous years' presenters' slides
Correlations between regressors
§ in multiple regression analysis:
  § problems for behavioural data
  § behavioural example (fictional)
  § solutions
§ in the General Linear Model:
  § problems for neuroimaging data
  § PET example
  § solutions?
Multiple Regression Analysis & Correlated Regressors
Multiple regression analysis
§ Multiple regression characterises the relationship between several independent variables (or regressors), X1, X2, X3, etc., and a single dependent variable, Y:

  Y = β1X1 + β2X2 + … + βLXL + ε

§ The X variables are combined linearly and each has its own regression coefficient β (weight)
§ The βs reflect the independent contribution of each regressor, X, to the value of the dependent variable, Y
§ i.e. the proportion of the variance in Y accounted for by each regressor after all other regressors are accounted for
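The model above can be fitted by ordinary least squares. A minimal sketch in Python/NumPy, with made-up data and coefficients purely for illustration:

```python
import numpy as np

# Hypothetical data: two regressors and a noisy outcome
rng = np.random.default_rng(0)
n = 100
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 2.0 * X1 + 0.5 * X2 + rng.normal(scale=0.1, size=n)

# Design matrix with an intercept column
X = np.column_stack([np.ones(n), X1, X2])

# OLS estimate: beta = (X'X)^-1 X'Y
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

# R^2: proportion of variance in Y explained by the model
Y_hat = X @ beta
r2 = 1 - np.sum((Y - Y_hat) ** 2) / np.sum((Y - np.mean(Y)) ** 2)
print(beta, r2)
```

With uncorrelated regressors, as here, the recovered βs sit close to the true values and each one can be interpreted on its own; the slides below show what breaks when the columns of X are correlated.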
Multiple regression analysis
§ Fit a straight line through the points for Y and X
§ some statistics: if the model fits the data well:
  - R² is high (reflects the proportion of variance in Y explained by the regressor X)
  - the corresponding p value will be low
Multiple regression analysis: multicollinearity
§ multiple regression results are sometimes difficult to interpret:
  § the overall p value of the fitted model is very low,
  § but the individual p values for the regressors are high
§ this means that the model fits the data well, even though none of the X variables individually has a significant impact on predicting Y
§ How is this possible?
§ it is caused when two (or more) regressors are highly correlated: a problem known as multicollinearity
Regression analysis: multicollinearity
§ When is multicollinearity between regressors a problem?
  § no: when you just want to predict Y from X1 and X2, the values of R² and p will be correct
  § yes: when you want to assess how individual regressors affect the dependent variable:
    - individual p values can be misleading: a p value can be high even though the variable is important
    - the confidence intervals on the regression coefficients are very wide and may include zero: you cannot be confident whether an increase in the X value is associated with an increase, or a decrease, in Y
Regression analysis: measuring multicollinearity
§ Measures for multicollinearity:
  § In general: if r > 0.8 between regressors, multicollinearity can be expected
  § In SPSS:
    - Tolerance: the proportion of a regressor's variance not accounted for by the other regressors in the model; low tolerance values are an indicator of multicollinearity
    - Variance Inflation Factor (VIF): the reciprocal of the tolerance; large VIF values are an indicator of multicollinearity
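Both measures are easy to compute directly: regress each column of the design on the remaining columns, take the R² of that fit, and form tolerance = 1 − R² and VIF = 1/tolerance. A sketch (the function name is hypothetical):

```python
import numpy as np

def tolerance_and_vif(X):
    """For each column of X, regress it on the remaining columns.
    Tolerance = 1 - R^2 of that regression; VIF = 1 / tolerance."""
    n, k = X.shape
    out = []
    for j in range(k):
        # All other columns, plus an intercept
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        y = X[:, j]
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1 - resid.var() / y.var()
        tol = 1 - r2
        out.append((tol, 1.0 / tol))
    return out
```

A strongly collinear pair of columns yields tolerance near 0 and a VIF well above 10; an independent column yields a VIF near 1.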
Regression analysis: multicollinearity example
§ Question: how well can the perceived clarity of an auditory stimulus be predicted from the loudness and frequency of that stimulus?
§ a perception experiment in which subjects had to judge the clarity of an auditory stimulus
§ model to be fitted: Y = β1X1 + β2X2 + ε
  Y = judged clarity of the stimulus
  X1 = loudness
  X2 = frequency
Regression analysis: multicollinearity example
§ What happens when X1 (loudness) and X2 (frequency) are collinear, i.e., strongly correlated?
§ Correlation between loudness & frequency: 0.945 (p < 0.001)
§ high loudness values correspond to high frequency values
Regression analysis: multicollinearity example
§ Contribution of individual predictors:
§ X1 (loudness) entered as the sole predictor:
  Y = 0.859 X1 + 24.41
  R² = 0.74 (74% explained variance in Y), p < 0.001
§ X2 (frequency) entered as the sole predictor:
  Y = 0.824 X2 + 26.94
  R² = 0.68 (68% explained variance in Y), p < 0.001
Regression analysis: multicollinearity example
§ Collinear regressors X1 and X2 entered together:
  Resulting model: Y = 0.756 X1 + 26.94 (X2?)
  R² = 0.74 (74% explained variance in Y), p < 0.001
§ Individual regressors:
  X1 (loudness): p < 0.001
  X2 (frequency): R² = 0.555, p = 0.594
§ note: adding X2 barely raises R² beyond X1 alone, and X2 is no longer significant
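The widening of the confidence intervals can be reproduced numerically. The sketch below uses fictional loudness/frequency data in the spirit of the slide's example (all numbers are made up): the standard error of the loudness coefficient is several times larger once the collinear frequency regressor is added, because collinearity inflates the diagonal of (X'X)⁻¹.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
loudness = rng.normal(60, 10, size=n)
# frequency strongly correlated with loudness (fictional data)
frequency = loudness + rng.normal(0, 3, size=n)
clarity = 0.8 * loudness + rng.normal(0, 5, size=n)

# Joint model: intercept + loudness + frequency
X = np.column_stack([np.ones(n), loudness, frequency])
beta, *_ = np.linalg.lstsq(X, clarity, rcond=None)
resid = clarity - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
# Standard errors from the diagonal of sigma^2 (X'X)^-1;
# collinearity inflates (X'X)^-1, hence wide confidence intervals
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
print(beta, se)
```

Refitting with loudness alone gives a much smaller standard error for its coefficient, which is exactly the pattern in the table above: a good joint fit, but unstable individual estimates.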
Regression analysis: removing multicollinearity
§ How to deal with collinearity:
  1. Increase the sample size (there is no data like more data)
  2. Orthogonalise the correlated regressor variables
    - using factor analysis
    - this will produce linearly independent regressors and corresponding factor scores
    - these factor scores can subsequently be used instead of the original correlated regressor values
General Linear Model & Correlated Regressors
General Linear Model
• the General Linear Model can be seen as an extension of multiple regression (or: multiple regression is just a simple form of the General Linear Model):
  – Multiple regression only looks at ONE Y variable
  – the GLM allows you to analyse several Y variables in a linear combination (the time series in a voxel)
  – ANOVA, the t-test, the F-test, etc. are also forms of the GLM
General Linear Model and fMRI

  Y = X · β + ε

§ Y: observed data; the BOLD signal at various time points at a single voxel
§ X: design matrix; the components which explain the observed data, i.e. the BOLD time series for the voxel:
  - timing information: onset vectors, Omj, and duration vectors, Dmj
  - the HRF, hm, which describes the shape of the expected BOLD response over time
  - other regressors, e.g. realignment parameters
§ β: parameters; define the contribution of each component of the design matrix to the value of Y
§ ε: error/residual; the difference between the observed data, Y, and that predicted by the model, Xβ
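A toy construction of one column of X, along the lines sketched above: a boxcar built from an onset vector and a duration, convolved with a double-gamma HRF and sampled at the TR. The HRF parameters below are the commonly used defaults (peak around 6 s, undershoot around 16 s), assumed here rather than taken from the slide, and the event timings are hypothetical:

```python
import numpy as np
from math import gamma

def hrf(t):
    """Double-gamma HRF: positive gamma peaking ~6 s minus a
    smaller undershoot peaking ~16 s (standard default parameters)."""
    return t**5 * np.exp(-t) / gamma(6) - (t**15 * np.exp(-t) / gamma(16)) / 6

TR = 2.0
n_scans = 100
t = np.arange(n_scans) * TR

# Hypothetical stimulus onsets (s) and a common duration (s)
onsets = [10, 50, 110]
duration = 4.0
boxcar = np.zeros(n_scans)
for o in onsets:
    boxcar[(t >= o) & (t < o + duration)] = 1.0

# Task regressor = boxcar convolved with the HRF, truncated to the run
h = hrf(np.arange(0, 32, TR))
regressor = np.convolve(boxcar, h)[:n_scans]

# Minimal design matrix: task regressor plus a constant term
X = np.column_stack([regressor, np.ones(n_scans)])
```

Each further condition or nuisance variable (e.g. realignment parameters) would simply add more columns to X, and β is then estimated per voxel exactly as in the multiple-regression case.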
fMRI: constructing the design matrix
§ In analysing fMRI data, the problem of multicollinearity occurs when specifying the regressors in the design matrix
§ If the regressors are linearly dependent (correlated), then the results of the GLM are not easy to interpret:
  § variance attributable to an individual regressor may be confounded with other regressor(s)
  § this may lead to misinterpretations of activations in certain brain areas
fMRI: an example
§ for example:
  - suppose that the response to a stimulus, Sr, is highly correlated with the associated motor response, Mr
  - and suppose it is hypothesised that a specific region's activity for Sr is not influenced by Mr
  - then this region should be tested only after removing, from the regressor for Mr, all the variance that can be explained by Sr
  - dangerous: since the motor response does influence the signal in the region, the test will be overly significant → variance is wrongly assigned to Sr
fMRI: PET example
§ Andrade et al. (1999). Ambiguous Results in Functional Neuroimaging Data Analysis Due to Covariate Correlation. NeuroImage, 10, 483-486
§ Andrade et al. show that correlated regressors can lead to misinterpretations
§ they collected PET data from a single subject and generated a covariate (regressor) that correlated strongly with the activation conditions used in the experiment (0 for rest, 1-6 increasing linearly with the activation levels in the experiment)
fMRI: PET example
§ two purposes:
  1. detect areas where the signal correlated with the generated covariate
  2. search for differences between activation and control periods
§ This implies fitting two models:
  § one with the activation-vs-rest and covariate regressors (correlated at r = 0.845):
    M = C1 (act-rest) + C2 (covariate)
  § one in which the variance shared with C1 is removed from the covariate:
    M* = C1 + C2*, with C2* = C2 − 0.845 · √(SSC2/SSC1) · C1
fMRI: PET example
§ For models M and M*:
  § standard SPM processing
  § the parameters for C1 and C2/C2* were tested using t-tests and transformed into z-scores
§ Results:
  § differences between M and M* occurred only for activation related to C1 (the rest/activation regressor)
  § e.g., parahippocampal activation was significant in M but not in M*
  § left precuneal, superior temporal and medial frontal activity was significant in M* but not in M
fMRI: PET example
§ Example voxels:
  (54, −56, 34): activated in M (p = 0.004), not in M* (p = 0.901)
  (6, 28, −28): activated in M* (p = 0.014), not in M (p = 0.337)
fMRI: dealing with multicollinearity
§ Andrade et al. suggest a technique using the F-statistic to orthogonalise correlated regressors without having to re-estimate the β parameters (which can be very time-consuming), using principles from linear model theory (Christensen, 1996)
§ Another technique used to remove correlations between regressors: Gram-Schmidt orthogonalisation (cf. Rik Henson's slides)

Christensen (1996). Plane Answers to Complex Questions: The Theory of Linear Models. Springer-Verlag, Berlin
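Gram-Schmidt (serial) orthogonalisation can be sketched in a few lines. Note that column order matters: shared variance is assigned to whichever regressor comes first, which is exactly the interpretational danger flagged in the Sr/Mr example above. The function name is hypothetical:

```python
import numpy as np

def gram_schmidt_orthogonalise(X):
    """Serially orthogonalise the columns of X: from each column,
    remove its projection onto every earlier (already orthogonalised)
    column. Order matters: shared variance stays with earlier columns."""
    X = X.astype(float).copy()
    n, k = X.shape
    for j in range(1, k):
        for i in range(j):
            X[:, j] -= (X[:, i] @ X[:, j]) / (X[:, i] @ X[:, i]) * X[:, i]
    return X
```

After orthogonalisation the β for a later column reflects only the variance that column explains over and above the earlier ones, so the choice of ordering is itself a modelling decision.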
Dealing with multicollinearity in SPM
§ Use the toolbox "Design Magic" - multicollinearity assessment for fMRI designs in SPM 99 (SPM 5?)
§ Author: Matthijs Vink
§ URL: http://www.matthijs-vink.com/tools.html
§ Allows you to assess the multicollinearity in your fMRI design by calculating the amount of each factor's variance that is also accounted for by the other factors in the design (expressed as R²)
§ also allows you to reduce correlations between regressors through the use of high-pass filters
Conclusion
§ When fitting a model in multiple regression analysis, or when constructing your design matrix, correlations between regressors can lead to misinterpretations of the influence of the independent variables on the dependent variable
§ Multicollinearity is a hassle, but it can be dealt with, usually through orthogonalisation procedures involving (groups of) regressors
The end