Applying multivariate techniques to high-dimensional temporally correlated fMRI data


Journal of Statistical Planning and Inference 141 (2011) 3760–3770


Applying multivariate techniques to high-dimensional temporally correlated fMRI data

Daniela Adolf a,*, Sebastian Baecke a, Waltraud Kahle b, Johannes Bernarding a, Siegfried Kropf a

a Institute for Biometry and Medical Informatics, Otto-von-Guericke University Magdeburg, Leipziger Str. 44, 39120 Magdeburg, Germany
b Institute for Mathematical Stochastics, Otto-von-Guericke University Magdeburg, Universitätsplatz 2, 39106 Magdeburg, Germany

Article history: Received 8 February 2010; received in revised form 9 June 2011; accepted 15 June 2011; available online 25 June 2011.

Abstract

In first-level analyses of functional magnetic resonance imaging data, adjustments for temporal correlation such as a Satterthwaite approximation or a prewhitening method are usually implemented in the univariate model to keep the nominal test level. In doing so, the temporal correlation structure of the data is estimated, assuming an autoregressive process of order one. We show that this is applicable in multivariate approaches too – more precisely in the so-called stabilized multivariate test statistics. Furthermore, we propose a blockwise permutation method including a random shift that renders an approximation of the temporal correlation structure unnecessary and still approximately keeps the nominal test level in spite of the dependence of sample elements. Although the intentions are different, a comparison of the multivariate methods with the multiple testing approaches shows that the global approach may achieve advantages if applied to suitable regions of interest. This is illustrated using an example from fMRI studies. © 2011 Elsevier B.V. All rights reserved.

Keywords: High-dimensional data; Autoregressive process; Stabilized multivariate tests; Block-wise permutation including a random shift

1. Introduction

In many applications, high-dimensional records occur where the number of variables may even exceed the number of measurements. This also applies to the so-called first-level analyses in functional magnetic resonance imaging (fMRI). These data reflect a proband's brain activity per voxel, which is a three-dimensional volume element in the brain (Frackowiak et al., 2003). In fMRI analyses, the spatial resolution is often at least about 64 × 64 over 32 slices, resulting in 131,072 voxels. The brain activity is measured voxel-wise by means of the so-called BOLD (blood oxygen level dependent) signal response, indicated by the local oxygen level change in an activated brain region. This leads to a temporal series of measurements per voxel. As usual in standard software such as SPM (2009), we consider the voxels as variables and the scans as sample elements. Depending on the paradigm, successive scans are made in intervals of 2 s, which finally results in a few hundred measurements. This way, the number n of measurements may be much smaller than the number p of available variables.

Often general linear models are applied to estimate the influence of stimuli on the BOLD signal response and therefore to detect activation. But a conventional multivariate test statistic cannot manage this relation of sample size to number of

* Corresponding author. Tel.: +49 3916713548; fax: +49 3916713536.
E-mail addresses: [email protected] (D. Adolf), [email protected] (S. Baecke), [email protected] (W. Kahle), [email protected] (J. Bernarding), [email protected] (S. Kropf).
0378-3758/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2011.06.012


variables. Furthermore – as in the univariate case – one has to take into account that the sample elements are dependent, which goes beyond the scope of classical statistical analyses.

Usually a univariate general linear model is designed for every voxel separately, where all voxels share the same design matrix characterizing the stimulus presentation along with movement corrections and other possible confounders. Only the corresponding parameter values may change between voxels. Then contrasts can be tested using an adapted t test statistic. This results in a large number of p-values, which are either assessed without adjustment by using a very low threshold (e.g. 0.001) or via familywise error (FWE) and false discovery rate (FDR) controlling methods.

There have also been multivariate methods to analyze all voxels of a region of interest (ROI) as a whole. Kriegeskorte et al. (2006) propose the so-called searchlight method. They regard signals from neighboring voxels and compute the Mahalanobis distance, centering one voxel at any one time. Supplementary to the univariate approach, a region's covariance structure is included in such multivariate models. Furthermore, Hassabis et al. (2009) use this searchlight approach to implement a multivariate pattern classification for decoding neuronal ensembles.

We will discuss here another approach of applying a multivariate model to an ROI, using the so-called stabilized multivariate test statistics as proposed by Läuter et al. (1996, 1998). This class of tests is designed to cope with high-dimensional data, utilizing redundancy between variables for smoothing and thus increasing the power of the tests. These tests can handle arbitrary correlation structures between the variables. Usually, they are based on the assumption of independent sample vectors, but some special work for balanced multivariate mixed models is done by Glimm (2000). It is the aim of this paper to investigate an adaptation of these tests, originally developed for independent sample vectors, to autocorrelated observation vectors and to apply them to data from an fMRI study.

In Section 2 we first give a short review of commonly used uni- and multivariate analyses, especially for high-dimensional data, briefly explaining the idea of stabilized multivariate tests. Section 3 covers the situation of dependence between sample elements and the univariate adjustment strategies for this kind of correlation in functional magnetic resonance imaging. In Section 4, we propose an adaptation of these strategies for multivariate tests and a corrected correlation estimation. Furthermore, we utilize an adjusted permutation method that takes the dependence of sample elements into account and propose a random shift to improve this method. We use simulated data to check the type I error for the different methods in Section 5. Finally, we apply the adjusted methods to experimental functional magnetic resonance imaging data. A discussion completes the paper.

2. Statistical model and solutions for independent sample elements

For brain activation analyses, a general linear model (GLM) is assumed as

$$Y_{n\times p} = X_{n\times s} B_{s\times p} + E_{n\times p}, \qquad E \sim N_{n\times p}(0,\, V_{n\times n} \otimes R_{p\times p}), \qquad (1)$$

where Y is the observation matrix, containing the BOLD signal response from p voxels in n scans. X is the design matrix with s linearly independent columns, B the parameter matrix, and E contains the residuals, characterized by their spatial (R) and temporal (V) covariance structures. We assume a constant variance over time; this constant can then be factored out and subsumed into the spatial covariance matrix, so that V equates to a correlation matrix. Using standard ordinary least squares theory, the unknown parameter matrix B is estimated by

$$\hat{B} = (X'X)^{-1} X' Y. \qquad (2)$$

Hypotheses to be tested are related to contrasts of the parameter matrix B,

$$H_0:\; c'B = 0,$$

where we restrict ourselves here to hypotheses of dimension 1 as in the one- or two-sample problem or in regression analysis (Fahrmeir et al., 2003). We begin by considering the simple case $V = I_n$ and regard the univariate as well as the classical and stabilized multivariate analysis procedures for this kind of model.

2.1. Univariate analyses

Focussing first on a single voxel, (1) reduces to the univariate model with p = 1 and a residual vector following a multivariate normal distribution $e \sim N(0, \sigma^2 V)$. In classical linear model theory, e follows a multivariate normal distribution with expectation 0 and covariance matrix $\sigma^2 I_n$, so that the sample elements are independent. Then the null hypothesis $H_0: c'b = 0$ can be tested by using the conventional t test statistic

$$t = \frac{c'\hat{b}}{\sqrt{\hat{\sigma}^2\, c'(X'X)^{-1}c}} \qquad (3)$$

with $\hat{\sigma}^2 = (y - X\hat{b})'(y - X\hat{b})/(n-s)$ and $n-s$ degrees of freedom.
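As a concrete illustration (our own sketch, not taken from the paper), the following Python/NumPy code assembles the OLS estimate (2) and the t statistic (3) for a single voxel; the design matrix, contrast and data in the toy example at the end are arbitrary placeholders.

```python
import numpy as np
from scipy import stats

def glm_t_test(y, X, c):
    """OLS fit of y = Xb + e and t test of H0: c'b = 0, cf. (2) and (3)."""
    n, s = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b_hat = XtX_inv @ X.T @ y                                   # OLS estimate (2)
    resid = y - X @ b_hat
    sigma2_hat = (resid @ resid) / (n - s)                      # residual variance
    t = (c @ b_hat) / np.sqrt(sigma2_hat * (c @ XtX_inv @ c))   # t statistic (3)
    p = 2 * stats.t.sf(abs(t), df=n - s)                        # two-sided p-value
    return t, p

# toy example: intercept plus one regressor, no true effect
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.standard_normal(200)])
y = rng.standard_normal(200)
print(glm_t_test(y, X, c=np.array([0.0, 1.0])))
```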


2.2. Multivariate analyses

Returning to the multivariate model (1), we observe p voxels and their series of scans in parallel. For that purpose, we first consider the conventional multivariate case, which requires that the number p of variables is smaller than the number n of sample elements. Then we turn to the stabilized methods, where the number of variables may exceed the sample size.

2.2.1. Conventional multivariate analyses

In the conventional multivariate case, classical multivariate statistics can be applied, e.g. Wilks' Λ. It is built from the determinants of the sums of squares and cross products matrices G and H (Ahrens and Läuter, 1981; Anderson, 1984):

$$\Lambda = \frac{|G|}{|H+G|}, \qquad (4)$$

where

$$H = \hat{B}'c\,(c'(X'X)^{-1}c)^{-1}\,c'\hat{B} = Y'Q_H Y \qquad (5)$$

with $Q_H = X(X'X)^{-1}c\,(c'(X'X)^{-1}c)^{-1}\,c'(X'X)^{-1}X'$ describes the deviation from the null hypothesis, and

$$G = (Y - X\hat{B})'(Y - X\hat{B}) = Y'Q_G Y \qquad (6)$$

with $Q_G = I_n - X(X'X)^{-1}X'$ characterizes the residuals. The null hypothesis is rejected for $\Lambda < U_{p,m,df,\alpha}$, where $U_{p,m,df,\alpha}$ is a quantile of Wilks' Λ distribution with m = 1 being the rank of the contrast matrix, $df = n-s$ the remaining degrees of freedom for the residuals and α being the level of significance. There are special cases (including the case m = 1 given here) where Λ can be transformed into an F-distributed test statistic; in the other cases one has good F approximations (Pillai and Gupta, 1969).

2.2.2. Stabilized multivariate tests

Because the number of variables may exceed the sample size in fMRI data – sometimes even when considering just a region of interest in the brain – conventional multivariate statistics exceed their limits because the matrices H and G according to (5) and (6) become singular. But one can apply the so-called stabilized multivariate methods proposed by Läuter et al. (1996, 1998). These are based on univariate or low-dimensional ($q$, $1 \le q < \min(p, n-s)$) linear scores with data dependent coefficients ensuring a left-spherical score distribution under the null hypothesis, which makes the method work exactly. The scores are then analyzed instead of the original data (Läuter et al., 1998) via classical multivariate test statistics such as Wilks' Λ (cf. Section 2.2.1).

In more detail, the $n \times q$-dimensional scores Z are generated by multiplying the observation matrix with a $p \times q$-dimensional weight matrix D,

$$Z = YD, \qquad (7)$$

where D may be any function of the total sums of squares and cross products matrix

$$W = H + G$$

with H and G from (5) and (6). Furthermore, it has to be ensured that Z is of rank q with probability 1.

In the simplest case, a one-dimensional score is computed by means of a weight vector

$$d = \mathrm{Diag}(W)^{-1/2}\, 1_p$$

with $1_p$ being a p-dimensional vector consisting of ones. The score can then be analyzed via the standard univariate t test statistic. This approach is known as the standardized sum test (SS test), because the effects of the scores are summed up after being scaled. For this reason, the test is useful only if one can assume equally directed effects of similar relative size in the variables.

An alternative is the principal component test (PC test) based on a q-dimensional score. Here, the weight matrix D is computed by means of the eigenvalue problem of the total sums of squares and cross products matrix W,

$$WD = \mathrm{Diag}(W)\, D K \quad\text{with}\quad D'\,\mathrm{Diag}(W)\,D = I_q,$$

where K contains the q largest eigenvalues of W on the diagonal and D includes the corresponding eigenvectors. In the SS test and in the above version of the PC test, different scales are eliminated. When approximately equal scales are expected, then a scale dependent version of the PC test is given by

$$WD = DK \quad\text{with}\quad D'D = I_q.$$

In both cases, the score matrix generated according to (7) can then be tested via a conventional multivariate test statistic such as Wilks' Λ (4), using the dimension q instead of p. The score dimension q can be fixed in advance or can be derived from the data as a function of W.
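A minimal sketch (our own, in Python/NumPy, not part of the paper) of how the SS score and the two PC-test score variants might be computed from W; the score dimension q and the input matrices are placeholders.

```python
import numpy as np

def ss_score(Y, W):
    """Standardized sum score z = Y d with d = Diag(W)^{-1/2} 1_p."""
    d = 1.0 / np.sqrt(np.diag(W))
    return Y @ d                                   # n-dimensional score

def pc_scores(Y, W, q, scale_invariant=True):
    """q-dimensional PC-test scores Z = Y D, cf. (7).

    scale_invariant=True solves W D = Diag(W) D K with D' Diag(W) D = I_q,
    otherwise the scale dependent version W D = D K with D' D = I_q.
    """
    if scale_invariant:
        S = np.diag(1.0 / np.sqrt(np.diag(W)))     # Diag(W)^{-1/2}
        vals, vecs = np.linalg.eigh(S @ W @ S)     # standardized eigenvalue problem
        D = S @ vecs[:, ::-1][:, :q]               # back-transform, largest q eigenvalues first
    else:
        vals, vecs = np.linalg.eigh(W)
        D = vecs[:, ::-1][:, :q]
    return Y @ D                                   # n x q score matrix
```

The resulting scores can then be passed to an ordinary t test (SS test) or to Wilks' Λ with dimension q instead of p (PC test), as described above.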


2.3. Permutation tests

The above uni- and multivariate test statistics are developed as parametric tests for (multivariate) normal data. Of course, they can also be used in permutation tests, where the parametric assumptions are replaced by assumptions regarding the interchangeability of sample elements under the null hypothesis (Kropf, 2000). In the multivariate case, it is important that sample vectors are permuted as a whole.

Alternatively, we do not permute the observations but the design matrix. In particular, this is possible if the contrast to be tested is related to only one column of the design matrix. Then this single column is permuted. If the contrast c concerns several columns of X, we transform the design matrix into another vector basis, using the normalized vector $\tilde{c}_0$ as first basis vector and $s-1$ orthogonal vectors $\tilde{c}_j$ ($j = 1,\dots,s-1$) as complementary basis vectors. Then the first column $X\tilde{c}_0$ of the transformed design matrix is permuted, whereas the other columns $X\tilde{c}_j$ are kept fixed. This way, all possible null hypotheses of rank 1 can be tested via permutation. Extensions to other test dimensions are straightforward.

3. Univariate correction strategies for correlated sample elements

Considering the univariate general linear model ((1) with p = 1), we now turn to the sample element dependence ($V \ne I_n$), which is modeled as a first order autoregressive process by

$$V = \begin{pmatrix}
1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\
\rho & 1 & \ddots & \ddots & \vdots \\
\rho^2 & \ddots & \ddots & \ddots & \rho^2 \\
\vdots & \ddots & \ddots & 1 & \rho \\
\rho^{n-1} & \cdots & \rho^2 & \rho & 1
\end{pmatrix} \qquad (8)$$

with ρ being the (usually unknown) autocorrelation coefficient. This means that an observation $y_t$ is directly influenced by its temporally preceding element $y_{t-1}$, associated with noise at time t.

The sample element dependence must not be ignored, because the higher the correlation, the more the error of the first kind may exceed its limit. This is visible with regard to a simulated common univariate two-sample test problem for mean differences under the null hypothesis using a standard t test ($n_1 = n_2 = 200$, α = 0.05, 50,000 replications) and ignoring the embedded temporal autocorrelation. In line with fMRI analyses, here only non-negative values of the temporal correlation coefficient are of interest. An increasing correlation coefficient results in an increasing type I error in this setting: for ρ = 0.4, already every fifth true null hypothesis is rejected, and for higher correlation coefficients, up to more than half of the decisions are false positives.

3.1. Satterthwaite approximation

To deal with autocorrelation, one approach in SPM (a MATLAB software package for neuroimaging data; SPM, 2009) is to adapt the test for dependence of sample elements by a Satterthwaite approximation (Frackowiak et al., 2003). The OLS estimate $\hat{b}$ of the parameter vector b does not change with a correlation matrix $V \ne I_n$, although it is no longer the Gauss–Markov estimate. But the covariance matrix of the vector of parameter estimates changes to

$$\Sigma_{\hat{b}} = \sigma^2 (X'X)^{-1} X'VX (X'X)^{-1} = \sigma^2 X^- V X^{-\prime}$$

with $X^- = (X'X)^{-1}X'$ being the Moore–Penrose inverse of X. Therefore, the test statistic (3) is adapted according to

$$t = \frac{c'\hat{b}}{\sqrt{\hat{\sigma}^2\, c'(X^- V X^{-\prime})c}} \sim t_\nu, \qquad (9)$$

where $\hat{\sigma}^2$ is given by

$$\hat{\sigma}^2 = \frac{y'Ry}{\operatorname{trace}(RV)}$$

with $R = I - XX^-$ being the residual forming matrix. The denominator of (9) is not exactly $\chi^2$-distributed anymore. That is why a Satterthwaite approximation for the degrees of freedom is derived, yielding a $\chi^2$ approximation with first and second moments equal to those of the true distribution of the denominator. According to that, the degrees of freedom ν of the t distribution in (9) are given by

$$\nu = \frac{(\operatorname{trace}(RV))^2}{\operatorname{trace}((RV)(RV))}. \qquad (10)$$
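The following sketch (ours, in Python/NumPy) puts (8)–(10) together for a single voxel; V is built from a given ρ, which in practice would be the (corrected) estimate discussed in Section 4.3.

```python
import numpy as np
from scipy import stats

def ar1_corr(n, rho):
    """AR(1) correlation matrix V as in (8)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def satterthwaite_t_test(y, X, c, V):
    """Adjusted t statistic (9) with Satterthwaite degrees of freedom (10)."""
    n = len(y)
    X_pinv = np.linalg.inv(X.T @ X) @ X.T          # X^- = (X'X)^{-1} X'
    R = np.eye(n) - X @ X_pinv                     # residual forming matrix
    RV = R @ V
    b_hat = X_pinv @ y
    sigma2_hat = (y @ R @ y) / np.trace(RV)
    t = (c @ b_hat) / np.sqrt(sigma2_hat * (c @ X_pinv @ V @ X_pinv.T @ c))
    nu = np.trace(RV) ** 2 / np.trace(RV @ RV)     # effective degrees of freedom (10)
    return t, 2 * stats.t.sf(abs(t), df=nu)
```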


3.2. Prewhitening

Another approach to deal with sample element dependence is to use a prewhitening step. That means that the temporal correlation is eliminated before proceeding with the standard test. Based on the distribution of the residuals $e \sim N(0, \sigma^2 V)$ and assuming a known temporal correlation V, this can be achieved by multiplying all parts of the univariate general linear model with the inverse of a square root of the correlation matrix:

$$y^* = V^{-1/2} y = V^{-1/2} X b + V^{-1/2} e =: X^* b + e^*.$$

Thus a model with standard distribution $e^* \sim N(0, \sigma^2 I_n)$ is obtained, on which the classical t test statistic (3) can be applied using $X^*$ and $y^*$ instead of X and y. If V is unknown, then an estimate is used instead and the test becomes approximate.
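A sketch (ours) of the prewhitening transformation; the symmetric square root via an eigendecomposition is one possible choice of $V^{-1/2}$.

```python
import numpy as np

def prewhiten(y, X, V):
    """Multiply y and X with V^{-1/2} so that the residuals become uncorrelated."""
    vals, vecs = np.linalg.eigh(V)                       # V is symmetric positive definite
    V_inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T
    return V_inv_sqrt @ y, V_inv_sqrt @ X                # y* and X* for the standard tests
```

The transformed $y^*$ and $X^*$ (or, column-wise, $Y^*$) can then be analyzed with the unadjusted statistics of Section 2.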

4. Corresponding adjustment of multivariate tests

4.1. Satterthwaite approximation

In the Satterthwaite-approximated univariate test statistic used in SPM, the temporal correlation is taken into account in the hypothesis related part of the statistic (cf. Section 3.1). We adapt this Satterthwaite approximation correspondingly in the multivariate case. Therefore, we include the correlation matrix V in the hypothesis related sums of squares and cross products matrix

$$H^* = Y' Q_H^* Y$$

with $Q_H^* = X(X'X)^{-1}c\,\big(c'(X'X)^{-1}X'VX(X'X)^{-1}c\big)^{-1}\,c'(X'X)^{-1}X'$. The total sums of squares and cross products matrix $W^*$ is then computed via $W^* = H^* + G$. Using a multivariate test statistic such as Wilks' Λ with the modified matrix $W^*$ instead of W, the degrees of freedom $n-s$ are substituted by ν according to (10). Analogous to the univariate case, the distribution of G is then not exactly a Wishart distribution but is approximated.
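For a rank-1 contrast (c a vector), $Q_H^*$ simplifies to a scaled outer product; a sketch (ours, not from the paper) of the computation of $H^*$ and $W^*$:

```python
import numpy as np

def satterthwaite_sscp(Y, X, c, V):
    """Hypothesis SSCP matrix H* with the temporal correlation V built in, and W* = H* + G."""
    n = Y.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    a = X @ XtX_inv @ c                                        # n-vector X(X'X)^{-1} c
    scale = c @ XtX_inv @ X.T @ V @ X @ XtX_inv @ c            # scalar for a rank-1 contrast
    Q_H_star = np.outer(a, a) / scale
    H_star = Y.T @ Q_H_star @ Y
    G = Y.T @ (np.eye(n) - X @ XtX_inv @ X.T) @ Y              # residual SSCP, cf. (6)
    return H_star, H_star + G                                  # H* and W*
```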

4.2. Prewhitening

As a matter of course, prewhitening methods with a known matrix V yield an exact test statistic in the multivariate case as well, because only the dependence of measurements is prewhitened whereas a possible dependence between variables remains untouched. Via prewhitening, a standard multivariate normal distribution is obtained (cf. Section 3.2), on which a classical or stabilized multivariate test, respectively, can be based (cf. Section 2.2). For an unknown correlation matrix V, again the estimation problem occurs and the test becomes approximate.

4.3. Correlation estimates

In practice, it is necessary to estimate the temporal correlation of the data for both adaptation techniques considered before. This is done by assuming an autoregressive process of first order and just estimating the correlation coefficient ρ in (8), e.g. based on the Yule–Walker equations (Schlittgen and Streitberg, 2001). Here the autocorrelation of a series $y = (y_1, \dots, y_n)'$ is estimated by regarding the correlation between the series $(y_1, \dots, y_{n-1})'$ and itself shifted by a lag of one, $(y_2, \dots, y_n)'$. However, we use the transformed vectors $\tilde{y} = Ry$ in order to remove the expectations according to the linear model. Assuming an equal temporal correlation behavior in all variables, the estimate is averaged over the variables in order to yield stable results.

Simulation studies (p = 100, 10,000 replications) show that for small samples, the mean correlation estimate is quite far from the real value, e.g. n = 200, ρ = 0.2, $\hat\rho = 0.184$, root mean squared error $\mathrm{RMSE}_{\hat\rho} = 0.017$. Of course, this can be improved by increasing the sample size. But actually, the estimation of the correlation coefficient is biased. The expected value given by Schlittgen and Streitberg (2001) can be simplified to the expected value for the coefficient of a first order autoregressive process (cf. Appendix),

$$E(\hat\rho) = \rho - \frac{1}{n}\,[1 + 4\rho] + O(n^{-2}).$$

Substituting $E(\hat\rho)$ with $\hat\rho$ and omitting $O(n^{-2})$ yields the expectation corrected estimate

$$\hat\rho_{corr.} = \frac{n\hat\rho + 1}{n-4}. \qquad (11)$$

Our simulation studies show that – although the estimates are better than the uncorrected ones in terms of the mean value and the RMSE can be reduced to almost half (n = 200, ρ = 0.2, $\hat\rho_{corr.} = 0.193$, $\mathrm{RMSE}_{\hat\rho_{corr.}} = 0.010$) – this, too, works well only for quite large samples. In the subsequent simulation studies we will always use the corrected estimates.
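The bias correction (11) is easy to apply in practice. The following sketch (ours) uses one common variant of the lag-one estimate on the model residuals of each voxel, averages over voxels, and then applies (11); the exact form of the lag-one estimator is an assumption of the example.

```python
import numpy as np

def ar1_estimate_corrected(Y, X):
    """Averaged lag-1 autocorrelation of the GLM residuals, bias corrected as in (11).

    Y: n x p data matrix, X: n x s design matrix.
    """
    n = Y.shape[0]
    R = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T   # residual forming matrix
    E = R @ Y                                          # residuals, one column per voxel
    num = (E[:-1] * E[1:]).sum(axis=0)                 # lag-1 cross products per voxel
    den = (E ** 2).sum(axis=0)
    rho_hat = np.mean(num / den)                       # average over the p voxels
    return (n * rho_hat + 1) / (n - 4)                 # expectation corrected estimate (11)
```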


4.4. Adjustment of permutation tests

Using classical permutation methods, sample elements are assumed to be independent. Correlations between sample vectors interfere with the interchangeability under the null hypothesis, and this may lead to an excess of the test level. Using an adjusted permutation test to ensure the test level (instead of an adjusted parametric test statistic), the results are no longer dependent on a proper estimation of the correlation coefficient.

4.4.1. Block-wise permutation tests

One approach to take the sample element correlation into account is to permute blocks of neighboring elements instead of single values (or vectors in the multivariate case, respectively). Blocks of adjacent elements (or vectors) are considered as single elements and permuted as a whole. This way, under the null hypothesis, the blocks must be (at least approximately) exchangeable, and the number of possible permutations is now restricted by the number of blocks k. Building such blocks means sectioning an n-dimensional vector (or matrix) of measurements into consecutive blocks of length l ($1 \le l \le n/2$). If the block length l is not a divisor of the sample size, $n \bmod l = r$ ($0 < r < l$), then the last block contains $l + r$ elements. In the minimum case l = 1, there are k = n blocks of one single element each, which corresponds to the classical element-wise permutation principle. In the maximum case l = n/2, there are only k = 2 blocks to permute.

In order to get approximately independent blocks, a quite large block length l has to be chosen, because an element correlates with the corresponding element in a neighboring block with correlation coefficient $\rho^l$. Even for high correlations such as 0.8, $\rho^l \le 0.0001$ is reached with a block length of at least l = 42. Nevertheless, neighboring elements belonging to different blocks are still highly correlated. But the longer the blocks, the more this correlation between blocks becomes marginal compared to the correlations within a block. On the other hand, there still have to be enough blocks to reach a sufficiently detailed permutation distribution. In fMRI data these two requirements are hardly compatible because of the relatively small number of measurements.

4.4.2. Block-wise permutation tests including a random shift

We propose a method to increase the number of possible permutations in spite of a large block length. We use a so-called random shift before the block permutation itself. In each block permutation step, a random number $a \in \mathbb{Z}$ ($0 \le a \le n-1$) of sample vectors is removed from the top of the matrix to be permuted and added at its bottom. Then the time series is divided into consecutive blocks and permuted. Here, a is simulated using a uniform distribution. Thereby, the possible number of permutations is increased by the factor l if the block length l is a divisor of the sample size n. If l is not a divisor of the sample size, it is even increased by the factor n. Furthermore, the effect of unintentionally occurring synchronicity of the permutation design and the experimental paradigm is reduced by using this random shift (a code sketch of one such permutation step is given below).

5. Simulation studies

5.1. Design of simulation studies

For assessing the influence of sample element dependence on the unadapted as well as adapted tests under conditions close to real fMRI studies, we simulate data with fMRI characteristics including temporal and spatial correlations.
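Returning to the block-wise permutation with random shift of Section 4.4.2, the following sketch (ours, in Python/NumPy) generates the row indices for one permutation step; applying the indices to the relevant design matrix column (or to the rows of Y) gives one permuted data set.

```python
import numpy as np

def block_permutation_indices(n, block_length, rng):
    """One draw: random shift of the time series, then permutation of whole blocks.

    The last block absorbs the remainder if block_length is not a divisor of n.
    """
    shift = rng.integers(n)                            # random shift a in {0, ..., n-1}
    order = np.roll(np.arange(n), -shift)              # move the first a elements to the bottom
    k = n // block_length                              # number of blocks
    blocks = [order[i * block_length:(i + 1) * block_length] for i in range(k - 1)]
    blocks.append(order[(k - 1) * block_length:])      # last block keeps the remainder
    block_order = rng.permutation(k)                   # permute the blocks as a whole
    return np.concatenate([blocks[j] for j in block_order])

# usage, e.g. for n = 420 scans and block length 50 (8 blocks):
rng = np.random.default_rng(1)
idx = block_permutation_indices(n=420, block_length=50, rng=rng)
# x_perm = x[idx]   # apply to the permuted design matrix column of Section 2.3
```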
A typical phenomenon of fMRI data is the delay of the neurovascular coupling, which is a complicated interaction between oxyhemoglobin, deoxyhemoglobin, blood flow and blood volume (Friston et al., 2000). This neurovascular coupling finally results in the so-called haemodynamic response function (HRF), which has to be taken into account when creating the design matrix. Therefore, the onset vectors are convolved with the HRF. A conventional approximation for this function is a double gamma function

$$G_{HRF}(x) = K_1 \left(\frac{x}{A_1}\right)^{a_1} e^{-(x-A_1)/b_1} - K_2 \left(\frac{x}{A_2}\right)^{a_2} e^{-(x-A_2)/b_2}$$

with

$$a_i = 8\ln(2)\,\frac{A_i^2}{W_i^2} \quad\text{and}\quad b_i = \frac{W_i^2}{8\ln(2)\,A_i}, \qquad i \in \{1,2\},$$

where $A_i$ is the time to peak of the ith gamma function in seconds, $W_i$ is the full width at half maximum or minimum of the ith gamma function in seconds and $K_i$ is the scale parameter of the ith gamma function. One version of the HRF is shown in Fig. 1, where $A_1 = 5.40$, $W_1 = 2.50$, $K_1 = 1.00$, $A_2 = 9.00$, $W_2 = 3.35$ and $K_2 = 0.25$ (Hollmann et al., 2008).

The temporal correlation of the observed BOLD signals is usually assumed to behave like an autoregressive process of first order. Thus, when simulating data, we model the signal with the design matrix including the delay by the HRF and the noise as an AR(1) process.
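A sketch (ours) of the double gamma HRF with the parameter values quoted above; in the simulations the stimulus onset vector would be convolved with this function before entering the design matrix (the onset vector in the comment is only a placeholder).

```python
import numpy as np

def double_gamma_hrf(t, A1=5.40, W1=2.50, K1=1.00, A2=9.00, W2=3.35, K2=0.25):
    """Double gamma HRF with times to peak A_i, FWHM W_i and scales K_i (in seconds)."""
    def gamma_part(t, A, W, K):
        a = 8.0 * np.log(2.0) * A ** 2 / W ** 2
        b = W ** 2 / (8.0 * np.log(2.0) * A)
        return K * (t / A) ** a * np.exp(-(t - A) / b)
    return gamma_part(t, A1, W1, K1) - gamma_part(t, A2, W2, K2)

# sample the HRF on a 20 s grid (cf. Fig. 1); a stimulus onset vector would be convolved with it
t = np.arange(0.0, 20.1, 0.1)
hrf = double_gamma_hrf(t)
# regressor = np.convolve(onsets, hrf)[:n_scans]   # 'onsets' and 'n_scans' are placeholders
```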

3766

D. Adolf et al. / Journal of Statistical Planning and Inference 141 (2011) 3760–3770

Fig. 1. Hemodynamic response function (plotted over time in seconds) for the specified parameter values of a double gamma function.

Fig. 2. Simulated fMRI signal design before convolution.

Spatially, it is established that the brain consists of differentiable regions that are concerned with special functions, e.g. the motor cortex or the auditory cortex. In our simulations we arbitrarily assume three functional regions that are independent of each other and have a compound symmetry correlation structure inside. Therefore, we first simulate independent normally distributed data according to $N_{n\times p}(0, I_n \otimes I_p)$ and then multiply them with the quadratic symmetric square root of the AR(1) correlation matrix V (8) from the left and with the square root of the spatial correlation matrix R from the right, in our case

$$R = \begin{pmatrix} CS & 0 & 0 \\ 0 & CS & 0 \\ 0 & 0 & CS \end{pmatrix}
\quad\text{with}\quad
CS = \begin{pmatrix} 1 & \rho_s & \cdots & \rho_s \\ \rho_s & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \rho_s \\ \rho_s & \cdots & \rho_s & 1 \end{pmatrix}.$$

We arbitrarily choose $\rho_s = 0.5$ and run simulations for $\rho = 0.0, 0.1, 0.2, \dots, 0.9$ for the temporal correlation (a sketch of this noise construction is given in the code below).

We simulate an fMRI block design, adding a signal (cf. Fig. 2) to the baseline noise over arbitrarily chosen 500 voxels. We assume 20 runs that contain a baseline and a stimulus period each. The stimulus period comprises six scans. The events are pseudorandomly separated ("jittered") to avoid habituation. This means that the baseline period consists of 12 scans plus, randomly, one to four additional scans, which sums up to a total of 420 scans (sample elements) in this setting. This composition is quite conventional in real fMRI studies. We convolve this design with the HRF before using the information in the design matrix.

To analyze the empirical type I error, we simulate under the null hypothesis using 10,000 replications and a nominal test level of α = 0.05, and test for a significant impact of the stimulus on the outcome. Due to the computing time required, for permutation tests we only use 5000 replications with 249 random permutations each. According to the simulated spatial correlation structure, we choose a three-dimensional PC test. For univariate analyses, a t test statistic is calculated for each of the 500 variables. All simulations are made in SAS/IML 9.2 (SAS, 2008).

5.2. Simulation results for type I error

In simulations for correlated data (ρ > 0), the empirical type I error of the unadjusted tests increases up to (on average) 22% using a t test statistic and 38% for three-dimensional PC tests. An element-wise permutation on correlated observations also leads to type I error violations of this magnitude.
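The paper's simulations were run in SAS/IML; purely as an illustration of the noise construction described in Section 5.1, a NumPy sketch is given here. The region sizes are arbitrary placeholders, since only the total of 500 voxels and the three-region structure are specified above.

```python
import numpy as np
from scipy.linalg import sqrtm, block_diag

def simulate_noise(n, rho, rho_s=0.5, region_sizes=(167, 167, 166), seed=None):
    """n x p Gaussian noise with AR(1) temporal and block compound symmetry spatial correlation."""
    rng = np.random.default_rng(seed)
    idx = np.arange(n)
    V = rho ** np.abs(idx[:, None] - idx[None, :])          # temporal AR(1), cf. (8)
    cs_blocks = []
    for m in region_sizes:                                  # three mutually independent regions
        CS = np.full((m, m), rho_s)
        np.fill_diagonal(CS, 1.0)
        cs_blocks.append(CS)
    R = block_diag(*cs_blocks)
    E = rng.standard_normal((n, sum(region_sizes)))         # N(0, I_n x I_p) noise
    return np.real(sqrtm(V)) @ E @ np.real(sqrtm(R))        # V^{1/2} E R^{1/2}

E = simulate_noise(n=420, rho=0.4)
```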


Table 1
Rejection rates for t tests, scale dependent and scale independent PC tests and SS tests with different parametric adjustment strategies, depending on the parameter value (the corrected estimation of the correlation coefficient is used; n = 420 sample elements, p = 500 variables, 10,000 replications; for t tests the minimal as well as the mean p-value are shown).

Adjustment     | Test statistic         | ρ = 0.0      | ρ = 0.4      | ρ = 0.8
---------------|------------------------|--------------|--------------|-------------
Satterthwaite  | t test (min; mean)     | 0.046; 0.051 | 0.045; 0.051 | 0.044; 0.050
               | PC test (scale dep.)   | 0.054        | 0.014        | < 0.001
               | PC test (scale indep.) | 0.053        | 0.014        | < 0.001
               | SS test                | 0.052        | 0.022        | < 0.001
Prewhitening   | t test (min; mean)     | 0.045; 0.051 | 0.046; 0.051 | 0.044; 0.050
               | PC test (scale dep.)   | 0.054        | 0.052        | 0.050
               | PC test (scale indep.) | 0.053        | 0.052        | 0.050
               | SS test                | 0.052        | 0.051        | 0.051

Table 2
Rejection rates for original and adjusted permutation tests based on the test statistics of the t test, scale dependent and scale independent PC test and SS test, depending on the parameter value (n = 420 sample elements, p = 500 variables, 249 permutations, 5000 replications; for t tests the minimal as well as the mean p-value are shown).

Adjustment                        | Test statistic         | ρ = 0.0      | ρ = 0.4      | ρ = 0.8
----------------------------------|------------------------|--------------|--------------|-------------
Block permutation,                | t test (min; mean)     | 0.040; 0.048 | 0.051; 0.061 | 0.036; 0.043
block length 5 (84 blocks)        | PC test (scale dep.)   | 0.045        | 0.082        | 0.045
                                  | PC test (scale indep.) | 0.046        | 0.082        | 0.041
                                  | SS test                | 0.050        | 0.070        | 0.041
Block permutation,                | t test (min; mean)     | 0.039; 0.048 | 0.050; 0.060 | 0.033; 0.041
block length 5 (84 blocks),       | PC test (scale dep.)   | 0.046        | 0.083        | 0.036
incl. random shift                | PC test (scale indep.) | 0.047        | 0.083        | 0.036
                                  | SS test                | 0.047        | 0.066        | 0.037
Block permutation,                | t test (min; mean)     | 0.036; 0.046 | 0.038; 0.047 | 0.036; 0.044
block length 50 (8 blocks)        | PC test (scale dep.)   | 0.049        | 0.052        | 0.043
                                  | PC test (scale indep.) | 0.049        | 0.052        | 0.043
                                  | SS test                | 0.048        | 0.050        | 0.043
Block permutation,                | t test (min; mean)     | 0.038; 0.048 | 0.038; 0.048 | 0.035; 0.044
block length 50 (8 blocks),       | PC test (scale dep.)   | 0.045        | 0.053        | 0.039
incl. random shift                | PC test (scale indep.) | 0.045        | 0.053        | 0.040
                                  | SS test                | 0.045        | 0.050        | 0.043
Block permutation,                | t test (min; mean)     | 0.021; 0.028 | 0.022; 0.028 | 0.020; 0.028
block length 100 (4 blocks)       | PC test (scale dep.)   | 0.029        | 0.030        | 0.027
                                  | PC test (scale indep.) | 0.031        | 0.032        | 0.028
                                  | SS test                | 0.028        | 0.029        | 0.022
Block permutation,                | t test (min; mean)     | 0.039; 0.048 | 0.038; 0.048 | 0.038; 0.045
block length 100 (4 blocks),      | PC test (scale dep.)   | 0.045        | 0.049        | 0.043
incl. random shift                | PC test (scale indep.) | 0.048        | 0.049        | 0.043
                                  | SS test                | 0.049        | 0.045        | 0.042

Table 1 shows the empirical type I error of data simulated under the null hypothesis (according to the design described in Section 5.1) for uncorrelated, moderately and highly temporally correlated data in the parametric versions. Table 2 includes the analogous empirical type I errors for the permutation tests. Due to the number of simulations (10,000 and 5000 replications, respectively; cf. Section 5.1), the empirical type I error would be expected within [0.0457; 0.0543] and [0.0440; 0.0560], respectively, if the procedure really keeps its level (95% confidence interval).

First of all, each uni- and multivariate method and their different parametric adjustments ensure the nominal test level of α = 0.05 in the case of independence (ρ = 0.0). The permutation tests are an exception insofar as they become conservative when the possible number of permutations is small due to a large block length and the random shift is not used. But including the shift, these tests also come close to the nominal test level when there is no temporal correlation, even for very few blocks.

In the univariate case, the Satterthwaite approximation as well as the prewhitening remove the level excess completely. But for multivariate analyses, the Satterthwaite approximation causes an extremely conservative behavior of the test statistics, which is intensified with increasing correlations. Despite the estimation problem for the parameter ρ, the prewhitening method ensures the nominal test level for the multivariate test statistics, too.

3768

D. Adolf et al. / Journal of Statistical Planning and Inference 141 (2011) 3760–3770

With increasing block length, the block-wise permutation decreases the deviation from the nominal test level seen in element-wise permutations. With small blocks, such deviations occur in both directions. But for block lengths of at least 50, the empirical type I error lies within the 95% confidence interval of the nominal test level. For much larger block lengths, the sole block permutation becomes conservative – due to the small number of possible permutations. However, this can be largely avoided by using the random shift.

6. Application to a depth-by-disparity study

To demonstrate the application of our multivariate adjusted methods in comparison to the univariate ones, we use data of a depth-by-disparity study (Baecke et al., 2009). The aim of this study was to investigate the extension and localization of activation that is evoked by depth-by-disparity stimuli. The study was performed in 26 subjects on a 3 Tesla scanner using a 64 × 64 matrix over 32 slices, resulting in n = 250 scans (cf. Baecke et al., 2009, for technical details). The probands saw a random dot pattern with and without disparity via anaglyph techniques for 1 s each in a pseudo-random order, each followed by a baseline condition varying from 2 to 6 s. Almost all individuals reported a stable depth perception, measured as the difference between the disparity and the simple random dot condition, which is accompanied by a motion perception. For the demonstration here, we have chosen one proband, who achieved slightly worse results, to apply our multivariate methods in parallel to the univariate procedures. In these data, a few unadjusted significant voxels are found in the univariate analyses by the standard fMRI data analysis software SPM, but applying the implemented FWE adjustment (Bonferroni as well as Gaussian Random Fields), no effect can be observed.

In the study, several regions were detected with different activation levels. We focus on a region in the right middle temporal gyrus (MT+/V5) (cf. Fig. 3), which is highly motion-sensitive as well as sensitive to depth perception by disparity (Baecke et al., 2009). Centering at the area's activation maximum, we analyze a 9 × 9 × 9 region of interest (729 voxels) for the main effect of depth perception. The estimated temporal correlation coefficient is $\hat\rho = 0.168$. All results are shown in Table 3.

Looking at the univariate results of the different adjustment methods, the Satterthwaite approximation yields a minimal p-value of 0.00014. Without adjustment for the familywise error, there are 63 (of 729) significant voxels (p ≤ 0.05). But using a Bonferroni correction, no significant voxels are left. Prewhitening yields similar results: 64 voxels show a significant effect of depth perception when omitting the adjustment for multiple testing, but no voxels fall into this category after Bonferroni adjustment. Using block permutation, slightly fewer effects can be found in the univariate case. The results vary between 21 and 23 unadjusted significant voxels using block lengths between 40 and 80 scans including the random shift, but they also do not hold when assessed with FWE adjustment.

Furthermore, the multivariate results are presented in Table 3. Here, we cannot determine the single significant voxels but provide evidence of significantly affected regions, ensuring the nominal test level even in the case of a large number of variables (voxels).

Fig. 3. SPM: sagittal (left), coronal (middle) and transversal (right) view of depth perception activations in the right middle temporal gyrus (MT+/V5; p ≤ 0.001 unadjusted).

Table 3
Results for depth perception using the t test, SS test, scale dependent and scale independent PC test with different adjustment strategies (the corrected estimation of the correlation coefficient is used; n = 250 scans, p = 729 voxels; 10,000 permutations; for t tests the minimal p-value is shown).

Adjustment                            | t test unadjusted | t test FWE adjusted | SS test | PC test (scale dep.) | PC test (scale indep.)
--------------------------------------|-------------------|---------------------|---------|----------------------|-----------------------
Satterthwaite                         | < 0.001           | 0.103               | 0.187   | 0.011                | 0.291
Prewhitening                          | < 0.001           | 0.100               | 0.122   | 0.013                | 0.053
Block permutation incl. random shift: |                   |                     |         |                      |
  Block length 40                     | 0.003             | 1                   | 0.257   | 0.001                | 0.046
  Block length 60                     | 0.006             | 1                   | 0.237   | 0.002                | 0.049
  Block length 80                     | 0.007             | 1                   | 0.271   | 0.005                | 0.046


The standardized sum test cannot account for depth perception in the chosen region of interest, neither in the parametric adjustments nor via block permutation including the random shift. Obviously, the majority of inefficient voxels dilutes the effects in the few effective voxels too much. The scale dependent PC test detects significance within our region of interest for all adjustment methods. The use of block permutation including the random shift yields the smallest p-values, of at most 0.005 for all tested block lengths. The scale independent PC test, however, cannot account for significant voxels in the Satterthwaite adjusted version, but it shows a tendency via prewhitening. This can be improved using block permutation including the random shift, which yields significant p-values.

7. Discussion

We could show that adjustment strategies for temporally correlated fMRI data – as they are usually applied in the univariate linear model approach – can in general be used in stabilized multivariate tests. To better keep the nominal test level, the prewhitening method should be preferred to the Satterthwaite approximation. An explanation for the poor performance of the Satterthwaite approximation could be the empirical observation that the temporal correlation of the raw data is different from that of the principal components derived from them and that even the different components have different correlations. Furthermore, the OLS estimate of the parameter vector is no longer the best linear unbiased estimate due to the temporal correlation.

In either of these parametric adjustments, however, the temporal correlation structure of the data has to be approximated, i.e. an autoregressive process of first order is assumed and its correlation coefficient estimated. Thus, there is always an approximation based on the data. Hence, we adapted a non-parametric approach based on simple permutations and proposed a so-called random shift. Using a block-wise permutation including that random shift respects the temporal correlation of the sample elements and additionally ensures the nominal test level. But one has to pay attention to the chosen block length, which must be large enough to cope with the dependence of sample elements. Based on our simulations, for fMRI data we suggest choosing a block length not smaller than 50 scans. In combination with the random shift, the nominal test level can then be ensured. In the example, we could show that the results do not differ importantly when using a block length of at least 40. The simulation results regarding the type I error control for the three stabilized multivariate tests included here were quite similar for each respective correlation adjustment method.

For the application, we chose fMRI data of one proband in a depth perception study, whose activation in a special region of interest (ROI) could not be detected when multiple adjustment was used. The standardized sum test was not sensitive enough to yield significance, which may be caused by the data structure in the ROI: the region affected by depth perception is only a small part of the chosen voxels. It turned out that the scale dependent principal component test had the highest power to detect significance in this region, namely for all adjustment strategies. It is superior to the scale independent version because functional MRI data in different regions and activation levels are of similar physical dimension; thus the scale independence is not necessary here.
In fact, multivariate tests on fMRI data cannot substitute for a multiple testing analysis, but due to their higher power, activated regions can be found which would otherwise have remained undetected.

Acknowledgements

This work has been supported by the German Research Foundation (DFG, grant KR 2231/3-1).

Appendix A. Derivation of the expectation correction for ρ

The expected value for a first order autoregressive process given by Schlittgen and Streitberg (2001) is

$$E(\hat\rho) = \rho - \frac{1}{n}\left[\rho + (1-\rho)\frac{f(0)}{\gamma_0} + 2\zeta_1 - 2\rho\zeta_0\right] + O(n^{-2}) \qquad (12)$$

with

$$\zeta_c = \frac{\rho^{|c|}}{1-\rho^2}\left(1 + \rho^2 + |c|(1-\rho^2)\right),$$

so that

$$\zeta_0 = \frac{1+\rho^2}{1-\rho^2} \quad\text{and}\quad \zeta_1 = \frac{2\rho}{1-\rho^2}.$$

The covariance function of an AR(1) process is

$$\gamma_\tau = \rho^{|\tau|}\,\frac{\sigma_e^2}{1-\rho^2},$$

3770

D. Adolf et al. / Journal of Statistical Planning and Inference 141 (2011) 3760–3770

where τ represents the lag. For τ = 0 it is

$$\gamma_0 = \frac{\sigma_e^2}{1-\rho^2}.$$

The spectrum f(0), which is the Fourier transform of the covariance function, is

$$f(0) = \frac{\sigma_e^2}{1+\rho^2 - 2\rho\cos(2\pi\cdot 0)} = \frac{\sigma_e^2}{1+\rho^2-2\rho}.$$

Thus, the expected value of an AR(1) correlation coefficient estimate (12) can be simplified to

$$E(\hat\rho) = \rho - \frac{1}{n}\left[\rho + (1-\rho)\frac{1-\rho^2}{1+\rho^2-2\rho} + 2\,\frac{2\rho}{1-\rho^2} - 2\rho\,\frac{1+\rho^2}{1-\rho^2}\right] + O(n^{-2}) = \rho - \frac{1}{n}\,[1+4\rho] + O(n^{-2}).$$

References

Ahrens, H., Läuter, J., 1981. Multivariate Varianzanalyse, second ed. Akademie-Verlag, Berlin.
Anderson, T., 1984. An Introduction to Multivariate Statistical Analysis, second ed. John Wiley & Sons, New York.
Baecke, S., Lützkendorf, R., Tempelmann, C., Müller, C., Adolf, D., Scholz, M., Bernarding, J., 2009. Event-related functional magnetic resonance imaging (efMRI) of depth-by-disparity perception: additional evidence for right-hemispheric lateralization. Experimental Brain Research 196, 453–458.
Fahrmeir, L., Künstler, R., Pigeot, I., Tutz, G., 2003. Statistik, fourth ed. Springer, Berlin, Heidelberg.
Frackowiak, R., Friston, K., Frith, C., Dolan, R., Price, C., Zeki, S., Ashburner, J., Penny, W., 2003. Human Brain Function, second ed. Academic Press.
Friston, K., Mechelli, A., Turner, R., Price, C., 2000. Nonlinear responses in fMRI: the balloon model, Volterra kernels, and other hemodynamics. NeuroImage 12, 466–477.
Glimm, E., 2000. Spherical tests in balanced multivariate mixed models. Biometrical Journal 42 (8), 937–950.
Hassabis, D., Chu, C., Rees, G., Weiskopf, N., Molyneux, P., Maguire, E., 2009. Decoding neuronal ensembles in the human hippocampus. Current Biology 19 (7), 546–554.
Hollmann, M., Mönch, T., Baecke, S., Luchtmann, M., Bernarding, J., 2008. New experimental setup using individual hemodynamic response functions used to increase the statistical power in event-related real-time fMRI (erfMRI). In: 53. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds). German Medical Science GMS Publishing House.
Kriegeskorte, N., Goebel, R., Bandettini, P., 2006. Information-based functional brain mapping. PNAS 103 (10), 3863–3868.
Kropf, S., 2000. Hochdimensionale multivariate Verfahren in der medizinischen Statistik. Shaker Verlag, Aachen.
Läuter, J., Glimm, E., Kropf, S., 1996. New multivariate tests for data with an inherent structure. Biometrical Journal 38, 5–23.
Läuter, J., Glimm, E., Kropf, S., 1998. Multivariate tests based on left-spherically distributed linear scores. Annals of Statistics 26, 1972–1988.
Pillai, K., Gupta, A., 1969. On the exact distribution of Wilks' criterion. Biometrika 56, 109–118.
SAS, 2008. SAS/IML 9.2 User's Guide. SAS Institute Inc., Cary, NC.
Schlittgen, R., Streitberg, H., 2001. Zeitreihenanalyse, ninth ed. R. Oldenbourg Verlag, München, Wien.
SPM, September 2009. SPM 8 Manual. Functional Imaging Laboratory, Wellcome Trust Centre for Neuroimaging, Institute of Neurology, UCL, 12 Queen Square, London WC1N 3BG, UK, www.fil.ion.ucl.ac.uk/spm/.
