Confirmatory factor analysis of the Neuropsychological Assessment Battery of the LADIS study: A longitudinal analysis

Age-related white matter changes have been associated with cognitive functioning, even though their role is not fully understood. This work aimed to test a 3-factor model of the neuropsychological assessment battery and evaluate how the model fit the data longitudinally. Confirmatory factor analysis (CFA) was used to investigate the dimensions of a structured set of neuropsychological tests administered to a multicenter, international sample of independent older adults (LADIS study). Six hundred and thirty-eight older adults completed baseline neuropsychological, clinical, functional and motor assessments, which were repeated each year for a 3-year follow-up. CFA provided support for a 3-factor model. These factors involve the dimensions of executive functions, memory functions, and speed and motor control abilities. Performance decreased in most neuropsychological measures. Results showed that executive functioning, memory and speed of motor abilities are valid latent variables of neuropsychological performance among older adults, and that this structure is relatively consistent longitudinally, even though performance decreases with time.


INTRODUCTION
Age-related white matter changes (ARWMC) have been associated with cognitive deficits, mainly in speed of mental processing, executive functions, and memory (De Groot, de Leeuw, & Oudkerk, 2000; Schmidt et al., 2005; Tullberg et al., 2004; Ylikoski et al., 1993). These changes are frequently identified bilaterally on computed tomography (CT) and magnetic resonance imaging (MRI) in the brains of elderly persons, in particular among those with vascular risk factors (LADIS Group, 2011). However, despite the investigations of the last 30 years, the role that ARWMC play in the progression toward disability in the elderly is still not completely understood. It has become clear, nonetheless, that depending on their severity, these changes are not innocuous (Debette & Markus, 2010; LADIS Group, 2011). In fact, they have been found to relate to functional status (e.g., Inzitari et al., 2009), to cognition (e.g., Jokinen et al., 2009; Verdelho et al., 2010), to mood (e.g., Krishnam et al., 2006), and to motor performance (e.g., Baezner et al., 2008), among others. In a recent systematic review and meta-analysis, Debette and Markus (2010) concluded that white matter changes (WMC) were associated with an increased risk of stroke, dementia, and death, clearly indicating a higher possibility of occurrence of cerebrovascular events. In addition, the authors pointed toward an association of white matter hyperintensities with a faster cognitive decline, including in executive functions and processing speed.
The Leukoaraiosis and Disability in the Elderly Study (LADIS) is a longitudinal project that aims to determine the impact of ARWMC on the development of functional, neurological, and cognitive deficits (Pantoni et al., 2004). Within these objectives, a neuropsychological battery was specifically designed for the assessment of the elderly population with ARWMC over a 3-year period (Madureira et al., 2006). According to the exploratory results found in the LADIS baseline data (Madureira et al., 2006), a three-factor underlying structure of neuropsychological performance was proposed. This preliminary exploratory factor analysis accounted for 49% of the variance in the initial evaluation and was used to compute three compound variables: executive functions, memory, and speed/motor functions. However, adjustments to the extracted factors were made for theoretical and interpretability reasons, such as the exclusion of simple timed tests and the addition of the Stroop test (which had loaded on an isolated factor) to the executive functions domain. This structure, therefore, warranted further support, as well as longitudinal analysis. The LADIS study concluded, after an analysis of the 3-year follow-up data, that the risk of transition to disability or death was more than twofold higher for severe than for mild WMC (LADIS Group, 2011). Also, clinically, 90 patients had developed dementia, and 147 had cognitive impairment/no dementia (Verdelho et al., 2010). In fact, patients with more severe WMC and lacunae had a risk of developing dementia about three times greater, independent of age, sex, and education (Jokinen et al., 2009).
In the present research, we aim to evaluate the proposed model for the set of neuropsychological tests administered to the LADIS sample, across the 3-year period, using CFA. Hence, the main goals of the paper were (a) to examine whether neuropsychological assessment was supported by three underlying factors; (b) to provide support for the previously reported compound measures of executive functions, memory, and speed and motor abilities; and (c) to evaluate how the model fits the data across the four time-points of this longitudinal study.

METHOD
The LADIS study rationale and methodology have been reported elsewhere (Madureira et al., 2006; Pantoni et al., 2004). Succinctly, it is a longitudinal multinational study involving 11 centers from 10 European countries (see Appendix), aiming to investigate the effect of white matter changes on the transition process to disability. Participant study inclusion criteria were defined as follows: (a) age 65 to 84 years; (b) white matter changes of any degree according to the modified Fazekas visual rating scale (Fazekas, Chawluk, Alavi, Hurting, & Zimmerman, 1987); (c) mild or no impairment on the instrumental activities of daily living (IADL) scale, as indicated by no items or one item (Fazekas et al., 1987); and (d) presence of a contactable informant and agreement to sign an informed consent. Exclusion criteria included: (a) severe medical illness; (b) severe unrelated neurological disease; (c) leukoencephalopathy of nonvascular origin; (d) severe psychiatric disorder; and (e) inability or refusal to undergo brain MRI. Patients were recruited in each center when presenting with minor neurological, cognitive, or motor complaints, or with incidental findings on cranial imaging due to nonspecific reasons (Pantoni et al., 2004).
Participants underwent a comprehensive clinical, functional, motor, and neuropsychological examination at baseline, which was repeated each year for a 3-year follow-up period. Specifically, the assessment included: (a) a standard cardiovascular exam; (b) a standard neurological exam; (c) functional status measured by the IADL scale and the Disability Assessment for Dementia scale (Gelinàs, Gauthier, McIntyre, & Gauthier, 1999); and (d) health-related quality of life measured by the Euro-QoL 5D (Euro-Qol, 1990).
MRIs were also performed at baseline and at the last follow-up visit, 3 years later. A standard protocol (Pantoni et al., 2004) was used, and white matter ratings and volumetric analyses were performed by a single center (Amsterdam; van Straaten et al., 2006).

Participants
Six hundred and thirty-nine participants were included at baseline (of whom one did not complete the baseline assessment). Baseline demographic and clinical characteristics of the LADIS sample have been described elsewhere (Madureira et al., 2006). A summary of demographic patient characteristics is presented in Table 1.

Neuropsychological assessment
Participants underwent a standardized neuropsychological evaluation every year of the study. The construction of the neuropsychological battery was described in detail in a previous paper (Madureira et al., 2006). The following tests were included in the battery: (a) the Mini-Mental State Examination (MMSE; Folstein, Folstein, & McHugh, 1975) as a measure of global cognitive status; (b) the Alzheimer's Disease Dementia Scale (ADAS-Cog) to assess memory, orientation, language, and ideational and constructional praxis (Ferris, 2003); (c) the

Statistical analysis
Neuropsychological test scores were measured as continuous variables. The primary data analysis procedure used in this paper was structural equation modeling (SEM), more specifically confirmatory factor analysis. SEM tests multiple, complex hypotheses at the construct level (i.e., latent variables) in a theory-driven approach, minimizing both Type I and measurement errors (Byrne, 2001). Measurement models can be tested through CFA. The main hypothesized models were evaluated using AMOS 7.0 (Arbuckle, 2006). This program was chosen because of the method it uses to deal with missing values in covariance structure modeling, which are abundant in longitudinal data (Byrne, 2001). Instead of performing listwise or casewise deletion of cases with any missing data, which would decrease sample size and power, maximum likelihood estimation is used for incomplete data, namely the full-information maximum likelihood (FIML) method. The following indices were used to evaluate the overall model goodness of fit in this paper: chi-square (χ²), the root mean square error of approximation (RMSEA; Browne & Cudeck, 1993), the comparative fit index (CFI; Bentler, 1990), the normed fit index (NFI), and the Tucker-Lewis coefficient (TLI; also known as the Bentler-Bonett non-normed fit index, NNFI; Bentler, 1990). While the conventional χ² test is too stringent and tests for a perfect fit of the data to the model, the other indices provide information on good or close fit to the data. In addition, fit can be indicated by a χ² to degrees of freedom ratio (CMIN/DF) in the range of two-to-one to three-to-one (less than five). The RMSEA index ranges from 0.00 to 1.00. An RMSEA of about .05 or less indicates a close fit of the model in relation to the degrees of freedom, and a value of .08 or less indicates reasonable fit. Conversely, an RMSEA greater than .10 can be interpreted as an unacceptable fit of the model.
The CFI, the NFI, and the NNFI also range from 0.00 to 1.00, but for these indices greater values indicate better fit. Only values close to 1.00 (greater than .90) indicate close or good fit (Browne & Cudeck, 1993; Hu & Bentler, 1999; MacCallum & Austin, 2000).
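The cutoffs described above can be collected into a small helper. The following is only a minimal sketch, not part of the AMOS procedure used in the study: the function names, the label for the .08 to .10 band, and the example values are illustrative assumptions, while the thresholds themselves are the ones stated in this section.

```python
def classify_rmsea(rmsea: float) -> str:
    """Map an RMSEA value onto the bands described in the text."""
    if rmsea <= 0.05:
        return "close"
    if rmsea <= 0.08:
        return "reasonable"
    if rmsea <= 0.10:
        # The text names no band between .08 and .10; this label is an assumption.
        return "marginal"
    return "unacceptable"


def incremental_index_ok(value: float, cutoff: float = 0.90) -> bool:
    """CFI, NFI, and NNFI/TLI: values greater than .90 indicate close or good fit."""
    return value > cutoff


def cmin_df_ok(chi_square: float, df: int, upper: float = 5.0) -> bool:
    """Chi-square to degrees-of-freedom ratio: values below five are tolerated."""
    return chi_square / df < upper


# Hypothetical fit results, used only to exercise the helpers.
example = {"chi_square": 150.0, "df": 40, "CFI": 0.94, "RMSEA": 0.06}
print(classify_rmsea(example["RMSEA"]))                    # band for the RMSEA
print(incremental_index_ok(example["CFI"]))                # CFI above .90?
print(cmin_df_ok(example["chi_square"], example["df"]))    # CMIN/DF below 5?
```

Keeping each index in its own function mirrors the way the indices are read here: none of them is decisive alone, and they are weighed together.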

Descriptive statistics
Rates of attrition and assessment completion for baseline and follow-up years of the LADIS study were acceptable, especially considering the population involved (75% came to the last follow-up visit). Descriptive statistics (means and standard deviations) for neuropsychological tests are presented in Table 2 for all the years of the study.
Repeated-measures analyses of variance (ANOVAs) showed that neuropsychological performance decreased with time, as expected. An exception were the indicators  The measures utilized were also correlated with another measure of global cognitive functioning, namely the MMSE, for further convergent validity; MMSE change was also significant, F(1, 421) = 7.21, p = .000. Executive functioning indicators were significantly correlated with the MMSE for all time-points (Trail B-A: r between -.42 and -.23, p < .01; Stroop 3-2: r between -.45 and -.35, p < .01; Symbol Digit: r between .47 and .54, p < .01; Verbal Fluency: r between .43 and .53, p < .01). Indicators of speed and motor control were also significantly associated with global cognition, with the exception of Maze at Year 1 and Year 2 (Trail A: r between -.41 and -.55, p < .01; Maze: r between .03, ns, and -.55, p < .01; Digit Cancel: r between .36 and .50, p < .01). Finally, memory performance indicators were found to be significantly associated with the MMSE (Word Recall: r between -.34 and -.48, p < .01; Delayed Recall: r between -.35 and -.48, p < .01; Word Recognition: r between -.31 and -.54, p < .01; Digit Span: r between .37 and .40).

Baseline neuropsychological assessment
Using neuropsychological scores at baseline, a three-factor model was tested, following the domains proposed in an earlier paper (Madureira et al., 2006). This model involved three latent variables: executive functions, memory, and speed and motor control (see Figure 1). The model fit the data well, with fit indices within a good range: χ²(40) = 171.10, p < .001, with CMIN/DF = 4.28; NFI = .93; NNFI = .91; CFI = .95; RMSEA = .07 (see Table 3). All factor loadings, as well as the correlations among the three latent variables, were significant at the .01 level. These correlations were in the expected direction. Single subtests accounted for a significant amount of variance, ranging from 14% (Word Recall) to 89% (Symbol Digit).
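When each subtest loads on a single factor, the proportion of its variance accounted for equals the squared standardized loading. As a rough sketch under that assumption (no cross-loadings; the helper name is hypothetical, and the sign of a loading cannot be recovered from the variance alone), the two reported extremes can be converted into loading magnitudes:

```python
import math


def loading_magnitude(r_squared: float) -> float:
    """Absolute standardized loading implied by an explained-variance proportion.

    Assumes the indicator loads on exactly one factor, so R^2 = loading^2.
    """
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must lie between 0 and 1")
    return math.sqrt(r_squared)


# Explained variance reported at baseline for the weakest and strongest subtests.
for subtest, r2 in [("Word Recall", 0.14), ("Symbol Digit", 0.89)]:
    print(f"{subtest}: |standardized loading| ~ {loading_magnitude(r2):.2f}")
```

Read this way, the gap between 14% and 89% corresponds to loadings of roughly .37 and .94 in absolute value.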

Neuropsychological assessment at the end of the 1st year
An identical set of analyses was conducted with data from the same neuropsychological tests, performed at the 1-year follow-up. The three-factor model was tested (see Figure 2). The model fit the data modestly, with the following fit indices: χ²(40) = 280.20, p < .001, with CMIN/DF = 7.00; NFI = .89; NNFI = .83; CFI = .90; RMSEA = .10 (see Table 3). In addition, factor loadings were all found to be significant at the .01 level, with the exception of Maze, significant at .05. Factor loadings were in the expected direction, and the correlations among the latent variables were significant and in the expected direction. Single subtests accounted for a significant amount of variance, ranging from 14% (Word Recall) to 87% (Symbol Digit), with the exception of Maze (only 3%).

Neuropsychological assessment at the end of the 2nd year
Following a similar procedure as before, the three-factor model was tested (see Figure 3). The model fit the data adequately, with fit indices within an acceptable to good range: χ²(40) = 186.90, p < .001, with CMIN/DF = 4.67; NFI = .91; NNFI = .89; CFI = .93; RMSEA = .08 (see Table 3). Furthermore, once again all factor loadings were found to be statistically significant (Maze was the poorest, but still significant at the .05 level). Correlations among factors were significant at the .01 level and followed the same directions as at baseline and Year 1. Single subtests accounted for a significant amount of variance, ranging from 16% (Word Recall) to 80% (Symbol Digit), with the exception of Maze (only 2%).

Neuropsychological assessment at the end of the 3rd year
The same battery of tests was administered to the participants in the last year of the study (Year 3). The analysis using CFA followed the same procedure. The three-factor model was tested, achieving acceptable but poorer fit indices: χ²(40) = 300.40, p < .001, with CMIN/DF = 7.51; NFI = .87; NNFI = .81; CFI = .89; RMSEA = .10 (see Table 3). Moreover, just as in the three-factor models for the previous time-points, factor loadings and the correlations among the factors were significant. Single tests accounted for a significant amount of variance, ranging from 17% (Word Recall) to 80% (Symbol Digit; see Figure 4).
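The CMIN/DF and RMSEA values reported for the four time-points can be recomputed from the χ² statistics. The sketch below is an illustration, not the AMOS output: it assumes the common sample formula RMSEA = sqrt(max(χ² - df, 0) / (df (N - 1))) and assumes N = 638 at every wave, on the reasoning that FIML retains all baseline completers. Both assumptions, and the variable names, are ours.

```python
import math

N = 638  # assumption: every baseline completer contributes under FIML

# Chi-square and degrees of freedom as reported for each time-point.
reported = {
    "Baseline": (171.10, 40),
    "Year 1": (280.20, 40),
    "Year 2": (186.90, 40),
    "Year 3": (300.40, 40),
}


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Sample RMSEA; some programs divide by n rather than n - 1."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


results = {}
for wave, (chi_square, df) in reported.items():
    results[wave] = (chi_square / df, rmsea(chi_square, df, N))
    ratio, value = results[wave]
    print(f"{wave}: CMIN/DF = {ratio:.2f}, RMSEA = {value:.3f}")
```

Under these assumptions the recomputed RMSEA values round to .07, .10, .08, and .10, in line with the figures given for baseline and Years 1 to 3.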
This set of analyses provided some support for configural invariance across time-points. However, in order to further explore the question of longitudinal measurement invariance, three additional models were estimated using MPlus (Muthén & Muthén, 1998-2007). Each of these models tested all time-points simultaneously, and their focus was the comparison across groups (i.e., time). They included: Model 1, configural invariance (same pattern of free loadings); Model 2, weak measurement invariance (fixed loadings across groups); and Model 3, strong measurement invariance (fixed loadings and intercepts across groups). The fit indices of these models are presented in Table 4.
The analysis of fit indices reveals that the models provide reasonable but modest fit to the data. CFI and TLI values ranged between .824 and .869, below the .90 threshold for relatively acceptable fit. However, the weak (84 free parameters) and strong (78 free parameters) measurement invariance models provided an acceptable (.08-.10) RMSEA, slightly better than that of the simple configural invariance model (144 free parameters). Hence, overall, even though the fit was not good, we are able to accept measurement invariance across time-points.

DISCUSSION
The present study was guided by one primary goal: the confirmatory analysis of a structured set of neuropsychological tests in order to support, across time, the validity of the neuropsychological compound scores for three dimensions, namely executive functions, processing speed, and memory. Through CFA, this study provided support for the construct and measurement validity of the compound measures across the four years of the study. As mentioned above, CFA allows for focused hypothesis testing and decreases the likelihood of chance findings. Cross-validation and missing values in longitudinal data are also suitably dealt with. This has led many researchers to use this methodology in the study of the organization of neuropsychological functions (de Frias et al., 2006; Hull et al., 2008). Nonetheless, the fit of the three-factor structure to the data seemed poorer at Year 3 than at baseline and Year 2 (where the model fit the data closely). This may be due, on the one hand, to lower statistical power (i.e., a number of patients missed the assessment in one or two of the study years). Attrition bias, including by death of participants, has been argued to produce underestimates of cognitive decline and to limit the interpretability of cognitive change (Ritchie & Tuokko, 2007). On the other hand, the poorer fit of the proposed structure to the data obtained in the last year may be due to other variables that may have influenced the participants' performance on neuropsychological tests, such as motor slowing (Baezner et al., 2008). Finally, the fit of the longitudinal measurement invariance models was only modest. Hence, while supporting the dimensions of executive functioning, memory, and processing speed, the results still warrant some caution regarding the assumption that these latent constructs reflect the same meaning over time.
ANOVA results revealed that, overall, performance decreased with time for all neuropsychological tests and that test values correlated with general indices of functional status and global cognition. These findings provide further support for the construct validity of the measures and are in tune with extant literature on cognitive aging and ARWMC (Madureira et al., 2006; Pantoni et al., 2004). In fact, the clinical relevance of this decrease and its relation to changes in white matter hyperintensities have been documented in other studies (Debette & Markus, 2010; LADIS Group, 2011).
The current study contributes to the literature in a number of ways. First, the present paper provides support for the construct validity of the neuropsychological battery of the LADIS study, in a large, heterogeneous, multinational sample of independent elderly. It determined that the factor structure of the battery supports the proposed three domains. Second, and most importantly, the factor structure was tested longitudinally. The findings were, thus, cross-validated in four different datasets, which further supports its structure. We have shown that, even though neuropsychological performance of independent elderly decreases across time, the structural functioning is relatively consistent (i.e., groupings of functions continue to cluster together). We also found a decrease in model fit with time, which may suggest that the structure of functioning may be sensitive to some changes that occur in later life, such as ARWMC and the onset of cognitive impairment.
Although there are a number of strengths, some limitations can be identified in the present study. One limitation was that, because of small subgroup sizes, the model could not be tested separately on the group of patients who evolved to dementia and those who evolved to impairment/no dementia by the end of the study. Another limitation, as mentioned, was participant attrition over time.
Results suggest there are unique ways in which neuropsychological functions may group together over time, especially at the onset of mild cognitive impairment or dementia. Future research should continue to investigate the complex ways in which change in cognitive functioning may take place, such as by the addition of moderation analyses (e.g., dementia/nondementia) or models of latent change (Willett & Sayer, 1994). In addition, the nature or meaning of these constructs, such as executive functioning, memory, and processing speed, may be further explored over time, as they may reflect slightly different processes.
Original manuscript received 10 January 2012. Revised manuscript accepted 24 January 2013. First published online 8 February 2013.