Eye Movement Analysis and Cognitive Assessment

Summary Background: An adequate behavioral response depends on attentional and mnesic processes. When these basic cognitive functions are impaired, non-immersive Virtual Reality Applications (VRAs) can be a reliable technique for assessing the level of impairment. However, most non-immersive VRAs rely on indirect measures (e.g., time to task completion, error rate) to make inferences about visual attention and mnesic processes. Objectives: To examine whether eye movement analysis through eye tracking (ET) can be a reliable method to probe more effectively where and how attention is deployed and how it is linked with visual working memory during comparative visual search tasks (CVSTs) in non-immersive VRAs. Methods: The eye movements of 50 healthy participants were continuously recorded while CVSTs, selected from a set of cognitive tasks in the Systemic Lisbon Battery (SLB), a VRA designed to assess cognitive impairments, were randomly presented. Results: The total fixation duration and the number of visits in the areas of interest and in the interstimulus space, along with the total execution time, differed significantly as a function of the Mini Mental State Examination (MMSE) scores. Conclusions: The present study demonstrates that CVSTs in SLB, when combined with ET, can be a reliable and unobtrusive method for assessing cognitive abilities in healthy individuals, opening it to potential use in clinical samples.


Introduction
Cognitive assessment and rehabilitation using ICT (information and communication technologies) is a trending topic in the field of neuropsychology. Given the ever-increasing need to diminish the impact of cognitive impairments on daily routines, current research is focused on the development of better technology-based assessment and rehabilitation tools. The issues raised by customary tools (mostly paper-and-pencil based) have been extensively described in the literature [1]. One of the most compelling aspects of ICT in this field concerns ecological validity, since serious games (SGs) or Virtual Reality (VR) increase the potential to assess cognitive performance in tasks similar to the ones performed in the "real" context [2,3]. Examples are the Systemic Lisbon Battery (SLB) [4,5] and the Computer Assisted Rehabilitation Program-Virtual Reality (CARP-VR) [6], which provide virtual settings with a high similarity to real contexts. In the case of the SLB, the main aspect related to ecological validity, which contributes decisively to the positive results found in previous studies, is the notion that having users perform, in a clinical context, virtual tasks aimed at cognitive stimulation and rehabilitation that mimic instrumental activities of daily living (IADLs) increases the transfer of the results to the real context [7,8]. Other relevant elements of such VR and SG applications include higher engagement and motivation. Moreover, the use of such virtual environments may also contribute to the study of mental activity and brain processes through neurophysiological studies [9-14]. Beyond such ICT solutions, ET is also being used as a tool for improving the assessment of cognitive functions, for instance in anxiety disorders [15], addiction [16], or neurodegenerative diseases (e.g., Parkinson's, Alzheimer's, frontal-temporal dementia) [17].
ET is capable of recording real-time information concerning cognitive processes [18], which helps relevant stakeholders to circumvent the limitations of human perception [19]. ET records eye movements and other measures that might lead to a better understanding of the cognitive processes associated with attention in visual tasks [20]. Even though ET is often used as a control measure, its combined use with VR shows great promise in the treatment and assessment of patients' cognitive limitations [21]. A pilot study was conducted to determine whether ET technology could be used to quantify engagement in real time within a VR task. Two main eye movement features were recorded: average eye movement speed and total eye movement displacement in a given time period. Results revealed that mental engagement levels could be distinguished through oculomotor patterns in a multitude of games [22]. With this application, therapists could adjust the virtual task in real time to better suit the patient's needs and engagement levels. Another study used ET to assess visual attention in children during serious games (SGs). Heat maps showed that individuals with weaker performances had a higher fixation density (larger spatial layout of eye movements) than those who performed better. Despite these findings, more studies should be conducted to establish stronger associations between eye movements and performance [23]. ET has been applied as an assessment tool in several experimental paradigms, namely for attentional processes [5,15,17]. One of these paradigms is the comparative visual search task (CVST), in which perceptual and attentional strategies can be assessed. However, according to Galpin and Underwood [24], this experimental task involves not only attention but also a memory component. In the same line of thought, Irwin [25] stated that the detailed comparison of side-by-side images relies on the process of encoding in memory.
This perspective is shared by Pomplun [26], who advocates that the detection of differences between two similar images is served by a typified oculomotor behavior. Because attention is closely linked to memory, eye movements may reflect how observers coordinate visual mnemonic data and how attention is deployed; their analysis may therefore provide substantial information about the functioning of these cognitive functions.
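As an illustration of the two oculomotor features mentioned above for the pilot study [22] (average eye movement speed and total displacement in a given time period), the following is a minimal sketch. The cited study's exact definitions are not given here, so this sketch assumes that displacement is the summed Euclidean path length between consecutive gaze samples and that speed is that path length divided by the window duration. The function name `movement_features` and the sample format are hypothetical.

```python
import math


def movement_features(samples):
    """Return (total_displacement, average_speed) for one time window.

    samples: list of (t_seconds, x, y) gaze points ordered by time.
    Assumption: displacement is the summed point-to-point path length,
    in the same units as x and y.
    """
    if len(samples) < 2:
        return 0.0, 0.0
    total = 0.0
    # Sum the distance between each pair of consecutive gaze samples.
    for (_, x0, y0), (_, x1, y1) in zip(samples, samples[1:]):
        total += math.hypot(x1 - x0, y1 - y0)
    duration = samples[-1][0] - samples[0][0]
    average_speed = total / duration if duration > 0 else 0.0
    return total, average_speed


# Hypothetical 2-second window of gaze samples (coordinates in degrees).
window = [(0.0, 0.0, 0.0), (0.5, 3.0, 4.0), (1.0, 3.0, 4.0), (2.0, 6.0, 8.0)]
displacement, speed = movement_features(window)
print(displacement, speed)
```

Computing the features per window rather than per task is what would allow the real-time adjustment of the virtual task mentioned above.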

Objectives
The present study aimed to assess whether eye movements vary as a function of cognitive functioning in comparative visual search tasks (CVSTs), in which both attentional strategies and mnesic processes can be evaluated. Non-immersive VR was adopted in this study to prevent the substantial problems with dizziness and nausea that have been found in other VR types [27], which would interfere with the assessment of attention and memory. Furthermore, the task was already part of the SLB platform, and the authors were therefore interested in understanding how ET could contribute to a better assessment of cognitive functions within a VRA. If it does, ET might be considered a viable and relevant option in combination with non-immersive VR to promote a better ICT-based cognitive assessment. It was hypothesized that cognitive abilities (as measured by different MMSE scores across groups) might justify observed differences in six distinct measures from the ET and the VRA.

Participants
The sample of this study consisted of 50 university students of the Universidade Lusófona de Humanidades e Tecnologias, in Lisbon. Of these, 74 % were female (n = 37). The average age of the sample was 28.14 years (SD = 11.69), ranging from 18 to 57 years old. Most participants were Portuguese (96 %; n = 48) and reported a normal medical history and no visual problems. The main exclusion criteria were: (i) Mini-Mental State Examination (MMSE) [22] scores lower than 28, and (ii) a history of psychiatric disorders or drug addiction. All participants were fully informed about the study and signed a written informed consent in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki, the 'Ethical Principles of Psychologists and Code of Conduct' [23], and the ethical standards of the Portuguese Board of Psychologists.

Stimuli
The stimuli consisted of two paintings, Vincent van Gogh's "Vincent's Bedroom in Arles" and a modified version of the "Son of Man" by René Magritte, both with a resolution of 1280 × 1024. Each pair of images was arranged side by side on a brown background (RGB: 68, 40, 22), separated by a distance of 2° of visual angle at the viewing distance of 60 cm. Each image subtended 11.81° × 7.63° of visual angle. Both pairs of images depicted several objects with seven differences between them, as shown in ▶ Figure 1.

Measures
The protocol included a sociodemographic questionnaire (gender, age, nationality) along with computer knowledge questions and the MMSE for screening general cognitive ability [28]. The MMSE was applied individually and took around 10-15 minutes. Three rectangular areas of interest (AoI) were hand drawn: one around each image (left and right) and one in the space between them (interstimulus space). Total fixation duration (TFD) and number of visits (NV) for each AoI were the ocular metrics used to assess visual attention and mnesic processes. Each visit was defined as the interval of time between the first fixation on the AoI and the next fixation outside the AoI. The time to first spotted difference (TFSD) and total execution time (TET) were computed in seconds. All outcome measures were calculated as the average of both CVSTs.

Procedure and Apparatus
Upon arriving at the soundproof lab, each participant signed a consent form and was seated at a distance of 60 cm from the eye tracker. The experiment was carried out in a single session in a soundproof room kept at a constant low brightness (42 lux). After the protocol had been successfully filled out, a 9-point calibration procedure was applied. Participants were instructed to navigate through a VR environment prior to arriving at the art gallery. When they reached the paintings inside the gallery, two CVSTs were presented in a random order, previously generated by the software Research Randomizer (version 4.0). Participants were further instructed to search for the seven differences as quickly as possible and to mark them with a mouse click on the reference image (the image located on the right side). The pointer was visible, and the participant only had to identify the missing elements in the incomplete painting. Each time a missing element was identified correctly, a check mark (✓) was presented at that location of the painting. In the case of an error, no feedback was given to the participant, and the platform registered it as an error. There were no limits on the number of tries (clicks) or on task completion time. The task concluded only after the seven differences had been identified. The two paintings were completely visible at every stage of the task and were presented in a 2D static perspective within the VRA, ensuring that confounding factors, such as the user's position or point of view in the VRA, would not distort the comparison between the stimulus images. Each participant was required to perform both CVSTs.
SLB is a Unity 2.5-based (Unity Technologies™) VRA, and a more thorough description can be found in [4]. SLB is available for free at http://labpsicom.ulusofona.pt (see ▶ Figure 2). SLB and the respective CVSTs were displayed on an Intel Core 2 Duo 6550 desktop computer, which was connected to a Tobii T60 ET system (Tobii Technology AB, Sweden) integrated into a 17" TFT monitor. Each CVST ended automatically when all the differences were correctly flagged. The eye tracker received the video signal through the VGA capture card from an Intel-based PC equipped with a GeForce GT 220 running SLB. Eye movements were recorded binocularly with a temporal resolution of 16.7 ms (60 Hz) and a spatial accuracy of 0.5° of visual angle during the whole experiment. After both tasks were completed, participants were thanked, debriefed and dismissed.

Statistical Analysis and Results
Outliers (± 2 SD) were identified from raw eye data and replaced with missing values. The percentage of missing values was lower than 1 %, and they were randomly distributed across CVSTs. MMSE scores were used to split participants into two normal cognitive level groups: those who presented 1 or 2 errors on the MMSE (50.0 %; n = 25) and those who had the maximum MMSE score of 30, i.e., no errors (50.0 %; n = 25). Ocular data were square root transformed to reduce the influence of extreme values. Time values were positively skewed, so a natural log transformation was applied. The normality assumption was met for every outcome measure (all ps > .05), except for TFSD in the group with no errors on the MMSE (p < .05). A multivariate analysis of covariance (MANCOVA) was performed to compare TFD and NV between the cognitive level groups (group with no errors vs. group with errors), using the participant's age as a covariate. The General Linear Model (GLM) approach was chosen due to its robustness against minor violations of the model assumptions. The effect of age was controlled in our statistical model because it is considered an important confounder of cognitive performance. Bonferroni correction was applied for pairwise comparisons, and all tests of statistical significance were conducted at a p value of .05.
Results of the MANCOVA showed a main effect of cognitive level on the composite variable that combined TFD and NV, Λ = 0.83, F(2, 47) = 4.78, p = .013. Further univariate analyses revealed a significant effect of cognitive level on TFD, F(1, 35) = 9.02, p = .004, ηp² = .16, and on NV, F(2, 46) = 7.40, p = .009, ηp² = .13, after the effect of age had been accounted for. With regard to TFD, the results showed that the group with no errors on the MMSE had a shorter TFD in both AoI (M = 4.39) than the group with errors on the MMSE (M = 6.06), as shown in ▶ Figure 2.
With regard to NV, the group with no errors on the MMSE made fewer visits in both AoI (M = 5.05) than the group with errors on the MMSE (M = 6.50). The covariate, participant's age, was not significantly related to TFD or NV (ps > .05).
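The outcome measures and preprocessing steps described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes that fixations are already labelled with the AoI they fall in, and that the ± 2 SD outlier rule is applied to one measure across participants; the source only states that outliers were identified "from raw eye data". All function names and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def aoi_metrics(fixations, aoi):
    """Return (TFD, NV) for one AoI.

    fixations: list of (aoi_label, duration_s) in temporal order.
    A visit starts at the first fixation inside the AoI and ends at the
    next fixation outside it, following the definition in the Measures.
    """
    tfd, visits, inside = 0.0, 0, False
    for label, duration in fixations:
        if label == aoi:
            tfd += duration
            if not inside:  # entering the AoI opens a new visit
                visits += 1
            inside = True
        else:
            inside = False
    return tfd, visits


def mask_outliers(values, k=2.0):
    """Replace values beyond mean ± k * SD with None (missing value)."""
    m, s = mean(values), stdev(values)
    return [v if abs(v - m) <= k * s else None for v in values]


def sqrt_transform(values):
    """Square root transform for ocular data, keeping missing values."""
    return [None if v is None else math.sqrt(v) for v in values]


def log_transform(values):
    """Natural log transform for positively skewed time values."""
    return [None if v is None else math.log(v) for v in values]


# Hypothetical fixation sequence for one CVST.
fixations = [("left", 0.2), ("left", 0.3), ("gap", 0.1),
             ("right", 0.4), ("left", 0.25), ("right", 0.5)]
tfd_left, nv_left = aoi_metrics(fixations, "left")

# Hypothetical TFD values across participants, with one extreme value.
tfd_values = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0, 5.0, 20.0]
masked = mask_outliers(tfd_values)
transformed = sqrt_transform(masked)
```

Missing values are kept as `None` so that the transforms and any later averaging over both CVSTs can skip them explicitly instead of silently treating them as zeros.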

Discussion
In this study, participants with no errors on the MMSE presented different oculomotor patterns, showing faster detailed information processing (extraction) of both images (lower TFD) than participants with errors on the MMSE. This might be explained by slower speed-loaded processes in the group with errors, which might have needed a longer extraction time to reactivate mnesic representations [29]. With regard to the other ocular metric, NV, the group with errors on the MMSE showed a higher NV than the group with no errors. In CVSTs, switching gaze usually occurs when a comparison is about to be made, and thus NV indicates how many eye movements were involved in encoding before a comparison is elicited and the difference is noticed [30]. Our results support the idea that the group with errors might have made more visits between AoI due to weaker use of visual working memory. As the maintenance and processing of visual information decay more quickly in the group with errors on the MMSE, a larger number of switching saccades between AoI were made in order to rehearse the information [24]. Furthermore, the group with errors made more visits in the interstimulus space. This is surprising and suggests that participants in this group might have made greater use of the space between the images to compare them simultaneously through peripheral attention, demonstrating a particular visual search strategy. With regard to the time variables, the results showed that both groups had the same latency for spotting the first difference, independently of the attentional strategies used during the CVSTs. However, the group with errors on the MMSE needed more time to complete the CVSTs than the group with no errors on the MMSE. These results support the evidence that visual search performance along the task might be attenuated in the group with errors on the MMSE, though healthy.
This might be attributable to decrements in some cognitive abilities, especially in attention and memory. However, it is important to address some potential limitations of this study. First, the level of fatigue was not controlled. Due to its interference with attentional processes [20,31], the level of fatigue should be assessed in future studies. Second, our sample was composed only of university students with similar education and computer experience. In order to generalize our results to a broader population, a larger and more heterogeneous sample is needed.

Conclusion
The findings of this study offer valuable insights into the relationship between cognitive abilities and eye movements. Altogether, the results support the idea that cognitive abilities have an impact on the control of eye movements when performing CVSTs. The application of CVSTs combined with VRA and ET is undoubtedly valuable, especially as current low-cost ET systems (ET 2.0) allow researchers and clinicians to collect eye data in the natural environments of participants [33], which in turn increases the ecological validity of the data. Additionally, since this combined approach was sensitive to subtle changes in cognitive function in healthy individuals, this method might prove to be useful (with future studies focused on the issue) for the detection of mild cognitive impairments or for screening individuals who are in the early stages of Alzheimer's disease. Furthermore, future applications using the paradigms and the methodology we have presented, combined with other types of ocular metrics such as blinks [33], pupil dilation [20] or even eye vergence [34], might provide a richer data source for cognitive assessment.