Cognitive performance among breast cancer survivors treated with aromatase inhibitors

Background: The use of aromatase inhibitor (AI) therapy is increasing due to its efficacy in decreasing breast cancer recurrence. There are theoretical reasons for concern about the effects of AIs on cognition, but few empirical studies have addressed this issue. We sought to assess cognitive function in relation to AI exposure. Methods: Patients aged 65 or older at the time of an incident early-stage breast cancer underwent a battery of neuropsychological tests, as well as functional magnetic resonance imaging (fMRI) during an N-back cognitive task, which incrementally varies working memory load. Of 22 subjects, 11 had been treated with an AI for a mean of 37 months (AI group); 11 had no AI exposure (no-AI group). The groups had similar distributions of age and educational achievement; no subject had received chemotherapy. Results: The AI group did not differ significantly from the no-AI group on neuropsychological test scores. However, there was a trend toward worse cognitive function among the AI subjects, who demonstrated worse performance on all 12 tests with a Cohen’s d effect size of 0.3 or greater. During fMRI testing, the AI group failed to exhibit the expected pattern of increased regional blood oxygenation as the complexity of the working memory task increased. In contrast, the no-AI group showed the normally expected pattern of increased oxygenation as the demands on working memory increased. Conclusions: This study suggests the possibility of adverse effects of AI exposure on cognition, and suggests that fMRI may be sensitive to early changes in cognitive performance.


Background
Based on a reduction in breast cancer recurrence of about 50% in several randomized trials, the American Society of Clinical Oncology (ASCO) has recommended that therapy for postmenopausal women with hormone receptor-positive breast cancer include an aromatase inhibitor (AI) [1]. Since about 75% of the 180,000 women who develop breast cancer each year are aged 50 or older [2], and since the majority of postmenopausal women have hormone receptor-positive disease, this recommendation is likely to result in the exposure of a large number of women to AI agents.
Aromatase inhibitors markedly reduce circulating estrogen levels in postmenopausal women, and little is known about their adverse effects. The ASCO technology assessment paper recommending AI agents specifically noted that information regarding the potential effects of AIs on cognition was lacking [1]. Estrogen receptors are found in many areas of the brain, and considerable evidence over the years has supported a positive relationship between estrogen and cognitive brain function, particularly with respect to verbal memory, attention, processing speed and working memory [3][4][5][6]. The evidence for this relationship was promising enough that the Women's Health Initiative Memory Study (WHIMS) randomized trial was conducted to evaluate the effect of estrogen on memory in a subset of the Women's Health Initiative trial [7]. Unexpectedly, this study not only failed to find a beneficial effect of estrogen, it found estrogen therapy associated with an increase in a combined endpoint of dementia and mild cognitive impairment [7].
The WHIMS findings have been interpreted by some as evidence that AIs would be unlikely to cause cognitive problems. However, this is not necessarily the case. More detailed analyses of subsets of the WHIMS participants showed the possibility of different effects of estrogen with or without progesterone on different cognitive domains [8,9]. There may be an optimal level of estrogen exposure, over which deleterious effects predominate [10]. The aromatase enzyme is widely distributed in neural tissue [6,11] and forms estrogen locally in neural tissue by converting precursor androgens [5]. Both estrogen and aromatase levels are markedly reduced in the autopsied brains of women with Alzheimer's Disease compared to control subjects without Alzheimer's Disease [5]. Given these considerations, it does not seem prudent to assume that the results of the WHIMS trial would necessarily imply the safety of the aromatase class of agents with respect to cognition [12]. Rather, that evidence should be determined empirically.
doi: 10.7243/2049-7962-2-7
Assessing cognitive function through traditional neuropsychological testing is useful, but may be somewhat insensitive to adverse effects of a pharmacologic agent. Functional magnetic resonance imaging (fMRI) is a relatively new tool for noninvasive imaging of human brain function. Without the use of contrast agents, fMRI detects regional MR signal increases related to decreases in deoxyhemoglobin due to local increases in blood flow while accomplishing a task. Functional MRI imaging may prove to be more sensitive than traditional neuropsychological testing for detecting early changes predisposing to cognitive difficulties [13].
In this study, we compared cognitive function, as measured by traditional neuropsychological testing and by fMRI testing, in a cohort of breast cancer survivors who had been exposed to AI agents compared to a control cohort of breast cancer survivors without AI exposure. We hypothesized that fMRI might show evidence of the development of cognitive impairment even if not detected by traditional neuropsychological testing.

Participants
Participants were recruited from a larger cohort of women participating in a longitudinal National Cancer Institute-sponsored survey study examining breast cancer outcomes among women aged 65 or older at the time of development of breast cancer in 2003 [14]. Subjects were considered for participation in this study if they were community-dwelling, resided within 100 miles of our academic health center, had provided valid responses to survey questions regarding use of hormonal therapies for their breast cancer, and had not received any chemotherapy as part of their breast cancer treatment. The study was approved by the Medical College of Wisconsin IRB and written informed consent was obtained. All participants received financial compensation.

Neuropsychological Testing
A battery of standardized neuropsychological tests was administered to participants in a fixed order over approximately 90 minutes. The specific tests administered are listed in Table 1. Unpaired t-tests were conducted on each neuropsychological domain. We applied a more conservative alpha level of 0.01 to account for multiple comparisons.
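The group comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes Student's pooled-variance form of the unpaired t-test (the text does not say whether a pooled or Welch test was used), the scores are hypothetical, and the function name `unpaired_t` is ours. The critical value is the standard two-sided table value for alpha = 0.01 with 20 degrees of freedom (11 + 11 - 2).

```python
from math import sqrt
from statistics import mean, variance

# Two-sided critical t for alpha = 0.01 and df = 20, from standard t tables.
T_CRIT_ALPHA_01_DF_20 = 2.845


def unpaired_t(group_a, group_b):
    """Student's unpaired t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical test scores for illustration only (not study data).
ai_scores = [48, 52, 45, 50, 47, 51, 44, 49, 46, 53, 50]
no_ai_scores = [51, 55, 49, 52, 50, 54, 47, 53, 48, 56, 52]

t_stat, df = unpaired_t(ai_scores, no_ai_scores)
# The hard-coded critical value only applies when df is 20.
significant = df == 20 and abs(t_stat) > T_CRIT_ALPHA_01_DF_20
```

Lowering alpha from 0.05 to 0.01 raises the critical value, so a larger standardized group difference is needed before any single test is declared significant.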

fMRI Task
The N-back task is a parametrically designed sequential letter task, which incrementally varies working memory load at four levels: 0-back, 1-back, 2-back and 3-back [15]. In each run, participants were presented with a pseudorandom series of 120 upper- or lower-case letters every 3 seconds in 45-second blocks. The stimuli were presented for a duration of 500 ms with an interstimulus interval of 2500 ms. Each run began with a 24-second rest period, followed by four blocks of N-back in pseudorandom order, a rest period, four more blocks of N-back, and rest. While the blocks of stimuli were similar, task difficulty varied for each block. In 0-back blocks, participants were instructed to press the index finger when the letter "d" appeared. In 1-back blocks, participants were instructed to make a key press whenever a letter appeared that was the same as the letter preceding it. Similarly, in 2-back blocks, the target was a letter that was the same as the antepenultimate letter, and in 3-back blocks, the preantepenultimate letter. The 0-back condition, which does not involve working memory but does require attention to a task, served as a control condition. Subjects were provided practice runs prior to scanning to ensure that they correctly understood the task.
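The target rule for each load level can be made concrete with a short sketch. It is illustrative only: the function name `n_back_targets` is ours, and case-insensitive matching of upper- and lower-case letters is an assumption, since the text does not state how case was treated.

```python
def n_back_targets(letters, n, zero_back_target="d"):
    """Return one boolean per letter: True where a key press is expected.

    n == 0: the target is the fixed letter (the paper uses "d").
    n >= 1: the target is any letter matching the one n positions earlier.
    Matching ignores case (an assumption, not stated in the paper).
    """
    folded = [ch.lower() for ch in letters]
    if n == 0:
        return [ch == zero_back_target for ch in folded]
    return [i >= n and folded[i] == folded[i - n] for i in range(len(folded))]


# One letter every 3 s in a 45 s block gives 15 letters per block,
# and 8 blocks per run give the 120 letters described in the text.
LETTERS_PER_BLOCK = 45 // 3
LETTERS_PER_RUN = 8 * LETTERS_PER_BLOCK
```

For the sequence "aBbdcd", the 1-back rule flags only the third letter (b after B), the 2-back rule flags only the final d, and the 0-back rule flags both occurrences of d.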
To evaluate performance on the N-back task, measurements were made of the accuracy and reaction time associated with each condition, across all imaging runs. Analyses of variance (ANOVA) were performed, examining main effects and interactions of working memory load (0-back, 1-back, 2-back and 3-back) and treatment group.
Visual stimuli were computer-generated and rear-projected (Sharp model XG2000U video projector) on an opaque screen located at the subject's head. Subjects viewed the screen through prism glasses attached to the head coil. A non-ferrous keypress device made from four force-sensing resistors was used to record accuracy and reaction time through the parallel port of an IBM compatible computer.

fMRI Analysis
To evaluate hypotheses regarding neural activation patterns, the raw fMRI signal data were analyzed using Analysis of Functional NeuroImages (AFNI) [16]. Each image time series was time-shifted to the beginning of the TR and then spatially registered to reduce the effects of head motion using a rigid body iterative linear least squares method. Multiple regression was used to extract a hemodynamic response magnitude for each of the four stimulus conditions (0-, 1-, 2- or 3-back). To compensate for normal variation in anatomy across subjects, functional images were blurred using a 6 mm full-width at half-maximum Gaussian filter, reformatted into 1 mm isotropic voxels and transformed into standard stereotaxic space [17,18].
Statistical analysis consisted of a voxel-wise, 2 X 4 (Group by Back) repeated measures analysis of variance (ANOVA). An individual voxel probability threshold of 0.003 was applied in conjunction with a minimum cluster size threshold of 0.506 ml [19] to minimize false positive activation foci from the brain maps. These two threshold values were derived from a Monte Carlo simulation (3,000 iterations) using the AFNI AlphaSim program to achieve a family-wise error of p < 0.05.
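The two-stage thresholding described above (a voxel-wise probability threshold followed by a minimum cluster volume) can be illustrated with a small sketch. This is not AFNI's implementation. It assumes that clusters are defined by face connectivity and that the 0.506 ml threshold is applied on the 1 mm isotropic grid, which would correspond to 506 voxels; neither detail is stated in the text, and the function name is ours.

```python
from collections import deque

VOXEL_VOLUME_MM3 = 1.0 ** 3   # assumed 1 mm isotropic voxels
MIN_CLUSTER_ML = 0.506        # from the text; 1 ml = 1000 mm^3
MIN_CLUSTER_VOXELS = round(MIN_CLUSTER_ML * 1000 / VOXEL_VOLUME_MM3)

# Face-connected neighbours only (an assumption).
NEIGHBOUR_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                     (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def clusters_above_size(voxels, min_voxels):
    """Group supra-threshold voxels into connected clusters and keep the
    clusters containing at least min_voxels voxels.

    voxels: iterable of (x, y, z) integer coordinates that already passed
    the voxel-wise probability threshold.
    """
    remaining = set(voxels)
    kept = []
    while remaining:
        seed = remaining.pop()
        cluster = {seed}
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed voxel
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOUR_OFFSETS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    cluster.add(neighbour)
                    queue.append(neighbour)
        if len(cluster) >= min_voxels:
            kept.append(cluster)
    return kept
```

Requiring both criteria means that an isolated voxel with a very small p value is discarded, while a contiguous region of moderately significant voxels survives.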

Sample Characteristics
Of 107 women contacted, 55 declined participation citing long distance (44), lack of transportation or caregiver to accompany them (8) and excessive time commitment (3) as the main reason for declining. Of the 52 who agreed to consider participation, 26 were excluded due to: claustrophobia (12), metal in body (5), physically unable to lie down in the scanner (4), height/weight incompatible with the scanner (2), cancer other than breast (1), low visual acuity (2). Of the 26 who participated, 4 were excluded from analyses due to: inability to complete fMRI (1), technical problems with the fMRI analysis (2), and receipt of AI therapy for a single month (1).
Of the 22 included in analyses, 11 had received adjuvant hormonal therapy including at least 12 months AI therapy ("AI group"). The remaining 11 had received no AI therapy ("No-AI group"). Three AI subjects had a history of tamoxifen treatment for a median of 21 months. Three No-AI subjects had tamoxifen exposure: two were currently using it and one had used it until 9 months prior to testing. The median tamoxifen exposure for these three subjects was 42 months. Ages for the entire sample ranged from 69 to 80 years, and education levels from 12 to 18 years. Within the AI group, the mean age was 74.6 years (SD = 2.4), mean education level was 14.6 years (SD = 2.5), and the mean duration of AI treatment was 37 months (SD = 12.1). All women were still on the AI at the time of testing, except for one who stopped AI therapy one month prior to testing. Within the No-AI group, the mean age was 75.6 years (SD = 2.5), and mean education level was 14.8 years (SD = 2.5). The two groups did not differ in terms of age (t = -.77, p = .45) or education (t = -.17, p = .86). There were no significant differences between groups in the initial stage of disease (41% localized, 54% regional, 5% unknown), receipt of mastectomy (41%), time from surgery to testing (mean of 41 months, SD=3) or of handedness (91% right-handed).

Neuropsychological and fMRI Test Performance
Results of the neuropsychological test battery are detailed in Table 1. No statistically significant differences were found between the groups; however, a pattern of poorer performance emerged in the AI group. Because of the small sample size, we examined the effect size, Cohen's d, as an indicator of performance. For any effect size of 0.3 or greater, the direction of the performance difference is indicated in the "Performance Comparison" column of Table 1. For all 12 neuropsychological test measures with an effect size of 0.3 or greater, the AI group performed more poorly than the No-AI group, suggesting that group differences might have emerged with larger sample sizes.
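For reference, Cohen's d for two independent groups is commonly computed as the difference in means divided by the pooled standard deviation. The sketch below assumes that definition (the text does not state which variant was used), and the function names are ours.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_sd = sqrt(((n_a - 1) * variance(group_a)
                      + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd


def meets_reporting_cutoff(d, cutoff=0.3):
    """True when the magnitude of d reaches the 0.3 cutoff used in Table 1."""
    return abs(d) >= cutoff
```

The sign of d depends on the order of the arguments, so the direction of a difference has to be read together with whether higher or lower scores indicate better performance on a given test.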
Performance on the N-Back task is illustrated in Figure 1. The effects of working memory load and treatment group and their interactions were analyzed using separate analyses of variance (ANOVAs) performed on the accuracy (percent correct) and reaction time (msec) dependent measures. As expected, the main effects of working memory load on accuracy (F(3,42) = 28.2, p < 0.0001) and reaction time (F(3,42) = 19.6, p < 0.0001) were highly significant. For both the accuracy and

fMRI Results
Regions showing significant activation differences as a function of working memory load were identified for the AI and no-AI groups separately using voxelwise one-way ANOVAs across the N-back conditions (Figure 2; see Table 2 for region volumes and stereotaxic coordinates). In the no-AI group, 26 regions demonstrated a pattern of activation that increased with working memory load (depicted in red on Figure 2). In the AI group, only one relatively small region in the left posterior cingulate showed a significant correlation with working memory difficulty (Figure 2, Table 2). In a subgroup analysis omitting the three subjects in each group with current or prior exposure to tamoxifen, the results were essentially unchanged.
On a voxel-wise 2 (Group) X 4 (n-back) repeated measures ANOVA, significant interaction effects were observed in three regions typically associated with working memory: left and right middle frontal and left inferior frontal regions (Figure 3, Table 3). Whereas the no-AI group showed the expected pattern of increasing levels of activation as working memory load increased, the AI group showed a flat pattern in which brain activation did not increase with task difficulty.

Discussion
In this study, breast cancer survivors with a mean of just over three years of exposure to an AI did not differ statistically from breast cancer survivors with no AI exposure with respect to traditional neuropsychological testing. However, a trend toward worse cognitive function among the women exposed to an AI was found on most of these measures, as demonstrated by effect sizes (Cohen's d) of 0.3 or greater. In contrast, on fMRI, the number of brain regions demonstrating the expected increase in fMRI activation with working memory load was considerably greater in the no-AI group (n = 26) than in the AI group (n = 1). In a separate voxel-wise analysis, three working memory regions in the frontal cortex demonstrated the expected increase in fMRI activation [20] with memory load in the breast cancer survivors with no prior AI exposure. The AI group, however, failed to demonstrate an increase in activation as the demands on working memory increased in these three regions.
This pattern of aberrant activation has been observed in other conditions involving mild brain dysfunction. Using the N-back task in recently injured subjects with mild traumatic brain injury (TBI), McAllister and colleagues observed a smaller increase in bilateral frontal and parietal activation during the high-processing load conditions when compared with controls [21]. They hypothesized that limitations in the allocation of processing resources occur at the highest levels of N-back difficulty in their TBI patients. We speculate that a similar effect may be present in AI-exposed patients.
Only a small number of papers have reported directly on cognitive function among women treated with AIs, and these have conflicting results. Jenkins and colleagues [22] evaluated traditional neuropsychological measures in a sample of women participating in a randomized trial of anastrozole (an AI agent), tamoxifen, or both, and compared them to controls without breast cancer. The breast cancer patients were impaired with regard to processing speed and immediate verbal memory. It is not clear what percentage of the subjects had exposure only to tamoxifen, and not to the AI agent. Bender and colleagues [23] studied 31 postmenopausal women aged 21 to 65, and compared anastrozole recipients to tamoxifen recipients. Some of the women in each group had received cytotoxic chemotherapy. The women taking anastrozole had worse cognitive function than the women taking tamoxifen. The IBIS II study evaluated cognition prospectively in 207 women at high risk for breast cancer, who were randomized to prophylactic therapy with anastrozole or placebo [24]. At 24 months, cognitive performance did not differ. These women were much younger (mean age 57 years) than our cohort, and had a shorter exposure time to the AI. The older age of our cohort is worth noting, as older women may be more vulnerable to loss of cognitive function via a number of pathways [10].
The current report advances the literature by evaluating cognitive function not only with traditional neuropsychological measures, but also with task-activated fMRI. We were able to show dramatic group differences in the pattern of brain activation with only 11 patients in each group. These fMRI group differences were observed in the absence of significant effects on neuropsychological testing and suggest that this functional imaging tool may be more sensitive in detecting neurocognitive brain changes than standardized neuropsychological tests in the context of a randomized clinical trial. To illustrate this point, for a given neuropsychological test with an estimated effect size of 0.7, a clinical trial with 80% power would require 50 subjects per group; for a test with an effect size of 0.3, a sample size of 262 subjects would be required per group. In contrast, we observed clear group differences using fMRI on only 11 patients per group.
This study has limitations. The sample size was small, so the power to detect statistically significant effects between the treated and untreated groups in the neuropsychological battery of tests was modest. This may account for the lack of significant group differences observed on traditional neuropsychological measures. However, the modest sample size makes the differences observed with fMRI that much more remarkable. The patients were not randomly allocated to treatment with the AI agent, so it is possible that there was selection bias with respect to the treatment groups. Specifically, one might expect that the women treated with the newer AI agents would have higher educational attainment and cognitive performance [40,41]. However, the age and educational distributions of the treated and untreated subjects were very similar.
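The sample-size illustration above (50 and 262 subjects per group for effect sizes of 0.7 and 0.3 at 80% power) can be approximated with a standard two-sample formula. The sketch below is not the authors' calculation. It assumes a two-sided alpha of 0.01, which matches the alpha applied to the neuropsychological tests but is not stated for the power calculation itself, and it uses the normal approximation with a small-sample correction term of z²/4 per group. The function name is ours.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.01, power=0.80):
    """Approximate subjects per group for a two-sided, two-sample t-test.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    plus a correction of z_{1-alpha/2}^2 / 4 for using t rather than z.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    n_normal = 2 * (z_alpha + z_power) ** 2 / effect_size ** 2
    return ceil(n_normal + z_alpha ** 2 / 4)


n_large_effect = n_per_group(0.7)   # 50 under these assumptions
n_small_effect = n_per_group(0.3)   # 262 under these assumptions
```

Because the required sample size scales with 1/d², reducing the effect size from 0.7 to 0.3 increases the required number of subjects per group roughly fivefold.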
A further consideration is that the study was cross-sectional in design, and cognitive measures were not available prior to treatment. While a longitudinal design would be preferable, the control group did consist of breast cancer survivors. This is an important advance over some prior studies as breast cancer survivors may have different cognitive performance than women without breast cancer [25]. Finally, three of the subjects in both the AI and no-AI groups had undergone prior treatment with tamoxifen. Tamoxifen may cause cognitive decline, although the current literature is conflicting [4,6]. However, the No-AI subjects had double the tamoxifen exposure of the AI subjects. Therefore, if tamoxifen does cause cognitive decline, the No-AI group should have exhibited worse cognition than the AI group. For this reason, tamoxifen exposure would not explain the abnormal fMRI findings among the AI group.

Conclusions
In summary, women treated with AI agents exhibited an abnormal pattern of fMRI brain activation and a tendency toward worse cognitive performance on traditional neuropsychological measures than women not treated with AI agents. The use of these agents is likely to increase dramatically due to their efficacy in preventing breast cancer recurrence among postmenopausal women with hormone receptor-positive cancer. Results from this pilot study indicate the possibility of deleterious effects of AI agents on brain function, a finding that requires replication in a larger trial. If the results are confirmed, one could imagine a future role of fMRI in screening patients who are on AI medications for early decrements in cognitive function so that the individual risk-benefit ratio can be reassessed. Patients with early cognitive decrements might also be good candidates for participation in trials of agents designed to ameliorate such effects.