INTRODUCTION
The clinical emergence of Alzheimer's disease (AD) is typically heralded by an insidious and progressive tendency to forget recent day-to-day events, such as conversations. Such events comprise various sensory, emotional, and semantic attributes that are initially processed by different regions of the brain. The rapid formation of lasting associations between these attributes underpins normal memory formation and represents a fundamental role of the medial temporal lobe (MTL), which typically undergoes neurodegenerative changes early in the course of developing AD (Braak & Braak, 1991; Hyman et al., 1986). Severe episodic memory impairment has been consistently reported as an early sign of emerging AD (Elias et al., 2000; Grady et al., 1988), and typically represents the most impaired cognitive domain.
With the advent of pharmacotherapy for AD, which is most efficacious in the early stages of the disease process (Brookmeyer et al., 1998; Giacobini, 2001; Michel et al., 2001), considerable research effort has been directed toward facilitating earlier diagnosis of AD. Hodges (1998) commented that research on dementia has demonstrated that many of AD's diagnostic differentials have their own individual neuropsychological signature (i.e., pattern of cognitive strengths and weaknesses), particularly in the early stages of these conditions. The progressive emergence of clinical signs generally reflects the distribution of underlying neuropathological changes associated with each condition (e.g., frontotemporal dementia, Lewy body dementia). Whereas the majority of these differentials have “memory impairment” as a cardinal feature of their neuropsychological profile when the term is used broadly, they often differ neuropathologically from AD, in that MTL structures are essentially intact or are significantly less affected in their early stages than in early AD.
Considerable research has been targeted toward characterizing the early memory impairment in Alzheimer-like and non-Alzheimer forms of dementia. The memory impairment in AD has been characterized in terms of defective encoding/consolidation because of early MTL damage (Braak & Braak, 1991; Hyman et al., 1986), rather than an inability to access or retrieve material from memory (Albert, 1981; Greene et al., 1996; Zec, 1993), although inefficient memory retrieval processes may pose a further source of impairment in AD (Dalla Barba, 1997). As such, probable AD patients characteristically perform below healthy controls on memory tests requiring recall or recognition of recently presented material (Greenaway et al., 2006; Greene et al., 1996; Hodges & Patterson, 1995; Locascio et al., 1995; Ribeiro et al., 2007; Tierney et al., 2001). Memory impairment in many non-Alzheimer dementias (e.g., depression, vascular dementia, alcohol-related dementia, Parkinson's disease), on the other hand, involves relative difficulty in retrieving information from memory rather than encoding/consolidating the information in the first place.
In these patient groups, recognition-based memory performance tends to be considerably less impaired than performance on recall tests (e.g., Lachner et al., 1994; Mormont et al., 2003; Tierney et al., 2001), as the recognition format is believed to circumvent effortful retrieval demands (e.g., Butters et al., 1985).
Item-based recognition tests, involving the discrimination of familiar targets from unfamiliar (i.e., novel) distractors, have not proven to be as sensitive to the early stages of AD as delayed recall measures (Dalla Barba, 1997; Lipinska & Backman, 1997; Tierney et al., 1996; Welsh et al., 1991). In some studies of recognition memory where statistically significant group differences have been reported between AD patients and control groups, the effect size of the difference is so small that the clinical significance of the difference is questionable (e.g., 9.05 vs. 9.77 out of a total score of 10 words in Chen et al., 2000). Item-recognition measures have been reported as lacking sensitivity for MTL pathology in patients who are considered amnesic using other measures (Reed & Squire, 1997). Lesion and functional imaging studies support the view that item-recognition is less dependent on “higher-order” MTL structures like the hippocampus than recall-based measures (Mayes et al., 2002; Montaldi et al., 2006). Item-recognition measures are also commonly affected by the occurrence of ceiling effects in control groups (Hodges, 2000; Piercy & Huppert, 1972; Welsh et al., 1994).
Rather than relying on existing measures of recall and recognition memory with their established limitations, we have argued that researchers and clinicians should be striving to develop more sensitive and specific diagnostic tools for detecting the signature memory deficits present in preclinical AD (Lowndes & Savage, 2007). The most widely accepted theory of hippocampal functioning is that the hippocampus receives, actively binds together, and encodes the complex relational structure of personal experiences (Cohen et al., 1999; Henke et al., 1997; Wallenstein et al., 1998). Paired associate learning (PAL) tasks assess the ability to rapidly form and remember associations between attributes of an experience, and they are sensitive to MTL dysfunction (Cohen & Eichenbaum, 1993). On theoretical and empirical grounds, PAL tests would seem an ideal choice for detecting the MTL-based memory impairment characteristic of early AD, and the paradigm has previously been shown to be sensitive in the preclinical stages of AD (Fowler et al., 2002; Lee et al., 2003; Swainson et al., 2001).
Standardized PAL tasks are typically administered using a cued-recall format, however, and many healthy elderly people find this paradigm difficult (Dunlosky & Hertzog, 1998; Salthouse, 1994, 1995), as do memory-impaired patients without MTL damage, including those with depression (Golinkoff & Sweeney, 1989) and a range of other neurological conditions (Kinsbourne & Winocur, 1980; Salmond et al., 2005; Squire & Shimamura, 1986). Although cued-recall PAL may be sensitive to early AD, poor performances are not necessarily because of MTL dysfunction, rendering its clinical usefulness as a tool for detecting AD unacceptably poor because of its low specificity.
Minimal research has been conducted on the ability of the associate-recognition paradigm to discriminate AD from non-AD related memory impairment (see Gallo et al., 2004; Lee et al., 2003). Associate-recognition involves paired-associate learning and recognition that a pair of items occurred together in a previously presented list. For example, in verbal variants participants learn a list of word pairs (e.g., horse-forest, ship-seat) and must then discriminate between the intact pairs (e.g., horse-forest) and rearranged pairs (e.g., horse-seat). The paradigm should be highly sensitive to the binding processes carried out by the MTL, while being less dependent on effortful memory retrieval processes than cued-recall PAL because there is no requirement to explicitly retrieve the individual words from the learning episode. In other words, associate-recognition should be as sensitive to AD as cued-recall PAL tasks, but it may have the benefit of being more specific to MTL-related memory impairment than cued-recall PAL. A number of researchers have suggested that associate-recognition tests may also be more sensitive to the MTL as a functional unit than item-recognition tests. Being fundamentally relational in nature, they require the recollection of more contextual information from the original learning event and therefore load more heavily on hippocampal processes than item-recognition (Aggleton & Shaw, 1996; Davachi, 2006; Mayes et al., 2004; Tulving & Markowitsch, 1998; Vargha-Khadem et al., 1997).
Against this background, the first aim of the current study was to compare the ability of verbal associate-recognition and traditional verbal cued-recall PAL to differentiate a group of early AD patients from healthy elderly (HE) participants. This comparison represents a critical first step in establishing the paradigm's broader clinical utility. We also investigated the effect of imageability of the test stimuli in discriminating AD patients from elderly controls. A concreteness variable was introduced to parallel a concrete/abstract sub-structuring apparent in the PAL subtest of early versions of the Wechsler Memory Scale. Paivio et al. (2000) have shown that the ability to generate interactive visual images facilitates verbal PAL performance, independently of the effect of stimulus relatedness. Savage et al. (2002) established that abstract pairs were less easily learned in two samples of epileptic patients (i.e., the typical concreteness effect was observed; e.g., Paivio, 1991), and the manipulation served here to stratify the predicted level of difficulty experienced in learning the pairings.
It was hypothesized that AD patients would perform as poorly on the associate-recognition test, relative to their healthy peers (i.e., to account for differences in the baseline guessing rate), as they would on the cued-recall version, thereby establishing the sensitivity of this novel recognition paradigm to early AD. We also hypothesized that, at an individual level of analysis, associate-recognition would demonstrate sensitivity and specificity similar to those of cued-recall. Specificity was hypothesized to be similar across the two tests because the control participants were expected to perform relatively well on cued-recall PAL, having been included in the study on the basis of performing within the average range on a free-recall memory task. If the sensitivity and specificity of the tasks were similar, this would support the idea that associate-recognition may be a viable alternative to traditional cued-recall PAL. We hoped this would encourage further development and research into this paradigm, as it should, in principle, have greater specificity in discriminating MTL from non-MTL forms of memory disorder. Finally, it was suspected that performance on the abstract stimuli in the tests might discriminate the AD group from the healthy elderly better than the concrete stimuli, because abstract stimuli are generally more difficult to encode according to Dual Coding Theory (Paivio, 1991).
METHOD
Participants
The final AD sample consisted of 22 patients diagnosed with probable AD according to the NINCDS-ADRDA diagnostic criteria (McKhann et al., 1984). Six additional patients were excluded from the study as they failed to complete the full assessment (four failed to complete the Cued-Recall PAL task and two withdrew from the study prematurely). The clinical diagnoses of AD were made by experienced psychogeriatricians (authors DA and EC). Eighty-two percent (18 of 22 patients) had structural neuroimaging to rule out alternative causes for the patients' dementia (e.g., stroke, tumor, hydrocephalus). They were recruited from a number of hospital and private clinics in Melbourne, Australia, with approval from the human research ethics committees of the relevant institutions.
Patients had a Mini-Mental State Examination (MMSE; Folstein et al., 1975) score equal to or above 20 and were therefore classifiable as having “mild AD” (mild impairment = MMSE score ≥20; Folstein et al., 1975). None were eligible for a diagnosis of depression according to the Montgomery-Asberg Depression Rating Scale (MADRS; Montgomery & Asberg, 1979) using a cut-off of 15 as advised by Leentjens et al. (2000), or the short form of the Geriatric Depression Scale (GDS; Burke et al., 1991) using a cut-off of eight as advised by Yesavage et al. (1983). All patients spoke English as their first language and had no current psychiatric illness, or history of neurological illness, cardiac arrest, cardiac surgery or head injury. Patients were not excluded based on taking cognition-enhancing medication at the time of the assessment (e.g., acetylcholinesterase inhibitors).
The healthy elderly (HE) comparison sample consisted of 50 participants recruited from retirement villages around Melbourne. Participants denied memory or other cognitive impairment, and scored above 25 on the MMSE. Participants were excluded if they reported current psychiatric illness or a history of neurological illness, cardiac arrest, drug abuse or dependence, or current use of medication with a detrimental cognitive effect (e.g., benzodiazepines; n = 2). Participants were also excluded if they presented with a hearing impairment or visual impairment that affected their ability to complete the assessment. All participants spoke English as their first language. Six participants were excluded based on performing at a level more than one standard deviation below the mean on one of the following tests: Full-Scale IQ (FSIQ) predicted from the National Adult Reading Test (NART; Nelson & Willison, 1991), the Hopkins Verbal Learning Test–Revised (HVLT-R; Benedict et al., 1998), or the Vocabulary or Digit Span subtests of the Wechsler Adult Intelligence Scale–Third Edition (Wechsler, 1997). Participants with clinically significant depressive symptoms indicated on the GDS (Burke et al., 1991) were also excluded from the study. See Table 1 for further demographic information regarding the AD and HE samples.
Table 1 notes: ** p < .001; † percentile score; ∓ scaled score.
Materials
Two stimulus lists of eight semantically/associatively unrelated word pairs were used. Presentation order of these stimulus lists was counterbalanced across Associate-Recognition (A-R) and Cued-Recall (C-R) test administrations. Four of the word-pairs in each stimulus list were highly imageable or concrete (e.g., horse-forest) and four were less imageable or abstract (e.g., open-fresh). The materials were adapted from those used by Savage et al. (2002), who investigated PAL in patients with temporal lobe epilepsy.
For the purposes of recognition testing, the first (cue) word of each pair was presented at the top of each page of a stimulus booklet to cue participants' recognition of the second (target) word in the pair. Four alternative target words (i.e., one target and three foils) were listed below the cue, and cues and targets were always within-class alternatives with respect to imageability (i.e., concrete cues were listed only with concrete target alternatives). See Fig. 1 for an example of two pages from the stimulus booklet. The battery also included background and screening cognitive measures to ensure the participants fulfilled the inclusion criteria for the study (e.g., MMSE, NART, HVLT-R, GDS, MADRS, WAIS-III Vocabulary and Digit Span subtests).
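The layout of one booklet page can be sketched in code. This is a minimal illustration only. It assumes that the three foils on a page are the targets of the other pairs in the same imageability class, which the description above does not state; the pairs marked hypothetical are placeholders rather than actual study stimuli, and `build_page` is an invented helper name.

```python
import random

# Illustrative concrete pairs as (cue, target). "horse-forest" and "ship-seat"
# are examples quoted in the text; the other two are hypothetical placeholders.
CONCRETE_PAIRS = [
    ("horse", "forest"),
    ("ship", "seat"),
    ("lamp", "river"),   # hypothetical
    ("coat", "bridge"),  # hypothetical
]


def build_page(pairs, cue, rng):
    """Return one four-alternative page for the given cue.

    Assumption: the foils are the targets of the other pairs in the same
    imageability class, which yields one target and three foils.
    """
    target = dict(pairs)[cue]
    foils = [t for c, t in pairs if c != cue]
    options = [target] + foils
    rng.shuffle(options)  # randomize the position of the target on the page
    return {"cue": cue, "target": target, "options": options}


page = build_page(CONCRETE_PAIRS, "horse", random.Random(0))
chance_rate = 1 / len(page["options"])  # guessing rate with four alternatives
```

With four alternatives per page, the guessing rate in this sketch is one in four.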
Procedure
After providing voluntary informed consent, all participants were assessed over two sessions scheduled one week apart. The A-R test was administered in the first session (to minimize potential interference from the concurrently administered HVLT-R on memory recall) and the C-R analogue was always presented in the second session. The general procedure for the administration of these tests was identical and involved reading aloud a set of standardized instructions, then providing a short practice trial, and an example of a verbal and visual strategy that could be used to enhance learning (e.g., Dunlosky & Hertzog, 1998).
The list of eight word-pairs was read aloud at the rate of one pair every five seconds. Immediate recognition or cued-recall was tested after each of three list presentations (each of which contained a different fixed order of items). In the recognition test phases, participants were presented with successive pages of the recognition booklet and asked to identify which of four items had been previously paired with the cue item at the head of the page. This is a novel variant of associate-recognition, which typically involves a two-choice or yes/no format within which the target and rearranged pairings are presented. Our design was constructed to emulate the sequential presentation of cued-recall PAL, with the cue presented first, followed by recall of the target second word. No time limit was set for responding to each item, but participants were encouraged to guess after five seconds. Immediate feedback was given for each item, with the target word-pair identified when errors were made. After a 30-minute filled delay, the final recognition or cued-recall test was administered, without re-presenting the stimulus list and without feedback.
Data Analyses
Statistical significance was adjusted according to Bonferroni criteria (i.e., .05/number of comparisons). Data from the alternate stimulus forms, initially counterbalanced across the test formats, were combined in the main analyses. For each participant, the number of correct responses for the four concrete and the four abstract pairs was summed for each learning trial and the delay condition. Performance on the concrete and abstract stimuli was initially analyzed and presented together and then separately.
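The Bonferroni adjustment described above amounts to a single division. The sketch below is illustrative: the number of comparisons is not reported in the text, so the value of 5 used in the example is an assumption, and the function names are invented for this example.

```python
def bonferroni_alpha(family_alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold: family_alpha / n_comparisons."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return family_alpha / n_comparisons


def is_significant(p_value: float, family_alpha: float, n_comparisons: int) -> bool:
    """True if p_value falls below the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_alpha(family_alpha, n_comparisons)


# Assumed example: 5 comparisons (hypothetical, not taken from the study).
threshold = bonferroni_alpha(0.05, 5)
```

Under that assumed family size, a result with p = .03 would not count as significant, whereas one with p = .004 would.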
To investigate which PAL paradigm best predicted incident AD, we conducted forward stepwise binomial logistic regression analyses using participants' Total Learning scores (sum of learning trials 1 to 3; score range = 0–24) for each paradigm. The regression was then repeated using participants' Delay scores (score range = 0–8) from each paradigm.
RESULTS
Demographic Characteristics of the Samples
As shown in Table 1, there was no significant difference between the AD and the HE groups in terms of Age, GDS score, or Digit Span scaled score, all Fs < 1, or predicted FSIQ score, F(1,70) = 2.99, p > .01, η² = 0.04; there was a small trend for HE participants to achieve higher Vocabulary scaled scores, F(1,69) = 4.18, p = .05, η² = 0.06. As expected, the AD sample performed reliably lower than the HE sample on the MMSE, F(1,70) = 83.07, p < .001, η² = 0.55, HVLT-R Total Learning index, F(1,71) = 129.74, p < .01, η² = 0.65, and the HVLT-R Delay index, F(1,71) = 148.38, p < .01, η² = 0.68.
Effect of Test Format (A-R vs. C-R) on PAL Performance
Figure 2 presents means for the A-R and C-R PAL test formats for the AD and HE groups. The HE group performed considerably better than the AD group on both test formats, F(1,70) = 195.02, p < .01, η² = 0.74. Both groups performed better on the A-R format than the C-R analogue, F(1,70) = 37.40, p < .01, η² = 0.35. Importantly, no interaction was found between Test Format and Group, F(1,70) = 0.30, p > .01, η² < 0.01, indicating that the A-R test discriminated the HE group from the AD group as effectively as the more traditional C-R version.
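The effect sizes reported with these F tests can be checked from the F values and degrees of freedom alone. The sketch below assumes the reported values are partial eta squared, computed as (F × df_effect) / (F × df_effect + df_error); the text does not say which variant of eta squared was used, so this is an assumption, and the function name is chosen for this example.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# F values and degrees of freedom as reported in the text above.
group_effect = partial_eta_squared(195.02, 1, 70)   # main effect of Group
format_effect = partial_eta_squared(37.40, 1, 70)   # main effect of Test Format
interaction = partial_eta_squared(0.30, 1, 70)      # Test Format by Group
```

Under that assumption, the three values round to 0.74, 0.35, and below 0.01, in line with the figures given for this analysis.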
Classification Accuracy by A-R and C-R Paradigms using Logistic Regression
Stepwise logistic regression analyses were conducted to assess the diagnostic usefulness of the two tests. First, the A-R Total Learning score was entered into the model alone and its classification accuracy was significant at 95.8% (3 misclassifications out of 72 participants; χ² = 67.10, df = 1, p < .001). The C-R Total Learning index was then entered into the model and it significantly improved classification to 97.2%, reducing misclassifications to 2 participants (χ² = 9.85, df = 1, p < .01). These indexes were then reentered into the model in the reverse sequence and the C-R Total Learning index demonstrated a classification accuracy of 93.1% (5 misclassifications out of 72 participants; χ² = 68.20, df = 1, p < .001). When A-R Total Learning was entered, it significantly improved classification to 97.2%, reducing misclassifications to 2 (χ² = 8.74, df = 1, p < .01). As we were conceptually interested in the relative diagnostic specificity of the tests, we compared their specificity while holding sensitivity constant. A sensitivity of 0.86 (i.e., 19 out of 22 AD patients) was chosen as this cut-off was easily indexed from both tests. We found that at this sensitivity, the A-R test provided a specificity of 1.0 and the C-R test a specificity of 0.96 (2 misclassifications out of 50).
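The sensitivity and specificity figures above follow directly from the classification counts. The sketch below uses counts read from the text (22 AD patients, 50 HE participants, 19 AD patients correctly identified at the matched cut-off, and 0 or 2 HE participants misclassified); the function and variable names are chosen for this example and are not part of the original analysis.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of AD patients correctly classified as AD."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of healthy participants correctly classified as healthy."""
    return tn / (tn + fp)


def accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Overall proportion of participants classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Counts at the matched sensitivity of 19 out of 22 AD patients.
a_r = {"tp": 19, "fn": 3, "tn": 50, "fp": 0}  # Associate-Recognition
c_r = {"tp": 19, "fn": 3, "tn": 48, "fp": 2}  # Cued-Recall

a_r_specificity = specificity(a_r["tn"], a_r["fp"])
c_r_specificity = specificity(c_r["tn"], c_r["fp"])
```

With these counts, sensitivity is 19/22 (about 0.86) for both tests, and specificity is 1.0 for A-R and 0.96 for C-R. The corresponding overall accuracies are 69/72 and 67/72.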
Regression analyses were repeated for the Delay index scores of both tests. When the A-R Delay index was entered into the model, its classification accuracy was significant at 88.9% (8 misclassifications out of 72; χ² = 58.37, df = 1, p < .001). When the C-R Delay index was entered, it significantly improved classification to 95.8%, reducing misclassifications to 3 participants (χ² = 14.39, df = 1, p < .001). When the entry sequence was reversed, the C-R Delay index classification accuracy was 95.8% with 3 cases misclassified (χ² = 68.33, df = 1, p < .001). When the A-R index was added, it significantly improved the model but did not lead to the reclassification of any cases, and hence the overall model accuracy remained at 95.8% with 3 cases misclassified (χ² = 4.34, df = 1, p < .05).
Effect of Stimulus Imageability on PAL Performance
Data presented in Fig. 2 were reanalyzed to investigate the effect of Stimulus Imageability on PAL performance. For each participant, the number of correct responses for the four concrete and the four abstract pairs was summed separately for each Learning Trial and Delay.
From Panel A in Fig. 3, it is clear that the HE group benefited more from the imageability of the concrete stimuli than did the AD patients during the A-R test; the Stimulus Imageability by Group interaction was significant, F(1,70) = 19.48, p < .01, η² = 0.22. There was no three-way interaction between Stimulus Imageability, Trial, and Group, F < 1. In Panel B, the same pattern emerged for the C-R data, where the HE group benefited considerably more from the imageability of the concrete stimuli than did the AD patients, for the interaction, F(1,70) = 53.18, p < .01, η² = 0.43; again, no three-way interaction was found, F < 1. Comparison across the two panels clearly indicates that the difference in performance between the AD and HE groups is much larger for the concrete pairs (presented in bold lines) than for the abstract pairs (presented in dashed lines) using both A-R and C-R test formats.
DISCUSSION
Verbal associate-recognition is a relational memory paradigm not widely used as a tool for the detection of early AD. This study clearly demonstrates that a verbal associate-recognition paradigm, containing arbitrarily associated words, can be as effective as a cued-recall analogue for discriminating patients in the early stages of AD from healthy elderly people. This result was found both at a group and individual level of analysis. Further analysis revealed the healthy elderly sample performed exceptionally well on the concrete stimuli in both versions of the PAL task but relatively poorly (and close to the AD group's average performance) on the abstract stimuli in both test conditions. AD patients performed poorly on concrete and abstract word-pairs in both PAL test conditions. This finding suggests that a verbal associate-recognition task containing all concrete and no abstract stimuli may demonstrate even greater discriminatory power, warranting further investigation.
In the current study, the AD patients performed very poorly on the Associate-Recognition and Cued-Recall PAL tests in comparison to the healthy elderly group. Overall, their recognition performance was only marginally superior to their cued-recall performance, a difference easily attributable to the 25% chance-guessing rate in the recognition condition. Furthermore, there was no evidence of learning in the AD group with repeated exposure to the material in either of the PAL test conditions. This finding is consistent with impaired ability to encode/consolidate new information in early AD (Greene et al., 1996), even in those early AD patients taking acetylcholinesterase inhibitors.
For memory tasks to be clinically efficacious they must have the capacity to discriminate impaired from non-impaired people not only at a group level but also at the individual level. The discrimination accuracy of the two tests in the current study was essentially equivalent at the individual level of analysis; recognition was marginally superior to the cued-recall analogue using Total Learning scores, whereas the reverse was true when delayed recognition/recall scores were the basis for comparison. On delay, however, the range of scores for comparison was limited to 0 to 8, and the higher baseline guessing rate in the recognition test lifted average performance of all participants well above zero, further reducing the range of scores. Further research may demonstrate that the inclusion of additional items in the delay trial of the associate-recognition task increases its discrimination accuracy. This could include re-presentations of the initial cue words, with target second words flanked by alternative distractors (taken from the original stimulus list).
It is widely documented that recall measures are more sensitive to the early stages of AD than commonly used item-recognition tasks. It has been suggested that even amnesics perform disproportionately worse on recall measures because recognition tests are generally easier (Reed & Squire, 1997). This finding was not replicated in the current study comparing associate-recognition and cued-recall PAL. Recall may be more dependent than recognition on frontally mediated executive functions, such as organized search and retrieval (Bunce, 2003; Quamme et al., 2004). Dependence on these non-MTL retrieval processes could render cued-recall PAL less capable of discriminating patients with memory encoding impairments (i.e., early AD) from patients with memory retrieval impairments. This is clearly an area for further research as non-AD participants in the current study were not memory impaired.
The degree to which participants were required to engage in memory search and retrieval processes during the Associate-Recognition test is unclear from this study. Research has suggested that associate-recognition may represent a hybrid of processes typically involved in recall and recognition (Gronlund & Ratcliff, 1989; Nobel & Shiffrin, 2001). The Associate-Recognition test supposedly requires the recollection of quite specific contextual information from the initial learning episode, in order to ascertain which initial word was paired with which second word in the stimulus list. In contrast with the Cued-Recall test, however, the content of the learning episode (i.e., the actual words) did not need to be retrieved, as the words were presented within the framework of the test. On this basis, the Associate-Recognition test would have conceivably placed less demand on memory search and retrieval processes than the cued-recall version. A reduction in the efficiency of these same memory retrieval processes is widely agreed to occur with healthy aging (Burke & Light, 1981; Howard et al., 1991). This may explain why the Associate-Recognition test's diagnostic specificity was marginally higher than the Cued-Recall test using Total Learning scores when the sensitivities of the two tests were matched at 86%.
To identify AD in the preclinical stages of disease, where different MTL structures may be variably affected, recognition tests need to be especially sensitive to the functional integrity of the MTL as a whole. Associate-Recognition tests, involving a combination of both familiarity-based and recollective memory processes, are likely to be highly sensitive to the integrity of the MTL as a functional unit. Yonelinas (1994, 1997, 2001) argued that cognitive, neuropsychological, and neuroimaging studies indicate that recognition judgments based on familiarity alone and those requiring some recollection of the learning event are “behaviorally, neurally, and phenomenological distinct memory retrieval processes” (Yonelinas et al., 2001, p. 1363). Whereas some support for an opposing view has been found, that the hippocampus is important for supporting both familiarity and recollection (e.g., Norman & O'Reilly, 2003; Wixted & Squire, 2004), this conceptualization would predict that early AD patients should perform poorly on recall and familiarity-based tasks and this is not always the case (Dalla Barba, 1997; Karlsson et al., 2003; Westerberg et al., 2006).
Another finding of this study was the superiority of concrete or highly imageable word pairs in discriminating early AD patients from healthy elderly participants within PAL. Both concrete and abstract stimuli were initially included in the stimulus lists to stratify level of difficulty. It was suspected that abstract stimuli might be more sensitive to early AD than concrete stimuli, as abstract stimuli are generally more difficult to encode according to Dual Coding Theory (Paivio, 1991). However, this was not uniformly the case as the AD group performed similarly poorly on both concrete and abstract pairs. The fact that the healthy elderly group performed considerably worse on the abstract pairs than concrete pairs resulted in the two groups performing most disparately on the concrete pairs, raising doubts about the utility of including abstract stimuli in associate-learning tasks developed for use in elderly populations.
One interpretation of the lack of a concreteness effect for the AD group is that, like the healthy elderly, they were more successful at formulating mental images for the concrete pairs than for the abstract pairs (despite being provided with verbal and visual strategies), but they were less able to encode or bind the images into memory because their MTL was damaged. In support of this idea, Jones (1974) reported that during verbal PAL tests, the amnesic patient HM, who had undergone bilateral medial temporal resections, was able to form and describe mental images he invoked to facilitate his memory for the word-pair associates. The images he produced on different learning trials changed, however, without him showing any indication that he had previously produced an alternative image. Jones reported that although HM used mental images as a mnemonic strategy, they were forgotten just as the words themselves were forgotten. Unlike HM, however, the clinically diagnosed AD patients in the current study were likely to have some loss of semantic memory in addition to episodic memory impairment (Blackwell et al., 2004; Hodges & Patterson, 1995), and as a result the imageable information they may have drawn on to facilitate encoding of concrete stimuli may not have been as semantically rich as that available to the healthy elderly. Several studies have reported that verbal PAL performance by AD patients is affected to some degree by a breakdown in the structure of, or relationships within, semantic memory (Granholm & Butters, 1988; McWalter et al., 1991; Salmon et al., 1988; Spaan et al., 2005).
The advantage of verbal variants of the PAL paradigm may be that they allow for the assessment of separate arbitrary and semantic aspects of memory within a single task (Elwood, 1997), without having to administer multiple neuropsychological tasks to capture both domains.
This study raises a variety of additional questions and provides multiple avenues for future research. It will be essential to determine whether a verbal associate-recognition task can discriminate early AD patients from those with retrieval-based memory impairment, such as patients with subcortical forms of dementia. One aim of the current study was to recruit a sample of elderly depressed patients with subjective memory impairment to investigate this issue using the verbal associate-recognition test. However, this aim was not fulfilled because of the difficulty of confidently excluding the presence of very early AD in many older adults with depression. Future research may benefit from recruiting younger people, aged 40 to 60 years, who are unlikely to have early AD but who have memory impairment from conditions such as depression or Parkinson's disease, which do not primarily affect MTL functioning.
ACKNOWLEDGMENTS
For assistance with patient recruitment the authors acknowledge the support of Ms. Jennine Melville, Manager, Cognitive Dementia and Memory Service (CDAMS), Kingston Centre, Southern Health, Victoria. For assistance with statistical analyses the authors acknowledge Dr. Simon Moss, School of Psychology, Psychiatry and Psychological Medicine, Monash University, Clayton.