Introduction
International migration, with migrants defined as people who have moved across an international border, is an increasingly important phenomenon worldwide. Owing to population aging, dementia among migrants is emerging as a public health concern. In 2017, 6.5 million migrants aged 65 years or older were living in Europe, and nearly 400,000 dementia cases were estimated in this population (Canevelli et al., 2019).
To develop diversity-sensitive models of diagnosis and care, neuropsychologists must keep in mind the impact of culture on performance on cognitive tests. Culture represents the set of learned traditions and living styles shared by the members of a society and includes the ways of thinking, feeling, and behaving (Harris, 1983). Ardila (2020) identified different cultural variables influencing human behavior in a neuropsychological context, such as language, the quality and the degree of formal education, and the pattern of abilities and values developed as a consequence of the cultural background (such as familiarity with a one-to-one relationship, background authority of the examiner, the concept of “best performance”). These considerations raise the question of whether tests commonly used in neuropsychological assessment are free from cultural biases.
The Clock Drawing Test (CDT) is one of the most widely used cognitive tests. The subject is given a sheet of white paper and instructed to draw a clock. In the free-drawn method, the subject is asked to draw a clock from memory. In the pre-drawn method, the subject is asked to draw the numbers on a pre-drawn clock face and to set the hands at a fixed time. Another version requires only setting the hands at a fixed time on a pre-drawn clock complete with contour and numbers. Several different CDT scoring methods have been developed, including quantitative and qualitative systems, but no consensus exists regarding which scoring method is the most accurate (Spenciere et al., 2017).
The CDT draws on many cognitive skills: comprehension of the examiner’s request; memory, to retain the instruction to set the hands at a fixed time once the clock face is complete; executive functions, to coordinate planning, organization, and simultaneous processing (including corrections and inhibition of incorrect responses such as perseveration); visual-perceptual and visual-motor abilities, to represent the clock internally, to translate the mental representation into a motor program, and to monitor the output; and linguistic competence, for the graphomotor representation of numbers (Freedman et al., 1994). Neuroanatomical regions involved in performing the CDT include both cortical (dorsolateral prefrontal cortex, frontal, and parietal lobes) and subcortical structures (thalamus, caudate, and corpus callosum) (Eknoyan et al., 2012; Supasitthumrong et al., 2019). Because it engages these various cognitive functions and their underlying neuroanatomical areas, the CDT is considered a cognitive screening tool, providing a measure of the overall cognitive performance of the individual (Ehreke et al., 2010; Shulman, 2000).
However, the use of the CDT as a screening test in a cross-cultural context is still debated (Franzen et al., 2020). Given the ease of administration and the limited linguistic competence required, the CDT may be deemed appropriate to support a culture-fair assessment of the individual’s global cognitive functioning (Parker & Philp, 2004). In line with this view, the CDT has been included as a subtest of the European Cross-Cultural Neuropsychological Test Battery (CNTB) (Nielsen et al., 2018), which demonstrated cross-cultural diagnostic properties for the evaluation of dementia in targeted minority and majority populations.
The present systematic review aimed to summarize the available evidence on the impact of the most commonly available and measurable cultural variables on CDT performance. Special attention was paid to the language used for the administration of the test, education (considering both the level and the quality of education), illiteracy (i.e., the absence of formal education or the inability to read and write), the level of acculturation (i.e., cultural modification of a group by adopting certain values and practices of a culture that is not originally their own), and ethnicity (intended as any human grouping that shares common racial, cultural, and linguistic characteristics).
Methods
Study design
This systematic review was conducted and reported in accordance with the Preferred Reporting Items for Systematic reviews and Meta-Analyses for Protocols 2015 (PRISMA-P 2015) guidelines (Shamseer et al., 2015).
Search strategy
A literature search of original articles was conducted on three comprehensive medical databases (Web of Science, PsycInfo, and PubMed) from their respective dates of inception up to March 2022.
A targeted search was based on predefined search terms, which were combined with Boolean operators to build the search strings. The key concept combinations can be described as follows: (“Clock drawing test” OR “clock” OR “CDT”) AND (“cultur*” OR “educat*” OR “norm*” OR “ethnic*” OR “illiter*” OR “languag*”). These terms were adapted to the search fields and syntax of each bibliographic database (see Appendix 1 in Supplementary Material for the complete search syntax used for each electronic database).
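The concept combination above can be assembled programmatically. The following is a minimal sketch in Python, assuming a generic Boolean syntax with straight quotation marks; the helper names `or_group` and `build_query` are hypothetical, and the database-specific field tags listed in Appendix 1 are not reproduced here.

```python
# Term groups taken from the search strategy described above.
TEST_TERMS = ["Clock drawing test", "clock", "CDT"]
CULTURE_TERMS = ["cultur*", "educat*", "norm*", "ethnic*", "illiter*", "languag*"]


def or_group(terms):
    """Quote each term, join the terms with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Combine the OR-groups with AND (hypothetical helper, generic syntax only)."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(TEST_TERMS, CULTURE_TERMS)
print(query)
```

Keeping the two concept groups as separate lists makes it straightforward to adapt the same terms to each database’s own syntax.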
Inclusion and exclusion criteria
The search focused on human studies of adults aged 19 years or older. No language restriction was applied. We included only original reports that investigated the effect of at least one cultural variable on CDT performance. Specifically, we explored the effect of: (a) language; (b) education; (c) illiteracy; (d) level of acculturation; and (e) ethnicity. Exclusion criteria were: (1) studies including minors and (2) studies that considered only the performance on the Clock Reading Test or the Copy of a Clock.
Selection of the studies
Search results were systematically screened against the inclusion and exclusion criteria by three reviewers (GM, IC, and AN). We used a three-step screening process. First, duplicates were removed both automatically and manually. Then, papers were screened by title and abstract. Finally, the full texts of relevant studies were retrieved and further assessed for eligibility. In case of doubt about eligibility, the paper was reviewed by all three authors and included if two out of three were in agreement. Manual searches were extended to papers describing CDT scoring methods cited in the selected articles to ensure that significant studies would not be missed. The study selection process is detailed in a PRISMA flow diagram (Figure 1).
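The rule for doubtful papers (inclusion when two of the three reviewers agree) amounts to a simple majority vote. A minimal sketch, assuming each reviewer’s judgment is recorded as a boolean; the function name `include_paper` and the example votes are hypothetical:

```python
def include_paper(votes):
    """Return True when at least two of the three reviewers vote to include."""
    if len(votes) != 3:
        raise ValueError("expected exactly three reviewer votes")
    return sum(bool(vote) for vote in votes) >= 2


# Hypothetical votes from the three reviewers (GM, IC, AN) on one doubtful paper.
decision = include_paper([True, False, True])
print(decision)
```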
Data extraction and synthesis
Data were extracted from the selected studies and reported in a dedicated database. The following information was retrieved from each study: title, authors, year of publication, geographic area of the study (defined as the one in which the patients were enrolled), patient population, sample size, scoring systems used for the CDT, and modalities of administration of CDT (free drawn, pre-drawn, or only hands setting), cultural variables potentially affecting the CDT score and type of influence of each cultural variable considered in the study (Appendix 2 in Supplementary materials).
In the papers where multiple CDT scoring systems were used, each method was considered independently as a single analysis. Therefore the total number of analyses considered is higher than the number of studies included in the review. In the case of more than one publication on the same population, the most informative paper was considered. Data were synthesized qualitatively and descriptive analyses were performed to describe the frequency of evaluation of each of the five cultural variables considered.
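The counting rule described above, in which every scoring system applied within a study is treated as a separate analysis, can be sketched as follows. The records are hypothetical placeholders, not data extracted for this review; only the counting logic is illustrated.

```python
from collections import Counter

# HYPOTHETICAL extraction records: study label -> scoring systems applied.
records = {
    "Study A": ["Shulman"],
    "Study B": ["Shulman", "Sunderland"],
    "Study C": ["Sunderland", "Royall", "Shulman"],
}

# One analysis per (study, scoring system) pair.
analyses = [(study, system) for study, systems in records.items() for system in systems]

# Frequency of use of each scoring system across analyses.
usage = Counter(system for _, system in analyses)

print(len(records), "studies yield", len(analyses), "analyses")
print(usage.most_common(1))
```

With this rule, the number of analyses exceeds the number of studies whenever at least one study applies more than one scoring system.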
Results
Search results
The PRISMA flow diagram of the literature search is shown in Figure 1. Overall, 4431 papers were identified from a structured search of three databases. Seven hundred thirty-one duplicates were identified and removed. We then excluded 3411 articles by screening the titles and the abstracts. A total of 289 papers were assessed for eligibility and sought for full-text screening; the full text of 28 of them was not available. One hundred and two papers met the inclusion criteria. Three additional studies were identified from citation searches on the relevant articles. A total of 105 studies were thus included in the systematic review.
In 28 studies, more than one CDT scoring system was used; since we considered each scoring method separately, we derived a total of 160 analyses from the 105 studies included. The studies that were ultimately considered in the present analysis are listed in Appendix 2 in Supplementary materials.
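The selection counts can be checked with simple arithmetic. The sketch below uses only the figures reported above; the intermediate values (records remaining after deduplication and full texts actually retrieved) are derived here and are not stated in the text, and the variable names are illustrative.

```python
# Counts reported in the search results above.
identified = 4431
duplicates = 731
excluded_on_title_abstract = 3411
full_text_unavailable = 28
met_inclusion_criteria = 102
from_citation_search = 3

# Derived intermediate values (not stated explicitly in the text).
after_deduplication = identified - duplicates
assessed_for_eligibility = after_deduplication - excluded_on_title_abstract
full_texts_retrieved = assessed_for_eligibility - full_text_unavailable
included = met_inclusion_criteria + from_citation_search

# The two totals stated in the text should be reproduced exactly.
assert assessed_for_eligibility == 289
assert included == 105

print(after_deduplication, assessed_for_eligibility, full_texts_retrieved, included)
```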
Relevant studies were conducted in 37 different countries worldwide, mostly in the United States of America and Brazil, followed by Italy, China, and Japan. The geographical distribution of the considered studies is shown in Figure 2.
Most studies investigated the performance of the CDT in healthy subjects. Some studies enrolled outpatients referred to memory clinics, whereas others compared patients with dementia and healthy controls. The remaining papers examined subjects with other neurological disorders. The minimum sample size of the relevant articles was 40 subjects and the maximum was 1873 subjects. Only 17 studies involved fewer than 100 subjects, whereas 31 studies involved more than 400 subjects. In 77 out of 109 analyses (70.6%), age showed a negative correlation with CDT performance.
Due to the considerable number of scoring systems used, we performed an additional evaluation of the most used and culturally influenced scoring systems.
Scoring systems for CDT
Three studies did not specify the adopted CDT scoring method. Twenty-eight studies used more than one scoring method. Overall, 46 different CDT scoring systems were used in the selected papers; free-drawn clocks and quantitative systems were used in most cases. The most used scoring system was that of Shulman et al. (1993), followed by those of Sunderland et al. (1989) and Royall et al. (1998) (Table 1). As shown in Table 2, none of the most frequently used scoring methods is free from the influence of the level of education.
CERAD: consortium to establish a registry for Alzheimer’s disease; Fd: free-drawn; Hs: hands setting; MoCA: Montreal cognitive assessment; n/N (%): number of analysis where the effects were found/Number of analysis where the effects were investigated (percentage of studies where the effects were found); Pd: pre-drawn.
AA: African American; AM: American; AS: Asian; ESP: Spain; HW: Hispanic White; nHW: non-Hispanic White; USA: United States of America.
* Spanish, Korean, Chinese, and Filipino dialect.
The influence of culture
An influence of at least one cultural variable on CDT performance was found in 127 of the 160 analyses (79.4%). Of the 18 studies that recruited a multicultural sample, all but two found an effect of at least one cultural variable.
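The proportions reported in this section follow the n/N (%) convention used in the table notes. A minimal sketch that reproduces two of the reported percentages from their counts; the helper name `n_of_total` is hypothetical:

```python
def n_of_total(n, total):
    """Format a count as 'n/N (p%)' with the percentage to one decimal place."""
    return f"{n}/{total} ({100 * n / total:.1f}%)"


# Analyses in which age correlated negatively with CDT performance.
age_effect = n_of_total(77, 109)

# Analyses in which at least one cultural variable influenced CDT performance.
any_cultural_effect = n_of_total(127, 160)

print(age_effect)
print(any_cultural_effect)
```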
Language of administration
The language used for the administration of the CDT was examined in 8 studies (Table 2), mostly comparing performance between English-speaking and Spanish-speaking subjects. In three studies other languages were considered: Chinese, Korean, and Filipino dialects. The language used for test administration significantly affected the test score in only one study (LaRue et al., 1999), whose authors suggested that the effect may be mediated by differences in educational and income level or rural/urban origin.
Education
Quality of education
The effect of the quality of education on CDT performance was investigated in two studies (Hubbard et al., 2008; Johnson et al., 2006). In both, the Wide-Range Achievement Test-3 (WRAT-3) Reading subtest (Wilkinson, 1993) was administered along with the CDT. The WRAT-3 is a test of word familiarity and reading ability, considered a measure of estimated premorbid intelligence and a marker of quality of education (Manly et al., 2002). The two studies reported conflicting results. Hubbard et al. identified age and WRAT-3 reading scores as the only predictors of CDT scores assessed by Freund’s, Mendez’s, and Cahn’s global score scoring methods (Cahn, 1996; Freund et al., 2005; Mendez et al., 1992). The authors also showed that including the WRAT-3 reading scores as covariates reduces the effect of education and race on CDT performance; therefore, they suggested that normative scores for the CDT could be based on WRAT-3 scores instead of on subjects’ education and race. In contrast, Johnson and colleagues found a significant effect of WRAT-3 reading scores on several executive function tests, but not on CDT performance.
Level of education
The influence of the level of education was investigated in 154 analyses from 100 studies, with an influence on test performance documented in 118 analyses (76.6%), all revealing a positive correlation between educational level and CDT performance. Some authors found an effect of level of education on CDT performance only when subjects with a very poor education (defined differently across studies) were compared with all the others (Lessig et al., 2008; Ravaglia et al., 2003; Senger et al., 2019; Shao et al., 2020; Wolf-Klein et al., 1989). In contrast, Cooke et al. (2009) found that only completion of a tertiary educational level had a significant correlation with CDT performance. Sixteen analyses from 8 studies examined CDT suitability for subjects with low or high education (Table 3); their results are hardly comparable, since authors arbitrarily chose different cut-offs (between 5 and 9 years) to distinguish between low and high education. A limited specificity or sensitivity of the CDT in the assessment of low-educated and high-educated subjects was found in 10 and 4 analyses, respectively. In addition, Cecato and colleagues (Cecato et al., 2012) investigated the ability of the CDT, assessed with different scoring methods, to differentiate patients with different levels of education and scores on the Clinical Dementia Rating (CDR) Scale (Hughes et al., 1982).
However, in all subjects with a high educational level (> 11 years of education), CDT scores were not able to differentiate patients with very different CDR scores (0 vs. 2). With the modified Shulman and Sunderland scoring methods, all the subjects with more than 11 years of education obtained CDT scores above the cut-off, regardless of their CDR score. Only in subjects with less than 4 years of education was the CDT sufficiently accurate in identifying each level of CDR. Conversely, Scarabelot et al. (2019) showed that the use of the CDT in subjects with less than 4 years of education could be impaired by the high rate of refusals to perform the test.
Note. BRA: Brazil; HI HE: higher education; KOR: Korea; LO ED: lower education; SE: sensitivity; SGP: Singapore; SP: specificity; THA: Thailand; USA: United States of America.
In bold are sensitivity and specificity ≤ 65.
Thirty-six analyses failed to find a significant effect of education on CDT performance. However, these studies involved smaller populations, and most of them considered only specific ranges of education, primarily mid range (i.e., > 10 years) (Bruce-Keller et al., 2012; Caffarra et al., 2011; Gruber et al., 1997; Hill et al., 1995; Lowery et al., 2003; Royall et al., 1999; Yamamoto et al., 2004) or low range (i.e., < 6 years) (Alegret et al., 2012; Chan et al., 2005; Marcopulos et al., 1999; Storey et al., 2002).
Illiteracy
In 10 analyses from 8 studies, illiterate subjects were involved. Three studies defined illiterate as those subjects who never attended school or attended school for less than 1 year; in the other four studies, illiterate subjects were the ones who considered themselves unable to read and/or write, for example using the Literacy Questionnaire interview (Moon & Chey, 2004), or unable to respond to the “close your eyes” and “write a sentence” items of the Mini-Mental State Examination (Folstein et al., 1975). In three studies, to be considered illiterate, subjects had to meet both of the previous definitions. All studies but one showed a negative influence of illiteracy on CDT performance; the exception (Cassimiro et al., 2016) involved only subjects with less than 4 years of education and found better performance in subjects with 3–4 years of education than in subjects with less than 3 years.
Acculturation
Only two studies investigated the effect of acculturation on CDT performance (Nielsen & Jørgensen, 2013; Royall et al., 2003). The authors defined acculturation as a multidimensional process whereby members of one cultural group adopt the attitudes, values, and behaviors of another (Gordon, 1964). Acculturation was assessed with the Hazuda scale (Hazuda et al., 1988), investigating both English proficiency and the pattern of English versus Spanish usage, or with the Turkish adaptation of the Short Acculturation Scale for Hispanics (SASH) (Marin et al., 1987; Nielsen et al., 2012). The two studies obtained conflicting results: Royall et al. (2003) identified a significant but small effect of acculturation on CDT performance (all the sociodemographic variables combined explained 8% of the CDT variance; p < 0.001). In contrast, Nielsen and Jørgensen did not find a significant correlation between the CDT performance of Turkish migrants and either years of residence in Denmark or SASH score, in either literate or illiterate subjects.
Ethnicity
Ten studies investigated the effect of ethnicity on CDT scores, and most of them were conducted in the USA. In all but two studies, authors operationalized ethnicity as race, comparing Caucasians with other races. Better performance of Caucasians was found in half of the cases (Table 4). Nielsen et al. considered ethnicity as the country of origin and found that migrant minorities (Polish, Yugoslavian, Turkish, and Moroccan) displayed lower scores than Western European majorities (Belgian, Danish, German, Greek, Norwegian, and Swedish) (Nielsen et al., 2018). Some possible factors underlying the association between ethnicity and CDT score were identified in different papers, mainly age, level and quality of education, and degree of acculturation; nevertheless, ethnicity maintained an influence on test scores even when controlling for these variables.
AA: African American; BEL: Belgium; Ch: Chinese; DEU: Germany; DNK: Denmark; GRC: Greece; HW: Hispanic white; nChAs: non-Chinese Asian; nHW: non-Hispanic white; NOR: Norway; SWE: Sweden; USA: United States of America; WRAT-3: wide-range achievement test-3; WE: Western Europeans.
Two studies examined the accuracy of different scoring methods in detecting dementia in a multicultural population (Borson et al., 1999; Storey et al., 2002). Only two scoring systems proved sufficiently accurate in the target population, albeit with conflicting results (Table 5).
AA: African American; AF: African; AM: American; As: Asian; AUS: Australia; EU: European; HW: Hispanic white; nHW: non-Hispanic white; SA: South American; USA: United States of America.
* When a scoring system is defined as “suitable” it means that the author of the relative paper identified it as sufficiently accurate for the investigated multicultural population.
Discussion
The present study represents the first attempt to systematically present and discuss the available evidence on the influence of culture on the performance at the CDT. An influence of the considered cultural variables was found in most studies, in particular in three-quarters of the studies regarding the level of education (and almost all those regarding literacy) and in half of the studies regarding ethnicity and acculturation. Conversely, the language of administration of CDT seemed to have a negligible effect.
Most of the studies included in this review were conducted in America and Europe; few were conducted in Asia (mainly China and Japan) and Africa. Few studies recruited a multicultural sample. Sample sizes were very heterogeneous and several systems were used to score the CDT, thus limiting the validity of comparisons. None of the most used scoring systems proved to be free from the influence of cultural variables.
Only a few studies investigated the influence of the language in which the CDT is administered, and most of them found no significant effect. This is not surprising, since the CDT requires limited linguistic competence.
When investigating the effect of education on CDT performance, it is necessary to consider the quality of education as a possible confounding variable. Attending the same number of years of school does not mean having the same education in qualitative terms. The Reading Recognition subtest from the WRAT-3 (Wilkinson, 1993) can be used as a measure of reading ability and quality of education (Manly et al., 2002). However, the relationship between quality of education and CDT scores is still poorly investigated, and the results are mixed. Many studies focused on the effect of educational level on CDT performance, identifying a positive correlation: as the level of education increases, performance on the test improves. We hypothesize, following Ardila (2020), that two factors could mediate this relationship. First, the concept of familiarity: subjects with a higher educational level may be more familiar not only with the material administered but also with drawing and paper-and-pencil assignments; they may also be more accustomed to assessment contexts, including the one-to-one relationship, the background authority of the examiner, and the concept of best performance. Second, the relationship between the level of education and CDT performance might be mediated by cognitive reserve, defined as the “adaptability (i.e., efficiency, capacity, flexibility) of cognitive processes that help to explain differential susceptibility of cognitive abilities or day-to-day function to brain aging, pathology, or insult” (Stern et al., 2020).
The authors suggest that differences in cognitive results are determined by processes influenced not only by innate differences but also by lifetime exposure, including education, occupation, and social engagement. Several studies showed that higher levels of education are associated with a lower risk of dementia (Evans et al., 1997; Karp, 2004; Stern et al., 1994). Therefore, it is possible that a higher level of education, by increasing cognitive reserve, improves CDT performance. Results are mixed when investigating the effect of education on the test in specific education cohorts. Some studies have shown low accuracy of the CDT in subjects with low levels of education, while others have identified a low specificity or sensitivity of the test in subjects with high levels of education. The CDT may not be suitable for detecting cognitive impairment in low-educated and illiterate subjects because they may be excessively disadvantaged by limited training in the skills needed to perform the test; in addition, they may suffer from unfamiliarity with the task and the assessment setting. Conversely, in highly educated subjects the greater cognitive reserve and the increased familiarity with the task could make the test too easy, leading to an overestimation of their cognitive abilities. The high heterogeneity of enrolled cohorts and the different cut-offs used to identify levels of education contributed to the difficulty in interpreting these results. A small percentage of studies identified no correlation between educational level and CDT performance, but most of these studies recruited smaller cohorts.
All the analyses except one found a significant effect of illiteracy on CDT performance, both when illiteracy was defined as the absence of formal education and when it was defined as the inability to read and write. The authors of these studies explained the effect of illiteracy as a consequence of poor development of constructional skills and of planning, organization, simultaneous processing, and self-monitoring, all directly or indirectly trained in school (Kim & Chey, 2010; Mokri et al., 2012; Nielsen & Jørgensen, 2013). Kim and Chey showed that illiterate older people made errors similar to those of patients with Alzheimer’s dementia, specifically conceptual errors. It is noteworthy that none of the studies investigating the effect of illiteracy on CDT performance was conducted in the USA (where most of the studies included in the present review were conducted), suggesting that this issue may be underestimated.
The influence of acculturation on CDT performance has been investigated in few studies, which identified at most a small effect. One possible explanation lies in the characteristics of the questionnaires designed to assess the level of acculturation. Both of them assessed acculturation by comparing, across different contexts, how often subjects spoke their native language versus the language of the host nation. Given the absence of an effect of language on test performance, it is not surprising that acculturation measured in this way also shows no significant effect. However, acculturation is a multidimensional process; before ruling out an influence on the test, further studies investigating the construct from a different perspective would be desirable.
In the present review, we decided to include all papers investigating the effect of “ethnicity” intended as a broad category. This allowed us to consider all the studies that subdivided subjects based on any ethnic characteristic, such as race, culture of reference, country of origin, and language spoken; nonetheless, most studies defined ethnicity as race. Since most of these studies were conducted in the USA, this categorization is probably related to the long-standing history of migration to that country. Given that many migrants are second- or third-generation migrants, it would be difficult to categorize them based on their country of origin. We found that the majority of the scoring systems developed to evaluate CDT performance seem to be inaccurate in detecting dementia in a multicultural population. Moreover, half of the studies found a better performance of the Caucasian population when compared with other races. These results can be explained partly by the mediating effect of the quality of education. Manly et al. (2002) suggest that in the USA there is a great deal of discordance in the quality of education between Caucasian and African American subjects, and Avila et al. (2021), comparing the contribution of the level of education to cognitive reserve in Whites, Blacks, and Hispanics, found that educational attainment does not contribute to cognitive reserve similarly across different racial groups. Through the effect on cognitive reserve, the differences in the quality of education could explain the residual effect of ethnicity on CDT performance even when controlling for the level of education.
In addition, it is well known that most neuropsychological tools are designed in a Western context, explicitly for a WEIRD (that is, Western, Educated, Industrialized, Rich, and Democratic) population (Henrich et al., 2010), so it should not be surprising that patients belonging to the same cultural group as the test developer usually obtain better results.
This study has some limitations. First, we did not use specific scales to assess the quality of the included studies. Second, we included all the studies in which the effect of cultural variables on CDT performance was considered, regardless of their sample size and their sensitivity and specificity. Third, the heterogeneity of the definitions used (such as the definition of “low” and “high” education, ethnicity, etc.) and the several different CDT scoring methods used across studies did not allow us to perform a meta-analysis. Finally, we were not able to find 21 papers. However, our study also has strengths. We considered several cultural variables that can affect CDT performance, and we also took into account different scoring systems. Moreover, we were able to include all the studies with no limitation on the language of publication.
Future studies should better investigate the role of quality of education and level of acculturation in CDT performance, especially as mediating factors of ethnicity and level of education. Also, the suitability of the test for illiterate subjects and for subjects with a low or high level of education should be better studied to avoid an overestimation or an underestimation of cognitive impairment in these populations.
Conclusion
Based on these findings, CDT does not seem to provide a culturally unbiased assessment of global cognition. These results suggest caution when using neuropsychological tests in a multicultural context, even when limited linguistic competence is required.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S1355617722000662
Acknowledgements
Alessia Nicotra, Giorgia Maestri, and Marco Canevelli are supported by a research grant from the Italian Ministry of Health for the project “Dementia in immigrants and ethnic minorities living in Italy: clinical-epidemiological aspects and public health perspectives” (ImmiDem) (GR-2016-02364975).
Author contributions
Conceived and designed the study: SP, GM, AN. Collected the data: IC, GM, AN. Contributed data or analysis tools: GM, AN, SP, IC. Performed the analysis: GM, AN, SP, IC. Writing – original draft: IC, GM, AN, MC. Writing – review and editing: all authors discussed the results and contributed to the final manuscript.
Conflict of interest
The authors declare no competing interests.
Funding
The authors received no specific funding for this work.