
Assessment of Prorated Scoring of an Abbreviated Protocol for the National Institutes of Health Toolbox Cognition Battery

Published online by Cambridge University Press:  21 October 2020

Alexander D. Rebchuk
Affiliation:
Division of Neurosurgery, University of British Columbia, Vancouver, Canada
Arshia Alimohammadi
Affiliation:
Faculty of Medicine, University of British Columbia, Vancouver, Canada
Michelle Yuan
Affiliation:
Undergraduate Student, Faculty of Science, University of British Columbia, Vancouver, Canada; Vancouver Stroke Program, Vancouver, BC, Canada
Molly Cairncross
Affiliation:
Division of Physical Medicine & Rehabilitation, University of British Columbia, Vancouver, Canada; Rehabilitation Research Program, Vancouver Coastal Health Research Institute, Vancouver, Canada
Ivan J. Torres
Affiliation:
Department of Psychiatry, University of British Columbia, Vancouver, Canada; British Columbia Mental Health and Substance Use Services, Vancouver, Canada
Noah D. Silverberg
Affiliation:
Division of Physical Medicine & Rehabilitation, University of British Columbia, Vancouver, Canada; Rehabilitation Research Program, Vancouver Coastal Health Research Institute, Vancouver, Canada
Thalia S. Field*
Affiliation:
Faculty of Medicine, University of British Columbia, Vancouver, Canada; Vancouver Stroke Program, Vancouver, BC, Canada; Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, Canada
*Correspondence and reprint requests to: Thalia Field, MD FRCPC MHSc, Associate Professor, University of British Columbia, Stroke Neurologist, Vancouver Stroke Program, S169-2211 Wesbrook Mall, Vancouver, BC V6T 2B5. E-mail: thalia.field@ubc.ca

Abstract

Objective:

To evaluate an abbreviated NIH Toolbox Cognition Battery (NIHTB-CB) protocol that can be administered remotely without any in-person assessments, and explore the agreement between prorated scores from the abbreviated protocol and standard scores from the full protocol.

Methods:

Participant-level age-corrected NIHTB-CB data were extracted from six studies in individuals with a history of stroke, mild traumatic brain injury (mTBI), treatment-resistant psychosis, and healthy controls, with testing administered under standard conditions. Prorated fluid and total cognition scores were estimated using regression equations that excluded the three fluid cognition NIHTB-CB instruments which cannot be administered remotely. Paired t tests and intraclass correlations (ICCs) were used to compare the standard and prorated scores.

Results:

Data were available for 245 participants. For fluid cognition, overall prorated scores were higher than standard scores (mean difference = +4.5, SD = 14.3; p < 0.001; ICC = 0.86). For total cognition, overall prorated scores were higher than standard scores (mean difference = +2.7, SD = 8.3; p < 0.001; ICC = 0.88). These differences were significant in the stroke and mTBI groups, but not in the healthy control or psychosis groups.

Conclusions:

Prorated scores from an abbreviated NIHTB-CB protocol are not a valid replacement for the scores from the standard protocol. Alternative approaches to administering the full protocol, or corrections to scoring of the abbreviated protocol, require further study and validation.

Type
Brief Communication
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © INS. Published by Cambridge University Press, 2020

Introduction

Cognition is an important outcome in research trials and clinical practice (McInnes et al., 2017; Sheffield et al., 2018; Tang et al., 2018). To provide a common metric of cognition in the context of clinical research, the NIH Toolbox Cognition Battery (NIHTB-CB) was introduced. It is a brief, tablet-based cognitive assessment that has been validated for use in healthy populations and those with neurological and psychiatric disease (Carlozzi, Goodnight et al., 2017; Carlozzi, Tulsky et al., 2017; Weintraub et al., 2014).

The NIHTB-CB comprises seven instruments: two assessing crystallized cognition (Picture Vocabulary and Oral Reading Recognition) and five assessing fluid cognition (Flanker Inhibitory Control and Attention, List Sorting Working Memory, Dimensional Change Card Sort, Pattern Comparison Processing Speed, and Picture Sequence Memory) (Weintraub et al., 2013). Of these, both instruments that assess crystallized cognition and two that assess fluid cognition (List Sorting Working Memory and Picture Sequence Memory) can be modified for administration without any physical contact between examinee and tablet. The other three fluid cognition instruments are scored based on accuracy and reaction time, and thus require in-person inputs into the tablet.

In the context of the COVID-19 pandemic, where strict physical distancing guidelines have been implemented, there is a strong need for remote cognitive assessments (Gostin & Wiley, 2020). Our group has previously developed and validated a protocol for administering the NIHTB-CB using telemedicine to assess participants at remote sites (Rebchuk et al., 2019). However, this protocol still requires in-person conditions for some instruments. Recent guidelines published by the NIHTB-CB developers describe an abbreviated protocol, incorporating only the four instruments that can be administered entirely remotely (HealthMeasures Help Desk, 2020a).

We sought to explore whether a prorated score based on this abbreviated battery could provide a valid substitute for the standard score from the full battery. We assessed the agreement between prorated fluid and total cognition scores from the abbreviated protocol versus standard scores from the full protocol. The equations we applied to estimate prorated scores were derived from published regression equations for NIHTB-CB standard scores (Casaletto et al., 2015; HealthMeasures Help Desk, 2020b). As much ongoing research has been modified to facilitate physical distancing, this work helps to inform the future interpretation of data collected with the abbreviated NIHTB-CB protocol.

Methods

Data

We extracted participant-level NIHTB-CB data gathered under standard conditions by trained examiners as part of six previous or ongoing studies in individuals with neurological disease [history of stroke or mild traumatic brain injury (mTBI)] or psychosis (inpatients with treatment-resistant psychosis) and healthy controls (no history of neurological disease, learning disability, or active psychosis). See Supplementary material for details of respective studies.

For all data sets, the NIHTB-CB was administered on an iPad (Apple, California, USA), and Form A of the cognition battery was used. Participant demographic data were captured with written questionnaires. All participants were older than 18 years and provided written informed consent. The experimental protocols for the respective studies were approved previously by the University of British Columbia’s Clinical Research Ethics Board, and conformed to the Declaration of Helsinki.

Statistical Analysis

We chose to report standard scores corrected for age (mean = 100, standard deviation = 15) and not other demographic variables because education levels may not be equivalent across regions where our data were collected (Vancouver, Canada) and where the NIHTB-CB was normed (United States) (Chevalier et al., 2016). As well, several of our participants identified with race(s) that the NIHTB-CB race/ethnicity options failed to capture.

Prorated fluid (Equation 1) and total (Equation 2) cognition scores were derived from the regression equations provided with the NIHTB-CB (Casaletto et al., 2015; HealthMeasures Help Desk, 2020b). The prorated fluid cognition score included only the two fluid instruments (List Sorting Working Memory and Picture Sequence Memory) that can be administered remotely without the examinee having direct access to the tablet.

  (1) Prorated Fluid Cognition = 100 + 15 × [((Mean of List Sorting Working Memory and Picture Sequence Memory Age-corrected Scores) − 100.15) / 10.10]

  (2) Prorated Total Cognition = 100 + 15 × [((Mean of Age-corrected Prorated Fluid Composite Score and Crystallized Composite Score) − 100.02) / 12.93]
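Equations 1 and 2 can be written out as a short worked example. The Python sketch below is illustrative only: the function names and the sample participant are hypothetical, and the only values taken from the text are the constants in the two equations.

```python
from statistics import mean

# Constants taken from Equations 1 and 2 in the text.
FLUID_CENTER, FLUID_SCALE = 100.15, 10.10
TOTAL_CENTER, TOTAL_SCALE = 100.02, 12.93


def rescale(raw_mean, center, scale):
    """Re-express a mean of standard scores on the mean-100, SD-15 metric."""
    return 100 + 15 * (raw_mean - center) / scale


def prorated_fluid(list_sorting, picture_sequence):
    """Equation 1: prorated fluid cognition from the two remote fluid instruments."""
    return rescale(mean([list_sorting, picture_sequence]), FLUID_CENTER, FLUID_SCALE)


def prorated_total(prorated_fluid_score, crystallized_composite):
    """Equation 2: prorated total cognition from prorated fluid and crystallized scores."""
    return rescale(
        mean([prorated_fluid_score, crystallized_composite]), TOTAL_CENTER, TOTAL_SCALE
    )


# Hypothetical participant (illustrative age-corrected scores, not study data).
fluid = prorated_fluid(list_sorting=95.0, picture_sequence=105.0)
total = prorated_total(fluid, crystallized_composite=110.0)
print(round(fluid, 2), round(total, 2))
```

Both equations share the same form: the mean of the contributing scores is centered and scaled by the listed constants, then placed back on the standard-score metric.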

Data were separated into healthy controls and disease-specific groups (stroke, mTBI, and psychosis). Demographic data were compared between groups using one-way analysis of variance for continuous data and chi-square tests for categorical data.

Paired t tests were used to compare the standard and prorated fluid cognition scores within each group; data met assumptions of normality (Meyers et al., 2013). Prediction error was calculated as the difference between standard and prorated scores for each participant, and mean prediction error was calculated for each group. A prediction error of zero reflects equal standard and prorated scores. Intraclass correlation (ICC) values between standard and prorated fluid cognition scores were generated for each group using a two-way mixed-effects, absolute-agreement, multiple-measurements model (Koo & Li, 2016). Data met assumptions of normality and equality of variance for ICC analyses. All analyses were repeated for the prorated total cognition scores. We operationalized a clinically meaningful discrepancy as 0.5 standard deviations (or 7.5 standard score points), and calculated the frequency of participants with prorated–standard discrepancies exceeding this magnitude (Silverberg & Millis, 2009). Chi-square tests were used to compare, between groups, the observed frequencies of participants with clinically significant prediction errors (i.e., exceeding ±0.5 SD difference between standard and prorated scores). Data met the assumptions of chi-square testing.
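The prediction-error and discrepancy-threshold steps described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and sample score pairs are hypothetical, and the sign convention (positive error means the prorated score is higher, that is, an overestimate) is an assumption chosen to match how the results are reported.

```python
THRESHOLD = 7.5  # 0.5 SD on the mean-100, SD-15 standard-score metric


def prediction_error(standard, prorated):
    # Assumed sign convention: positive means the prorated score overestimates.
    return prorated - standard


def classify(error, threshold=THRESHOLD):
    """Label an error as 'overestimate', 'underestimate', or 'within' threshold."""
    if error > threshold:
        return "overestimate"
    if error < -threshold:
        return "underestimate"
    return "within"


def discrepancy_rates(pairs):
    """Return the proportion of (standard, prorated) pairs in each category."""
    counts = {"overestimate": 0, "underestimate": 0, "within": 0}
    for standard, prorated in pairs:
        counts[classify(prediction_error(standard, prorated))] += 1
    n = len(pairs)
    return {label: count / n for label, count in counts.items()}


# Hypothetical (standard, prorated) score pairs, not study data.
pairs = [(100, 112), (95, 96), (110, 101), (88, 97)]
rates = discrepancy_rates(pairs)
print(rates)
```

The proportion of participants with a clinically meaningful discrepancy is the sum of the overestimate and underestimate rates.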

Given the exploratory nature of the study, we did not correct for multiple comparisons. Significance was set a priori at 0.05. Statistical analyses were performed using IBM SPSS Statistics (Version 19.0; IBM Corp., Armonk, NY).
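The ICC form named in the statistical analysis (two-way model, absolute agreement, multiple measurements) can be computed from analysis-of-variance mean squares. The sketch below uses the formula given for that form by Koo & Li (2016); the function name and sample data are hypothetical, and it is not intended to reproduce SPSS output or its confidence intervals.

```python
def icc_absolute_agreement_average(scores):
    """ICC for a two-way model, absolute agreement, average of k measurements.

    `scores` holds one row per participant, each with k measurements
    (here k = 2: standard score, prorated score).
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into participants, measurements, and error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


# Hypothetical data: the second measurement is always 5 points higher.
shifted = [(90, 95), (100, 105), (110, 115)]
print(icc_absolute_agreement_average(shifted))
```

Because this form measures absolute agreement rather than consistency, a constant offset between the two scoring methods lowers the ICC even when participants are ranked identically.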

Results

Data were available for 245 participants: 77 (31.4%) healthy controls, 66 (26.9%) individuals with mTBI, 63 (25.7%) with a history of stroke, and 39 (15.9%) with active psychosis. Almost half (48.6%) were female, mean age was 41.8 years (SD = 11.9), and mean duration of education was 14.9 years (SD = 2.6). Subgroup characteristics are shown in Table A1 in the Appendix.

Group-level Comparisons

Overall, prorated fluid cognition scores were higher than standard fluid cognition scores (mean difference +4.5, SD = 14.3; p < 0.001). These differences were significant in the stroke and mTBI groups, but not in the healthy control or psychosis groups. As a result, overall prorated total cognition scores were also higher than standard total cognition scores (mean difference +2.7, SD = 8.3; p < 0.001). Again, these differences were significant only in the stroke and mTBI groups (see Table 1). Overall agreement between prorated and standard scores as per the ICC was moderate-to-good for fluid cognition and good-to-excellent for total cognition.
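As a rough consistency check, the paired t statistic can be recovered from the reported summary values using t = mean difference / (SD / sqrt(n)). The sketch below assumes that the reported SDs are the standard deviations of the paired differences and that all 245 participants contributed to each comparison; both are assumptions, and the inputs are rounded, so the results are approximate.

```python
from math import sqrt


def paired_t_from_summary(mean_diff, sd_diff, n):
    """Paired t statistic from the mean and SD of the paired differences."""
    return mean_diff / (sd_diff / sqrt(n))


N = 245  # participants reported in the Results

# Reported mean differences and SDs (prorated minus standard).
t_fluid = paired_t_from_summary(4.5, 14.3, N)
t_total = paired_t_from_summary(2.7, 8.3, N)
print(round(t_fluid, 2), round(t_total, 2))
```

With 244 degrees of freedom, t statistics of this size are in line with the reported p < 0.001 for both composites.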

Table 1. Standard and prorated age-corrected standard scores (mean, SD) for fluid cognition and total cognition in healthy participants and those with stroke, psychosis, and mTBI. ICCs (95% CIs) between standard and prorated scores are given

Individual-level Comparisons

Clinically significant fluid cognition prediction errors (greater than ±0.5 SD difference between standard and prorated scores) were present in 62.9% of participants; 42.9% were overestimated and 20.0% were underestimated. For total cognition, 40.4% of participants had a clinically significant prediction error; 28.6% were overestimated and 11.8% were underestimated (Figure 1). The psychosis group had the lowest percentage (59.0%) of fluid prediction errors, followed by the healthy control (60.0%), mTBI (65.2%), and stroke (69.8%) groups. These numerical differences did not reach statistical significance (p = 0.425). For total cognition, healthy controls had the lowest percentage (33.8%) of prediction errors greater than ±0.5 SD, followed by those with psychosis (35.9%), mTBI (42.4%), and stroke (49.2%). Again, these differences were not statistically significant (p = 0.275).

Fig. 1. Bland–Altman plot for fluid cognition (top row) and total cognition (bottom row) prediction errors for healthy controls, stroke, psychosis, and mTBI groups, including mean group difference (blue dotted line). Participant level data are represented by circles. Threshold for acceptable prediction error was set ±0.5 SD (red lines) from zero (green dotted line). A prediction error of zero indicates equal standard and prorated scores.

Discussion

The aim of this exploratory study was to assess the validity of a prorated score, based on a proposed abbreviated NIHTB-CB protocol, against the standard score from the full protocol (HealthMeasures Help Desk, 2020a). Particularly during COVID-19-related physical distancing measures, the potential advantage of an abbreviated protocol is that it can be administered remotely, without personnel alongside the examinee. Beyond the COVID-19 pandemic, advantages of a fully remote protocol could include greater participation by those with mobility restrictions or in isolated communities, and fewer losses to follow-up (Berge et al., 2016).

Overall, we found that prorated scoring for the abbreviated protocol overestimated fluid and total cognition standard scores. However, differences were noted between testing groups, with no group-level differences seen between prorated and standard scores in healthy individuals or in those with treatment-resistant psychosis.

It is uncertain whether these group-level differences reflect true effects of domain-specific deficits from lesional injuries in the stroke and mTBI groups, random error, or insufficient statistical power to detect differences between prorated and standard scores in the healthy control group or, in particular, the psychosis group, which had the fewest participants (McInnes et al., 2017; Nys et al., 2007; O’Brien et al., 2003). The instruments contributing to our prorated scores measure working memory and episodic memory, and do not capture processing speed, attention, or executive function (Mungas et al., 2014). It may be that anatomic lesions or functional deficits (e.g., frontal lobe injury, motor deficits, and fatigue) in the stroke and mTBI cohorts result in worse performance on executive function and timed tasks in particular, so that excluding the instruments assessing these domains leads prorated scores to overestimate standard scores. The data were collected as part of six separate studies, and unmeasured confounders specific to study conditions may also play a role.

Although exclusion of processing speed, attention, and executive function tests from prorated scores did not significantly affect the assessment of the healthy control and psychosis cohorts at the group level, we cannot confidently conclude that prorated scores are equivalent to standard scores in these groups. Amongst healthy controls, 60.0% of prorated fluid cognition scores were overestimated or underestimated by a clinically significant margin, and amongst psychosis patients, the rate was 59.0%. Given the substantial variability in participant-level performance, these two methods should not be considered equivalent when considering individual-level data.

Our study has limitations. Our findings are limited to healthy individuals and those with stroke, mTBI, or treatment-resistant psychosis. Future studies should explore whether there may be groups in which an abbreviated protocol may be appropriate. Additionally, we only reported age-corrected scores, which do not control for sex, education, and ethnicity of participants; these factors may influence NIHTB-CB performance (Casaletto et al., 2015).

At present, we are comparing only prorated versus standard scoring of in-person testing, in advance of considering entirely remote adaptations of the NIHTB-CB protocol. We have not prospectively validated an abbreviated remote protocol as we are limited by current physical distancing recommendations related to the COVID-19 pandemic.

In conclusion, an abbreviated NIHTB-CB protocol is a pragmatic solution in the context of physical distancing requirements, but does not constitute a valid replacement for the standard protocol. Our preliminary findings suggest that prorated scores excluding the Flanker Inhibitory Control and Attention, Dimensional Change Card Sort, and Pattern Comparison Processing Speed instruments may tend to overestimate Fluid Composite scores. Thus, a fully remote version of the NIHTB-CB should include adapted versions of the timed instruments. We provide empirical evidence in support of newly updated guidelines by the NIHTB developers, which now state that prorated scores may not be comparable to standard scores (Salesforce, 2020). Still, remote administration of the current abbreviated protocol warrants further validation of the nontimed instruments. These individual instruments, administered remotely, may still benefit continuity of research measuring crystallized cognition and working and episodic memory.

Acknowledgements

The authors thank Leah Kuzmuk, Halina Deptuck, Zoe O’Neill, Hadley Pearce, Tasha Klotz, and Hiresh Gindwani for their assistance in data collection. This article was discussed with the Department of Medical Social Sciences at Northwestern University, which governs the scientific activity of NIH Toolbox, prior to submission. At this time, neither the authors’ group nor theirs would recommend an abbreviated NIHTB-CB protocol with prorated scoring as a replacement for the standard protocol.

Financial support

NDS reports salary support from the Michael Smith Foundation for Health Research. TSF is supported by a Heart and Stroke Foundation of Canada National New Investigator Award, a Michael Smith Health Professional Investigator Award, and a Vancouver Coastal Health Research Institute Clinician-Scientist Award.

Conflict of interest

IJT has received consulting fees or sat on advisory boards for Lundbeck Canada, Sumitomo Dainippon, and Community Living British Columbia (CLBC). TSF receives study medication from Bayer Canada. The other authors report no relevant conflicts.

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/S1355617720001010

References


Berge, E., Stapf, C., Al-Shahi Salman, R., Ford, G.A., Sandercock, P., van der Worp, H.B., … ESO Trials Network Committee (2016). Methods to improve patient recruitment and retention in stroke trials. International Journal of Stroke: Official Journal of the International Stroke Society, 11(6), 663–676.
Carlozzi, N.E., Goodnight, S., Casaletto, K.B., Goldsmith, A., Heaton, R.K., Wong, A.W.K., … Tulsky, D.S. (2017). Validation of the NIH toolbox in individuals with neurologic disorders. Archives of Clinical Neuropsychology: The Official Journal of the National Academy of Neuropsychologists, 32(5), 555–573.
Carlozzi, N.E., Tulsky, D.S., Wolf, T.J., Goodnight, S., Heaton, R.K., Casaletto, K.B., … Heinemann, A.W. (2017). Construct validity of the NIH toolbox cognition battery in individuals with stroke. Rehabilitation Psychology, 62(4), 443–454.
Casaletto, K.B., Umlauf, A., Beaumont, J., Gershon, R., Slotkin, J., … Heaton, R.K. (2015). Demographically corrected normative standards for the English version of the NIH toolbox cognition battery. Journal of the International Neuropsychological Society: JINS, 21(5), 378–391.
Chevalier, T.M., Stewart, G., Nelson, M., McInerney, R.J., & Brodie, N. (2016). Impaired or not impaired, that is the question: navigating the challenges associated with using Canadian normative data in a comprehensive test battery that contains American tests. Archives of Clinical Neuropsychology: The Official Journal of the National Academy of Neuropsychologists, 31(5), 446–455.
Gostin, L.O., & Wiley, L.F. (2020). Governmental public health powers during the COVID-19 pandemic: stay-at-home orders, business closures, and travel restrictions. JAMA: The Journal of the American Medical Association, 323(21), 2137–2138. https://doi.org/10.1001/jama.2020.5460
HealthMeasures Help Desk. (2020a). Retrieved May 5, 2020, from https://nihtoolbox.force.com/s/article/Coronavirus-COVID-19
HealthMeasures Help Desk. (2020b). Retrieved May 2, 2020, from https://nihtoolbox.force.com/s/article/calculating-scores
Koo, T.K., & Li, M.Y. (2016). A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. Journal of Chiropractic Medicine, 15(2), 155–163. https://doi.org/10.1016/j.jcm.2016.02.012
McInnes, K., Friesen, C.L., MacKenzie, D.E., Westwood, D.A., & Boe, S.G. (2017). Mild Traumatic Brain Injury (mTBI) and chronic cognitive impairment: a scoping review. PLoS One, 12(4), e0174847. https://doi.org/10.1371/journal.pone.0174847
Meyers, J.E., Zellinger, M.M., Kockler, T., Wagner, M., & Miller, R.M. (2013). A validated seven-subtest short form for the WAIS-IV. Applied Neuropsychology. Adult, 20(4), 249–256.
Mungas, D., Heaton, R., Tulsky, D., Zelazo, P.D., Slotkin, J., Blitz, D., … Gershon, R. (2014). Factor structure, convergent validity, and discriminant validity of the NIH Toolbox Cognitive Health Battery (NIHTB-CHB) in adults. Journal of the International Neuropsychological Society: JINS, 20(6), 579–587.
Nys, G.M.S., van Zandvoort, M.J.E., de Kort, P.L.M., Jansen, B.P.W., de Haan, E.H.F., & Kappelle, L.J. (2007). Cognitive disorders in acute stroke: prevalence and clinical determinants. Cerebrovascular Diseases, 23(5–6), 408–416.
O’Brien, J.T., Erkinjuntti, T., Reisberg, B., Roman, G., Sawada, T., Pantoni, L., … DeKosky, S.T. (2003). Vascular cognitive impairment. Lancet Neurology, 2(2), 89–98.
Rebchuk, A.D., Deptuck, H.M., O’Neill, Z.R., Fawcett, D.S., Silverberg, N.D., & Field, T.S. (2019). Validation of a novel telehealth administration protocol for the NIH toolbox-cognition battery. Telemedicine Journal and E-Health: The Official Journal of the American Telemedicine Association, 25(3), 237–242.
Sheffield, J.M., Karcher, N.R., & Barch, D.M. (2018). Cognitive deficits in psychotic disorders: a lifespan perspective. Neuropsychology Review, 28(4), 509–533.
Silverberg, N.D., & Millis, S.R. (2009). Impairment versus deficiency in neuropsychological assessment: Implications for ecological validity. Journal of the International Neuropsychological Society: JINS, 15(1), 94–102.
Tang, E.Y., Amiesimaka, O., Harrison, S.L., Green, E., Price, C., Robinson, L., … Stephan, B.C. (2018). Longitudinal effect of stroke on cognition: a systematic review. Journal of the American Heart Association, 7(2), e006443. https://doi.org/10.1161/JAHA.117.006443
Weintraub, S., Dikmen, S.S., Heaton, R.K., Tulsky, D.S., Zelazo, P.D., Bauer, P.J., … Gershon, R.C. (2013). Cognition assessment using the NIH Toolbox. Neurology, 80(11 Suppl 3), S54–S64.
Weintraub, S., Dikmen, S.S., Heaton, R.K., Tulsky, D.S., Zelazo, P.D., Slotkin, J., … Gershon, R. (2014). The cognition battery of the NIH toolbox for assessment of neurological and behavioral function: validation in an adult sample. Journal of the International Neuropsychological Society: JINS, 20(6), 567–578.