Linking Rivermead Post Concussion Symptoms Questionnaire (RPQ) and Sport Concussion Assessment Tool (SCAT) scores with item response theory

Mary U. Simons; Lindsay D. Nelson; Michael A. McCrea; Steve Balsis; James B. Hoelzle; Brooke E. Magnus

doi:10.1017/S1355617722000807

Published online by Cambridge University Press: 03 November 2022

Mary U. Simons*: Affiliation:
Department of Psychology, Marquette University, Milwaukee, WI, USA
Lindsay D. Nelson: Affiliation:
Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Michael A. McCrea: Affiliation:
Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Steve Balsis: Affiliation:
Department of Psychology, University of Massachusetts Lowell, Lowell, MA, USA
James B. Hoelzle: Affiliation:
Department of Psychology, Marquette University, Milwaukee, WI, USA
Brooke E. Magnus: Affiliation:
Department of Psychology and Neuroscience, Boston College, Chestnut Hill, MA, USA
*: Corresponding author: Mary Simons, email: mary.simons@marquette.edu

Abstract

Objective:

Despite the public health burden of traumatic brain injury (TBI) across broader society, most TBI studies have been isolated to a distinct subpopulation. The TBI research literature is fragmented further because often studies of distinct populations have used different assessment procedures and instruments. Addressing calls to harmonize the literature will require tools to link data collected from different instruments that measure the same construct, such as civilian mild traumatic brain injury (mTBI) and sports concussion symptom inventories.

Method:

We used item response theory (IRT) to link scores from the Rivermead Post Concussion Symptoms Questionnaire (RPQ) and the Sport Concussion Assessment Tool (SCAT) symptom checklist, widely used instruments for assessing civilian and sport-related mTBI symptoms, respectively. The sample comprised n = 397 participants (patients with sport-related concussion or civilian mTBI, orthopedic injury controls, and non-injured athlete controls) who completed the SCAT and/or RPQ.

Results:

The results of several analyses supported sufficient unidimensionality to treat the RPQ + SCAT combined item set as measuring a single construct. Fixed-parameter IRT was used to create a cross-walk table that maps RPQ total scores to SCAT symptom severity scores. Linked and observed scores were highly correlated (r = .92). Standard errors of the IRT scores were slightly higher for civilian mTBI patients and orthopedic controls, particularly for RPQ scores linked from the SCAT.

Conclusion:

By linking the RPQ to the SCAT we facilitated efforts to effectively combine samples and harmonize data relating to mTBI.

Keywords

traumatic brain injury; head trauma; self-report; psychometrics; concussion; neuropsychological testing

Type: Research Article
Information: Journal of the International Neuropsychological Society, Volume 29, Issue 7, August 2023, pp. 696–703

DOI: https://doi.org/10.1017/S1355617722000807
Copyright: Copyright © INS. Published by Cambridge University Press, 2022

Traumatic brain injury (TBI) causes considerable morbidity throughout the world, with a global incidence of 369 per 100,000 (Badhiwala et al., 2019). Worldwide, an estimated 55 million people are living with TBI-related disability (James et al., 2019). Despite the universal reach of this injury in civilian, military, and other (e.g., athlete) populations, TBI research has been historically conducted in silos focused on specific subpopulations. Variability in study methodology across subpopulations of TBI, including differing outcome assessment batteries, is a barrier to comparing findings across studies and to combining datasets to accelerate discovery. Although TBI common data elements (CDEs) are now available (Hicks et al., 2013; Thurmond et al., 2010), they are neither universally required nor applicable to all TBI subpopulations. Further, in some domains, little is known about how two separate CDE measures intended to quantify the same or very similar constructs (e.g., TBI symptom scales) compare to each other. Thus, variability in study methods limits interpretation of findings across studies, particularly across studies of different subpopulations of TBI (e.g., sport-related concussion, civilian mild traumatic brain injury [mTBI]) which have different CDEs (Broglio et al., 2018; Hicks et al., 2013).

Just as the need for consistent data collection practices is being recognized, so is the potential utility of sharing and combining data across investigators. This is evidenced by international collaborative TBI research initiatives (e.g., the International Initiative for TBI Research [InTBIR]) as well as programs that enable or require data sharing, such as the Federal Interagency Traumatic Brain Injury Research (FITBIR) informatics system and the FAIR (findable, accessible, interoperable, and reusable) data principles being adopted by the National Institutes of Health and others (see also Chou et al., 2021; NIH, 2018; Thompson et al., 2015). Advancing the goals of these initiatives will require developing methods to harmonize data collected from different instruments evaluating the same constructs. The present study used a modern psychometric approach to link the most commonly used TBI symptom inventories in civilian and sport-related TBI research: the Rivermead Post Concussion Symptoms Questionnaire (RPQ; King et al., 1995) and the Sport Concussion Assessment Tool (SCAT; McCrory et al., 2013, 2017).

TBI symptom inventories are frequently used to assess TBI outcomes in both clinical and research settings (McCrory et al., 2017). TBI symptoms measured soon after injury are robust predictors of later clinical outcomes (Mikolić et al., 2021; Silverberg et al., 2015) and, in the large mTBI population, are more prevalent and persisting than other TBI sequelae (e.g., functional limitations, cognitive impairment; Dikmen et al., 2017). Furthermore, symptoms associated with TBI (e.g., depression, anxiety) appear to directly drive long-term disability (Zahniser et al., 2019). Despite considerable overlap in item content, the RPQ and SCAT are typically used in the distinct subpopulations of TBI for which they were developed (civilian and athlete populations, respectively). Therefore, linking RPQ and SCAT scores in a mixed civilian and sport population will enable researchers to combine data across these diverse populations to fuel studies requiring larger sample sizes and to improve understanding of how TBI operates in these two subpopulations.

A variety of statistical methods exist for linking measures, from simpler approaches such as linear equating with observed scores, to more complex approaches such as item response theory (IRT; Fayers & Hays, 2014). Unlike regression-based approaches, which do not account for measurement error and may result in biased mappings (Lu et al., 2013), IRT approaches calibrate each scale onto the latent variable, thus accounting for measurement error and reducing potential bias (e.g., see Kaat et al., 2019). Therefore, the present study used fixed-parameter calibration IRT to allow for a single item calibration for all items, offering a more rigorously established cross-walk between the RPQ and SCAT (Choi et al., 2014).

Method

Participants/study design

Data (n = 397) were obtained through two prior prospective studies of mTBI conducted in Wisconsin. The research was completed in accordance with the Helsinki Declaration and all testing procedures were approved by the Medical College of Wisconsin Institutional Review Board. Of the n = 397 participants, 198 completed the RPQ and SCAT, 198 participants completed only the SCAT, and 1 participant with sport-related mTBI did not complete either. The 396 participants with at least one inventory completed were included in analyses.

Civilian trauma sample

The civilian trauma sample (n = 154; n = 75 mTBI and n = 79 orthopedic controls) was recruited from our institution’s level 1 trauma center inpatient trauma unit between April 2015 and March 2016 (details have been reported previously in Guzowski et al., 2021). Participants completed a bedside assessment at enrollment (median 2 days post-injury), which included the SCAT and RPQ. The study used the American Congress of Rehabilitation Medicine’s (ACRM) definition of mTBI: “A traumatically-induced physiological disruption of brain function, manifested by at least one of the following: loss of consciousness (LOC; <30 min), memory loss for the events before or after (post-traumatic amnesia, PTA) injury (<24 hr PTA), other evidence of alteration of mental state immediately post-injury; or documentation of focal neurologic deficit after trauma; as well as initial Glasgow Coma Scale (GCS) score > 13.” All civilian mTBI participants met the athlete study definition of mTBI; civilian acute characteristics (e.g., posttraumatic amnesia, LOC) leaned toward more severe mTBI than experienced by athletes. Inclusion criteria for all participants were 18 years of age or older, English speaking, and admitted to the trauma service within the past 10 days for an eligible mTBI or other traumatic injury. Exclusion criteria for all participants were being in police custody or unable to independently provide informed consent. Orthopedic controls were required to have no evidence or report of head trauma.

Sport-related mTBI sample

The current project includes data from a longitudinal study of sport-related mTBI (concussion) in high school and collegiate athletes (details have been reported previously in Guzowski et al., 2021). Male football players with concussion (n = 105; 1 dropped due to missing outcome data) and non-injured athlete controls (n = 138) were assessed during the sport season with the SCAT symptom checklist and RPQ. (Post-concussion assessments were performed at 24–48 hr post-injury.) Adult athletes and parents of minor athletes completed written informed consent prior to assessment.

mTBI was diagnosed by licensed athletic trainers; injuries met the definition of concussion adopted from the Centers for Disease Control and Prevention HEADS UP educational initiative: “An injury resulting from a forceful bump, blow, or jolt to the head that results in rapid movement of the head and causes a change in the athlete’s behavior, thinking, physical functioning, or the following symptoms: headache, nausea, blurred vision, memory difficulty, and difficulty concentrating.” Inclusion criteria at pre-season baseline enrollment were participating in football, being 14 years of age or older, speaking English, and being capable of granting informed consent or assent. Exclusion criteria for post-injury follow-up (or follow-up as a non-injured control) were contraindications to completing additional procedures for the parent study (i.e., neuroimaging, blood draws), current primary psychiatric disorder, current use of prescribed narcotics, history of or suspicion for significant neurological conditions (e.g., epilepsy, stroke, dementia), history of moderate or severe TBI, and history of concussion within the 6 months prior to the pre-season baseline exam.

Measures

Rivermead post concussion symptoms questionnaire (RPQ)

The RPQ is a self-report measure comprising 16 mTBI-related symptoms rated on a 5-point Likert scale (0 = not experienced at all, 4 = a severe problem). Participants are asked to report the degree to which symptoms have been problematic over the past 24 hr in comparison to their pre-injury symptom levels. As recommended by the test authors, scores of “1” (no more of a problem than pre-injury) were treated as “0” responses when computing total scores (range 0–64; e.g., see King et al., 1995).
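The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' code; the function name `rpq_total` and the example ratings are hypothetical.

```python
def rpq_total(ratings):
    """Sum 16 RPQ item ratings (0-4), counting a rating of 1 as 0.

    A rating of 1 means "no more of a problem than pre-injury",
    so it contributes nothing to the total (range 0-64).
    """
    if len(ratings) != 16:
        raise ValueError("expected 16 RPQ item ratings")
    if any(r not in (0, 1, 2, 3, 4) for r in ratings):
        raise ValueError("RPQ ratings must be integers from 0 to 4")
    return sum(0 if r == 1 else r for r in ratings)


# Hypothetical respondent (illustrative values only).
example = [0, 1, 2, 3, 4, 1, 1, 0, 2, 2, 0, 0, 3, 1, 4, 0]
print(rpq_total(example))  # 20
```

Because ratings of 1 are recoded, a respondent who rates every item as 1 receives the same total (0) as one who rates every item as 0.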

(Full RPQ: http://www.tbi-impact.org/cde/mod_templates/12_F_06_Rivermead.pdf)

Sport Concussion Assessment Tool (SCAT) symptom checklist

The SCAT (version 3/5) symptom checklist comprises 22 symptoms rated based on their current severity on a 7-point Likert scale (0 = none, 6 = severe). Total symptom severity scores range from 0 to 132 points (e.g., see Echemendia et al., 2017).

(Full SCAT: https://bjsm.bmj.com/content/bjsports/early/2017/04/26/bjsports-2017-097506SCAT5.full.pdf)

Data analysis

Demographic information, descriptive statistics, and the internal consistency of the combined RPQ + SCAT were calculated using the psych package in R (Revelle & Revelle, 2015). In addition, we evaluated several linking assumptions. One major assumption is that the measures being linked inherently assess the same construct (construct congruence; Dorans, 2007). To check this assumption, we examined the RPQ and SCAT for similar item content and response format. If two instruments measure the same construct, their scores should be highly correlated, both at the summed score level and at the item level. Bivariate correlations are a useful first step for establishing relationships at the summed score level. At the item level, construct congruence can be evaluated using factor analysis, which provides evidence as to whether an instrument (in our case the combined set of RPQ and SCAT items) is sufficiently unidimensional to consider its content to measure the same construct. If there is a strong general factor of the combined items, then there is evidence for sufficient unidimensionality, which indicates that a unidimensional IRT model will adequately reflect the item parameters and latent trait estimates of the general factor underlying the item responses (Reise et al., 2015). Our previous factor modeling studies on the SCAT and the RPQ have found both instruments to be sufficiently unidimensional as standalone instruments (Agtarap et al., 2020; Brett et al., 2020; Nelson et al., 2018).

To investigate unidimensionality of the combined item set, we conducted exploratory factor analyses (EFA) and confirmatory factor analyses (CFA) of the 38 items in Mplus (7th edition; Muthén & Muthén, 1998–2015) using the weighted least squares mean and variance adjusted (WLSMV) estimator, which is appropriate for categorical data. EFA was conducted to confirm that the ratio of the first to second eigenvalue indicated sufficient unidimensionality (commonly defined as a ratio > 4; Reeve et al., 2007). Then, we evaluated the fit of a one-factor CFA model on the combined measure, considering adequate fit based on several conventions: comparative fit index (CFI) > .90, Tucker Lewis index (TLI) > .90, and root-mean-square error of approximation (RMSEA) < .10 (Hopwood & Donnellan, 2010; Lance et al., 2006). Although these cutoffs are more lenient than some have recommended for declaring good fit in factor modeling studies, they are generally considered adequate to establish sufficient unidimensionality for IRT modeling (e.g., see Choi et al., 2021). In factor modeling research, others have advocated for applying more lenient cutoffs and considering other evidence of model adequacy beyond formal fit statistics (see Hopwood & Donnellan, 2010). Finally, we estimated two bifactor models to allow for potential multidimensionality of the combined measure and demonstrate evidence of an overarching general factor between the two measures. The first bifactor model contained the RPQ and SCAT (RPQ-SCAT bifactor) as the specific factors, whereas the second (symptom subgroup bifactor) model grouped similar items from the RPQ and SCAT into established subgroups (emotional, cognitive, torpor, vision, sensory sensitivity, and headache) based on previously published bifactor analyses of the RPQ and SCAT, separately (e.g., see Agtarap et al., 2020; Nelson et al., 2018). We used hierarchical omega (OmegaH) to evaluate the bifactor models; it estimates the proportion of variance in total scores that can be attributed to the general factor (TBI-related symptom burden), as opposed to the specific factors (Rodriguez et al., 2016). Typically, hierarchical omega values of .8 or greater suggest sufficient unidimensionality because the variance in total scores is largely attributable to the general factor. This essentially means the secondary factors have no meaningful influence on unidimensional IRT parameter estimates (Rodriguez et al., 2016). Similar values can also be calculated for specific factors, which are expected to be low in the case of sufficient unidimensionality.
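To make the hierarchical omega criterion concrete, the sketch below computes OmegaH from standardized bifactor loadings using the usual formula: the squared sum of general-factor loadings divided by the model-implied total-score variance. It assumes orthogonal factors and standardized loadings. The function name and all loading values are hypothetical and are not taken from the study.

```python
def omega_hierarchical(general, specific):
    """OmegaH from standardized bifactor loadings (orthogonal factors assumed).

    general:  list of general-factor loadings, one per item
    specific: list of dicts, one per specific factor, mapping
              item index -> loading on that specific factor
    """
    general_part = sum(general) ** 2
    specific_part = sum(sum(group.values()) ** 2 for group in specific)
    # Communality of each item = sum of its squared loadings.
    communality = [g * g for g in general]
    for group in specific:
        for item, loading in group.items():
            communality[item] += loading * loading
    unique_part = sum(1.0 - h2 for h2 in communality)
    return general_part / (general_part + specific_part + unique_part)


# Hypothetical 6-item example with two specific factors.
general = [0.8, 0.8, 0.7, 0.7, 0.6, 0.6]
specific = [{0: 0.3, 1: 0.3, 2: 0.3}, {3: 0.2, 4: 0.2, 5: 0.2}]
omega_h = omega_hierarchical(general, specific)
print(round(omega_h, 3))
```

In this illustrative case the value exceeds the .8 convention, because the specific-factor loadings are small relative to the general-factor loadings.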

IRT linking methods also assume population invariance, meaning score differences between subgroups for one measure are similar to the score differences between the same subgroups on the second measure. The standardized root-mean-square deviation (RMSD), which is a weighted difference between the standardized difference of subpopulations (e.g., mTBI and control) across two measures, provides one method of quantifying these differences (Dorans & Holland, 2000). According to Dorans and Holland (2000), population invariance can be assumed for RMSD values of less than .08. We computed the RMSD using the SEAsic package in R to evaluate population invariance by subgroups with and without mTBI.
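As a rough illustration of the index, the sketch below evaluates an RMSD at a single source-scale score under one common formulation attributed to Dorans and Holland (2000): the weighted root-mean-square difference between subgroup-specific linked scores and the full-sample linked score, divided by the standard deviation of the target scale. This is a hand-rolled sketch, not the SEAsic implementation, and every number in it is hypothetical.

```python
import math


def rmsd_at_score(subgroup_links, weights, total_link, sd_target):
    """Standardized RMSD at one source-scale score.

    subgroup_links: linked target score obtained within each subgroup
    weights:        subgroup proportions (must sum to 1)
    total_link:     linked target score obtained in the full sample
    sd_target:      SD of target-scale scores in the full sample
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    squared = sum(w * (link - total_link) ** 2
                  for w, link in zip(weights, subgroup_links))
    return math.sqrt(squared) / sd_target


# Hypothetical values: injured vs. control linking at one score point.
value = rmsd_at_score([21.0, 19.0], [0.45, 0.55], 20.0, 15.0)
print(value < 0.08)  # True
```

In practice the index is evaluated across the score range (or summarized over it), rather than at a single point as shown here.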

We used a fixed-parameter IRT graded response model (GRM) via the PROsetta package in R (Choi et al., 2021) to establish the parameters for the RPQ and SCAT. We used the GRM because it can accommodate scales with polytomous, ordinal responses. With this approach, a set of parameters is generated for each item that estimates the extent to which the item indicates the latent variable (i.e., TBI symptom-related burden). Item parameters include item thresholds (also called b parameters or item difficulty parameters) and item discriminations (also called a parameters; Thomas, 2011). In the GRM, an item threshold indicates the level of the latent trait needed to have a 50% probability of responding above a given response option on an ordinal scale (see b1–b6 parameters in Supplemental Table 1; Thomas, 2011). An item discrimination parameter reflects the strength of the relationship between an item and the latent trait. Taken together, these item parameters define the likelihood an item is endorsed at all levels of the latent continuum. Finally, by charting these likelihoods as described in our previous work (Balsis et al., 2015), we were able to link the raw scores on these two important instruments.
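The roles of the a and b parameters can be illustrated with a small sketch of the GRM response function. It assumes the standard logistic form of the model with ordered thresholds; the function names and parameter values are hypothetical and do not come from the Supplement.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Probability of each response category (0..K) under the GRM.

    theta:      latent trait level
    a:          item discrimination
    thresholds: ordered b parameters b1 < b2 < ... < bK
    """
    # Cumulative probabilities of responding in category k or higher,
    # padded with 1 (category 0 or higher) and 0 (above the top category).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


def expected_item_score(theta, a, thresholds):
    """Model-implied mean item score at a given trait level."""
    probs = grm_category_probs(theta, a, thresholds)
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical 4-category item.
probs = grm_category_probs(theta=0.5, a=2.0, thresholds=[-0.5, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

At `theta = 0.5`, which equals the second threshold, the probability of responding in category 2 or higher is exactly .50, matching the interpretation of the b parameters given above. A larger `a` makes the category probabilities change more sharply around each threshold.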

Because a substantial proportion of participants completed both the RPQ and SCAT, we employed fixed-parameter IRT to link the two measures. This conceptually and computationally simple method involves an initial calibration of one measure (considered the “anchor” measure) before combining all items into a single measure. In a second calibration, the item parameters for the anchor are fixed at their initial calibration values, while the parameters for the other measure are freely estimated (Choi et al., 2014). This yields parameters for the non-anchor (linked) measure that are on the same metric as the anchor measure. We treated the SCAT as the anchor, which allows us to use responses to the RPQ items to determine what a person would have scored on the SCAT. As a sensitivity analysis, we ran a second fixed-parameter IRT calibration using the RPQ as the anchor and confirmed that it produced comparable results. Finally, after linking the instruments, we compared the precision of IRT-based estimates of symptom severity for SCAT and RPQ linked scores by plotting the standard error of IRT estimates across the latent continuum of symptom severity, stratified by subgroup. Finding discrepancies in the accuracy of linked score estimates for sport and civilian populations may have implications for best practices when linking instruments to combine sport and civilian datasets.

Results

Demographic information for the civilian and sport samples is presented in Table 1.

Table 1. Sample demographics and injury characteristics

Note. mTBI, mild traumatic brain injury.

^a Yes and suspected categories collapsed; no and unknown categories collapsed.

Overall, the sample was predominantly male (83.8%) and white (74.7%), and ranged widely in age (14–90, M [SD] = 30.2 [19.2]). As a measure of population invariance, the RMSD was calculated for head-injured (combining civilian and athlete mTBI) versus control groups (combining orthopedic injury and non-injured controls). The RMSD value was 8%, suggesting population invariance with respect to injury status.

An overview of item content (see Supplemental Tables) revealed substantial overlap, including many identical or nearly identical items spanning somatic (e.g., headaches, photosensitivity, phonosensitivity), cognitive (e.g., concentration and memory difficulties), and emotional complaints (e.g., irritability, low mood). The results of several analyses supported sufficient unidimensionality to treat the RPQ + SCAT combined item set as measuring a single construct. First, internal consistency reliability (coefficient alpha) was high (.94) for the RPQ + SCAT combined item set. Second, the first:second eigenvalue ratio from an EFA of the combined item set was consistent with sufficient unidimensionality (Eigenvalues 1–4 were 24.46, 2.65, 1.65, and 1.38, respectively). The one-factor EFA and CFA demonstrated good fit (χ²[665] = 2,441, p < .001, CFI = .94, TLI = .93, RMSEA = .082). The RPQ-SCAT bifactor CFA and symptom subgroup bifactor CFA offered modestly improved fit over the unidimensional model (RPQ-SCAT bifactor model χ²[627] = 2,055.15, p < .001, CFI = .95, TLI = .94, RMSEA = .076; symptom subgroup bifactor model χ²[638] = 1,399.98, p < .001, CFI = .97, TLI = .97, RMSEA = .054). However, omega values for the bifactor models indicated that the general factor explains the vast majority of variance in total scores for the combined instrument (99% and 95% for the RPQ-SCAT and symptom subgroup bifactor models, respectively). These results corroborate our previous work that found sufficient unidimensionality of the construct as indexed by these inventories (see Agtarap et al., 2021 [RPQ]; Brett et al., 2020 [SCAT]; and Nelson et al., 2018 [SCAT]), providing further evidence that the RPQ + SCAT combined instrument meets criteria for sufficient unidimensionality.
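The reported eigenvalues and degrees of freedom can be checked with simple arithmetic. The eigenvalues are taken from the text. The degrees-of-freedom accounting is an assumption about how the models were specified (one loading per item on each factor it belongs to, factor variances fixed, orthogonal factors, thresholds saturated), not a description taken from the authors' model syntax.

```python
# Eigenvalues reported in the text for the combined 38-item set.
eigenvalues = [24.46, 2.65, 1.65, 1.38]
ratio = eigenvalues[0] / eigenvalues[1]

n_items = 38
# Unique inter-item correlations available to fit.
n_correlations = n_items * (n_items - 1) // 2
# One-factor model: one free loading per item.
df_one_factor = n_correlations - n_items
# RPQ-SCAT bifactor: every item gains one specific-factor loading.
df_rpq_scat_bifactor = df_one_factor - n_items

print(round(ratio, 2), df_one_factor, df_rpq_scat_bifactor)  # 9.23 665 627
```

The first-to-second eigenvalue ratio of about 9.2 is well above the threshold of 4 cited in the Method section, and the computed degrees of freedom agree with the reported values of 665 and 627 for the one-factor and RPQ-SCAT bifactor models.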

Table 2 presents the cross-walk table of SCAT and RPQ scores derived from the fixed-parameter calibration. Figure 1 illustrates how the model enables linking the SCAT and RPQ according to where their scores fall on the latent continuum of overall symptom severity. Item parameters for the IRT model of the combined SCAT + RPQ item set can be found in the Supplement.

Table 2. Linked correspondence between RPQ and SCAT scores

Note. Scores linked through the latent symptom severity dimension (theta) using item response theory. The SCAT score that most closely corresponds with the RPQ score is presented first. Because the SCAT encompasses a larger range than the RPQ, the SCAT scores which correspond to one RPQ score are then presented in parentheses as a range.

Figure 1. Correspondence between RPQ total scores and SCAT symptom severity scores, linked on the latent dimension of mTBI symptom severity. Note. This figure demonstrates the correspondence between RPQ total scores and SCAT severity scores. The x-axis depicts the latent dimension of mTBI symptom severity, along which individuals vary. The y-axis provides the expected total score for RPQ (solid line) and SCAT (dashed line) at each level of the latent dimension. For example, an individual with a latent symptom severity level of 1.55 would receive an RPQ score of 39 and a SCAT score of 76.
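The logic of reading a linked score off the two curves can be sketched as follows: find the latent severity at which the expected score on one instrument equals the observed total, then read the expected score on the other instrument at that severity. This is a simplified illustration under stated assumptions, not the procedure implemented in PROsetta, which may use a different scoring algorithm. All item parameters are hypothetical, and the sketch scores each item 0..K, ignoring the RPQ recoding of 1 to 0.

```python
import math


def expected_total(theta, items):
    """Expected summed score under the GRM.

    items: list of (a, [b1, ..., bK]) tuples. For an item scored 0..K,
    the expected score equals the sum of P(response >= k) over k = 1..K.
    """
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b)))
               for a, thresholds in items for b in thresholds)


def theta_for_score(score, items, lo=-6.0, hi=6.0, tol=1e-8):
    """Bisection search for the theta whose expected total equals `score`."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_total(mid, items) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def crosswalk(source_items, target_items, source_scores):
    """Map each source total to the nearest expected target total."""
    return {s: round(expected_total(theta_for_score(s, source_items), target_items))
            for s in source_scores}


# Hypothetical two-item "instruments" (0-4 and 0-6 response scales).
source = [(1.8, [-0.5, 0.3, 1.0, 1.8]), (2.2, [-0.2, 0.6, 1.3, 2.1])]
target = [(2.0, [-0.8, -0.2, 0.4, 1.0, 1.6, 2.2]),
          (1.6, [-0.6, 0.0, 0.6, 1.2, 1.8, 2.4])]
table = crosswalk(source, target, range(1, 8))
print(table)
```

Only interior scores are mapped here, because the minimum and maximum totals correspond to unbounded values of theta on the expected-score curve.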

Figure 2 depicts a scatterplot showing the relationship between SCAT scores estimated by the RPQ and the sample’s observed SCAT scores. Overall, observed SCAT scores and linked (from RPQ) SCAT scores were highly correlated (r = .92, p < .001), which supports the use of the crosswalk table. The mean difference between linked and observed SCAT scores was -0.74 with a standard deviation of 10.5.
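The agreement statistics reported above (correlation, mean difference, and standard deviation of the differences) can be computed as in the following sketch. The score vectors are made up for illustration and are unrelated to the study data.

```python
import math
from statistics import mean, stdev


def agreement(linked, observed):
    """Pearson r plus mean and SD of (linked - observed) differences."""
    m_l, m_o = mean(linked), mean(observed)
    cov = sum((l - m_l) * (o - m_o) for l, o in zip(linked, observed))
    var_l = sum((l - m_l) ** 2 for l in linked)
    var_o = sum((o - m_o) ** 2 for o in observed)
    r = cov / math.sqrt(var_l * var_o)
    diffs = [l - o for l, o in zip(linked, observed)]
    return r, mean(diffs), stdev(diffs)


# Hypothetical linked and observed SCAT scores for six people.
linked = [5, 12, 20, 33, 41, 58]
observed = [4, 15, 18, 30, 45, 60]
r, mean_diff, sd_diff = agreement(linked, observed)
print(round(r, 3), mean_diff, round(sd_diff, 2))
```

A high correlation alone does not rule out systematic bias, which is why the mean difference is reported alongside it; the standard deviation of the differences indicates how far an individual's linked score may fall from the observed score.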

Figure 2. Scatterplot showing the relationship between observed and linked (from RPQ) SCAT scores. Note. This scatterplot demonstrates the relationship between observed (y-axis) and linked (x-axis) SCAT scores. The observed scores are the scores measured directly using the SCAT. The linked scores are SCAT scores that are predicted from the RPQ. The black line is the regression line representing the relationship between the observed and linked SCAT scores, the dotted lines are the confidence intervals, and the dots are the data points. There was a high correlation (r = .92) between observed and linked SCAT scores, supporting the validity of the cross-walk table provided in Table 2.

Lastly, Figure 3 shows the relationship between IRT scale scores (interpreted on a Z-score metric with a mean of 0 and standard deviation of 1) and standard errors for the linked versions of the SCAT and RPQ, respectively, for each subgroup. For both measures, civilian mTBI and sport-related concussion patients show the widest range of scores (x-axis); non-injured athlete controls show the narrowest range, with most scores falling at or below average. Standard errors tend to be slightly higher for civilian mTBI patients and orthopedic controls, particularly for RPQ scores linked from the SCAT.
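The link between item parameters and the standard errors plotted in Figure 3 can be sketched with the GRM information function. The sketch uses the asymptotic relation SE = 1 / sqrt(test information), which applies to maximum likelihood scoring; estimates that incorporate a prior (e.g., EAP) would differ somewhat. The item parameters are hypothetical.

```python
import math


def grm_item_information(theta, a, thresholds):
    """Fisher information of one GRM item at a given trait level."""
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    info = 0.0
    for k in range(len(thresholds) + 1):
        p = cumulative[k] - cumulative[k + 1]
        # Derivative of the category probability with respect to theta.
        dp = a * (cumulative[k] * (1.0 - cumulative[k])
                  - cumulative[k + 1] * (1.0 - cumulative[k + 1]))
        if p > 0.0:
            info += dp * dp / p
    return info


def standard_error(theta, items):
    """Asymptotic SE of theta from summed item information."""
    total = sum(grm_item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total)


# Hypothetical three-item scale with thresholds in the upper trait range.
items = [(2.0, [0.0, 0.8, 1.6]), (1.5, [0.3, 1.0, 1.9]), (2.4, [-0.2, 0.7, 1.4])]
for theta in (-2.0, 0.0, 1.0, 3.0):
    print(theta, round(standard_error(theta, items), 2))
```

Because symptom items tend to have thresholds at average-to-high severity, information is concentrated there and standard errors grow at low severity, where few respondents endorse any symptoms.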

Figure 3. Scatterplot demonstrating the standard error of model-estimated latent symptom severity (y-axis) across levels of severity (x-axis) for linked SCAT and RPQ scores, stratified by group (sport-related mTBI, civilian mTBI, orthopedic injury civilian control, non-injured athlete control). Note. The data points represent the relationship between standard error and latent severity. The top figure contains more data points due to more participants completing the SCAT than the RPQ.

Discussion

The RPQ and SCAT are among the most widely used TBI symptom inventories in the civilian and sport mTBI populations, respectively. We demonstrated that the RPQ and SCAT measure the same construct and that their total scores can be linked through latent variable modeling techniques. The cross-walk table offered in this manuscript enables users of either inventory to convert scores from one scale to the other and thereby compare the myriad published findings reported from the two scales or combine datasets that used either RPQ or SCAT. There was a strong (r = .92) correlation between observed and linked SCAT scores, supporting the use of the IRT-based crosswalk table for obtaining scores on each measure. Future researchers can utilize these results to create larger and more diverse samples from existing datasets and harmonize future research on sport-related and civilian TBI, as well as allow for direct clinical comparison across groups.

Previous research reveals that civilians experience a high base rate of mTBI-like symptoms in the absence of injury (Iverson & Lange, 2003). Although the SCAT and RPQ have very similar item content, they use different rating instructions, which theoretically makes the RPQ the more accurate measure of mTBI-related symptoms in persons with more pre-injury symptoms. In particular, the SCAT solicits ratings of current symptom severity (presumably reflecting both pre-injury and injury-related symptoms), while the RPQ solicits ratings of injury-related symptoms. It is perhaps not surprising that standard errors of RPQ IRT scores linked from the SCAT appeared somewhat higher in the civilian subgroups within our sample, whereas SCAT IRT scores linked from the RPQ showed less variability in precision (standard error) across groups. Practically, this finding supports the preferable practice of using RPQ scores to predict SCAT scores rather than vice versa, particularly when pre-injury symptoms are expected to be prevalent. However, overall the differences in measurement precision across the groups were minimal. Regarding outcomes, a clinician working with an athlete who completed the SCAT can consider the client’s score relative to the findings from the SCAT literature, but also now relative to the RPQ civilian literature. Taken together, this research has further clinical implications, as it provides strong evidence of construct congruence and allows for the direct comparison of mTBI symptoms across athlete, civilian, and control groups who completed a combination of these two symptom inventories.

A strength of the current study was our use of IRT, which linked RPQ and SCAT total scores based on their relationship to the latent dimension (TBI-related symptom burden) that drives scores on both measures. Other methods of linking, such as linear equating, assume scores are linearly related and reflect the underlying construct without measurement error. These assumptions may lead to biased mapping. Consistent with this, IRT linking methods have demonstrated more reliability and precision than other methods of linking, such as Deming regression (e.g., see Kaat et al., 2019) and equipercentile linking (e.g., see Choi et al., 2014). While several IRT-based linking methods are available, we chose fixed-parameter calibration, a method made possible by the fact that many participants in the current sample completed both the RPQ and SCAT (Dorans, 2007). To our knowledge, one other study has linked the RPQ and the SCAT. Langer et al. (2021) used linear equating methods to create an equation that allows for summed score conversion between the RPQ and SCAT. Linear equating may be biased by large score distribution differences (Muraki et al., 2000), such as the distribution difference seen between the RPQ and SCAT. Therefore, our IRT analysis may offer a more accurate conversion. Furthermore, the current study provides a simple-to-use score conversion table, requiring no further computations for researchers or clinicians.

Some limitations of the current study must be addressed. One potential limitation is a relatively modest sample size for IRT analysis, which may have reduced the accuracy of parameter estimates and thus affected the linking table. On balance, however, this sample is one of the largest of its type, making it perhaps one of the best available from which to conduct a linking analysis. Second, although our population invariance analysis supported the combination of injured and non-injured groups, invariance among other subgroups cannot be confirmed in this sample. Future studies using even larger datasets will need to investigate this issue. Furthermore, our sample demographics may not be perfectly representative of the entire TBI population, but were more diverse than those of many TBI samples (e.g., those restricted to sport-related mTBI). While our predominantly white (75%), male (84%) sample may limit the generalizability of this research to other groups, there is no strong evidence to indicate that IRT parameters would be biased by sample demographics (e.g., we have previously reported strict measurement invariance of the RPQ factor structure across age, gender, race, and other groups; Agtarap et al., 2021). Lastly, as is required for this linking method, rounding decisions were made to produce the cross-walk table. However, our sensitivity analysis utilizing the RPQ as an anchor (instead of the reported results using SCAT as the anchor) produced a nearly identical cross-walk table, indicating minimal error introduced by such methodological decisions.

The main purpose of this analysis was to link the total scores of the RPQ and SCAT. We focused on the total scores due to their clinical and research relevance, as total scores are most defensible given the psychometric properties of these two instruments (e.g., see Agtarap et al., 2020; and Nelson et al., 2018) and are the most frequently used in both clinical and research settings. Given the similarity in item content across the two scales, future research could consider linking scores from comparable domains evaluated across both inventories.

In conclusion, we used fixed-parameter IRT calibration in order to produce a cross-walk table linking the summed scores on the RPQ and SCAT. This research provides the opportunity for future researchers to compare findings across the published studies that used these popular mTBI symptom checklists and combine datasets spanning populations to further our understanding of TBI symptom overlap across subpopulations.

Funding and Disclosure Statement

This secondary data analysis project was funded by the National Institute of Neurological Disorders and Stroke grant # R01 NS110856. The original studies were supported by the Defense Health Program under the Department of Defense Broad Agency Announcement for Extramural Medical Research through Award No. W81XWH-14-1-0561, and the Research and Education Program Fund, a component of the Advancing a Healthier Wisconsin (AHW) endowment of the Medical College of Wisconsin. The study REDCap databases were supported by the National Center for Advancing Translational Sciences, National Institutes of Health (NIH), through Grant Numbers 8UL1TR000055 and 1UL1-RR031973 (-01). Opinions, interpretations, conclusions and recommendations are those of the authors and are not necessarily endorsed by the NIH, Department of Defense, or AHW. The authors report no disclosures relevant to the manuscript.

References

Agtarap, S., Kramer, M. D., Campbell-Sills, L., Yuh, E., Mukherjee, P., Manley, G. T., McCrea, M. A., Dikmen, S., Giacino, J. T., Stein, M. B., Nelson, L. D., & TRACK-TBI Investigators (2021). Invariance of the bifactor structure of mild traumatic brain injury (mTBI) symptoms on the rivermead postconcussion symptoms questionnaire across time, demographic characteristics, and clinical groups: A TRACK-TBI study. Assessment, 28, 1656–1670.

Badhiwala, J. H., Wilson, J. R., & Fehlings, M. G. (2019). Global burden of traumatic brain and spinal cord injury. The Lancet Neurology, 18, 24–25.

Balsis, S., Benge, J. F., Lowe, D. A., Geraci, L., & Doody, R. S. (2015). How do scores on the ADAS-Cog, MMSE, and CDR-SOB correspond? Clinical Neuropsychology, 29, 1002–1009.

Brett, B. L., Kramer, M. D., McCrea, M. A., Broglio, S. P., McAllister, T. W., Nelson, L. D., & The CARE Consortium Investigators (2020). Bifactor model of the sport concussion assessment tool symptom checklist: Replication and invariance across time in the CARE consortium sample. The American Journal of Sports Medicine, 48, 2783–2795.

Broglio, S. P., Kontos, A. P., Levin, H., Schneider, K., Wilde, E. A., Cantu, R. C., Feddermann-Demot, N., Fuller, G. W., Gagnon, I., Gioia, G. A., Giza, C., Griesbach, G. S., Leddy, J. J., Lipton, M. L., Mayer, A. R., McAllister, T. W., McCrea, M., McKenzie, L. B., Putukian, M., & Sport Related Concussion CDE Working Group. (2018). National institute of neurological disorders and stroke and department of defense sport-related concussion common data elements version 1.0 recommendations. Journal of Neurotrauma, 35, 2776–2783.

Choi, S., Lim, S., Schalet, B., Kaat, A., & Cella, D. (2021). PROsetta: An R package for linking patient-reported outcome measures. Applied Psychological Measurement, 45, 386–388. https://doi.org/10.1177/01466216211013106

Choi, S. W., Schalet, B., Cook, K. F., & Cella, D. (2014). Establishing a common metric for depressive symptoms: Linking the BDI-II, CES-D, and PHQ-9 to PROMIS depression. Psychological Assessment, 26, 513. https://doi.org/10.1037/a0035768

Chou, A. C., Torres-Espin, A., Huie, J. R., Krukowski, K., Lee, S., Nolan, A., Guglielmetti, C., Hawkins, B. E., Chaumeil, M. M., Manley, G. T., Beattie, M. S., Bresnahan, J. C., Martone, M. E., Grethe, J. S., Rosi, S., & Ferguson, A. R. (2021). Open data commons for preclinical traumatic brain injury research: Empowering data sharing and big data analytics. bioRxiv. https://doi.org/10.1101/2021.03.15.435178

Dikmen, S., Machamer, J., & Temkin, N. (2017). Mild traumatic brain injury: Longitudinal study of cognition, functional status, and post-traumatic symptoms. Journal of Neurotrauma, 34, 1524–1530. https://doi.org/10.1089/neu.2016.4618

Dorans, N. J. (2007). Linking scores from multiple health outcome instruments. Quality of Life Research, 16, 85–94.

Dorans, N. J., & Holland, P. W. (2000). Population invariance and the equatability of tests: Basic theory and the linear case. Journal of Educational Measurement, 37, 281–306. https://doi.org/10.1111/j.1745-3984.2000.tb01088.x

Echemendia, R. J., Meeuwisse, W., McCrory, P., Davis, G. A., Putukian, M., Leddy, J., Makdissi, M., Sullivan, S. J., Broglio, S. P., Raftery, M., Schneider, K., Kissick, J., McCrea, M., Dvorak, J., Sills, A. K., Aubry, M., Engebretsen, L., Loosemore, M., Fuller, G., & Herring, S. (2017). The sport concussion assessment tool 5th edition (SCAT5): Background and rationale. British Journal of Sports Medicine, 51, 848–850.

Fayers, P. M., & Hays, R. D. (2014). Should linking replace regression when mapping from profile-based measures to preference-based measures? Value in Health, 17, 261–265.

Furger, R. E., Nelson, L. D., Lerner, E. B., & McCrea, M. A. (2016). Frequency of factors that complicate the identification of mild traumatic brain injury in level I trauma center patients. Concussion, 1, CNC11.

Guzowski, N. S., Hoelzle, J. B., McCrea, M. A., & Nelson, L. D. (2021). Differing associations between measures of somatic symptom reporting, personality, and mild traumatic brain injury (mTBI). The Clinical Neuropsychologist, 1–18.

Hicks, R., Giacino, J., Harrison-Felix, C., Manley, G., Valadka, A., & Wilde, E. A. (2013). Progress in developing common data elements for traumatic brain injury research: Version two–the end of the beginning. Journal of Neurotrauma, 30, 1852–1861.

Hopwood, C. J., & Donnellan, M. B. (2010). How should the internal structure of personality inventories be evaluated? Personality and Social Psychology Review, 14, 332–346.

Iverson, G. L., & Lange, R. T. (2003). Examination of “postconcussion-like” symptoms in a healthy sample. Applied Neuropsychology, 10, 137–144.

James, S. L., & GBD 2016 Traumatic Brain Injury and Spinal Cord Injury Collaborators. (2019). Global, regional, and national burden of traumatic brain injury and spinal cord injury, 1990–2016: a systematic analysis for the global burden of disease study 2016. The Lancet Neurology, 18, 56–87.

Kaat, A. J., Blackwell, C. K., Estabrook, R., Burns, J. L., Petitclerc, A., Briggs-Gowan, M. J., Gershon, R. C., Cella, D., Perlman, S. B., & Wakschlag, L. S. (2019). Linking the child behavior checklist (CBCL) with the multidimensional assessment profile of disruptive behavior (MAP-DB): Advancing a dimensional spectrum approach to disruptive behavior. Journal of Child and Family Studies, 28, 343–353.

King, N. S., Crawford, S., Wenden, F. J., Moss, N. E. G., & Wade, D. T. (1995). The rivermead post concussion symptoms questionnaire: A measure of symptoms commonly experienced after head injury and its reliability. Journal of Neurology, 242, 587–592.

Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did they really say? Organizational Research Methods, 9, 202–220. https://doi.org/10.1177/1094428105284919

Langer, L. K., Comper, P., Ruttan, L., Saverino, C., Alavinia, S. M., Inness, E. L., Kam, A., Lawrence, D. W., Tam, A., Chandra, T., Foster, E., & Bayley, M. T. (2021). Can sport concussion assessment tool (SCAT) symptom scores be converted to rivermead post-concussion symptoms questionnaire (RPQ) scores and vice versa? Findings from the Toronto concussion study. Frontiers in Sports and Active Living, 3, 737402. https://doi.org/10.3389/fspor.2021.737402

Lu, G., Brazier, J. E., & Ades, A. E. (2013). Mapping from disease-specific to generic health related quality-of-life scales: A common factor model. Value in Health, 16, 177–184.

McCrory, P., Meeuwisse, W., Dvorak, J., Aubry, M., Bailes, J., Broglio, S., Cantu, R. C., Cassidy, D., Echemendia, R. J., Castellani, R. J., Davis, G. A., Ellenbogen, R., Emery, C., Engebretsen, L., Feddermann-Demot, N., Giza, C. C., Guskiewicz, K. M., Herring, S., Iverson, G. L., & Vos, P. E. (2017). Consensus statement on concussion in sport—the 5th international conference on concussion in sport held in Berlin, October 2016. British Journal of Sports Medicine, 51, 838–847.

McCrory, P., Meeuwisse, W. H., Aubry, M., Cantu, B., Dvorak, J., Echemendia, R. J., Engebretsen, L., Johnston, K. M., Kutcher, J. S., Raftery, M., Sills, A., Benson, B. W., Davis, G. A., Ellenbogen, R., Guskiewicz, K. M., Herring, S. A., Iverson, G. L., Jordan, B. D., Kissick, J., & Turner, M. (2013). Consensus statement on concussion in sport: The 4th international conference on concussion in sport held in Zurich, November 2012. British Journal of Sports Medicine, 47, 250–258. https://doi.org/10.1136/bjsports-2013-092313

Mikolić, A., van Klaveren, D., Groeniger, J. O., Wiegers, E. J., Lingsma, H. F., Zeldovich, M., von Steinbüchel, N., Maas, A. I. R., Roeters van Lennep, J. E., Polinder, S., & CENTER-TBI Participants and Investigators. (2021). Differences between men and women in treatment and outcome after traumatic brain injury. Journal of Neurotrauma, 38, 235–251.

Muraki, E., Hombo, C. M., & Lee, Y. W. (2000). Equating and linking of performance assessments. Applied Psychological Measurement, 24, 325–337.

Muthén, L. K., & Muthén, B. O. (1998–2015). Mplus user’s guide (7th ed.). Muthén & Muthén.

National Institutes of Health. (2018). NIH strategic plan for data science. National Institutes of Health, Office of Data Science Strategy.

Nelson, L. D., Kramer, M. D., Patrick, C. J., & McCrea, M. A. (2018). Modeling the structure of acute sport-related concussion symptoms: A bifactor approach. Journal of the International Neuropsychological Society, 24, 793–804.

Reeve, B. B., Hays, R. D., Bjorner, J. B., Cook, K. F., Crane, P. K., Teresi, J. A., Thissen, D., Revicki, D. A., Weiss, D. J., Hambleton, R. K., Lio, H., Gershon, R., Reise, S. P., Lai, J. S., & Cella, D. (2007). Psychometric evaluation and calibration of health-related quality of life item banks: Plans for the patient-reported outcomes measurement information system (PROMIS). Medical Care, 45, S22–S31.

Reise, S. P., Cook, K. F., & Moore, T. M. (2015). Evaluating the impact of multidimensionality on unidimensional item response theory model parameters. In Reise, S. P. & Revicki, D. A. (Eds.), Handbook of item response theory modeling: Applications to typical performance assessment (pp. 13–40). Routledge/Taylor & Francis Group.

Revelle, W., & Revelle, M. W. (2015). Package ‘psych’. The comprehensive R Archive Network, 337, 338.

Rodriguez, A., Reise, S. P., & Haviland, M. G. (2016). Evaluating bifactor models: Calculating and interpreting statistical indices. Psychological Methods, 21, 137.

Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48, 1–36. https://doi.org/10.18637/jss.v048.i02

Silverberg, N. D., Gardner, A. J., Brubacher, J. R., Panenka, W. J., Li, J. J., & Iverson, G. L. (2015). Systematic review of multivariable prognostic models for mild traumatic brain injury. Journal of Neurotrauma, 32, 517–526. https://doi.org/10.1089/neu.2014.3600

Thomas, M. L. (2011). The value of item response theory in clinical assessment: A review. Assessment, 18, 291–307.

Thompson, H. J., Vavilala, M. S., & Rivara, F. P. (2015). Common data elements and federal interagency traumatic brain injury research informatics system for TBI research. Annual Review of Nursing Research, 33, 1–11.

Thurmond, V. A., Hicks, R., Gleason, T., Miller, A. C., Szuflita, N., Orman, J., & Schwab, K. (2010). Advancing integrated research in psychological health and traumatic brain injury: Common data elements. Archives of Physical Medicine and Rehabilitation, 91, 1633–1636.

Zahniser, E., Nelson, L. D., Dikmen, S. S., Machamer, J. E., Stein, M. B., Yuh, E., Manley, G. T., Temkin, N. R., & TRACK-TBI Investigators. (2019). The temporal relationship of mental health problems and functional limitations following mTBI: A TRACK-TBI and TED study. Journal of Neurotrauma, 36, 1786–1793.

Table 1. Sample demographics and injury characteristics

Table 2. Linked correspondence between RPQ and SCAT scores

Figure 1. Correspondence between RPQ total scores and SCAT symptom severity scores, linked on the latent dimension of mTBI symptom severity. Note. This figure demonstrates the correspondence between RPQ total scores and SCAT severity scores. The x-axis depicts the latent dimension of mTBI symptom severity, along which individuals vary. The y-axis provides the expected total score for RPQ (solid line) and SCAT (dashed line) at each level of the latent dimension. For example, an individual with a latent symptom severity level of 1.55 would receive an RPQ score of 39 and a SCAT score of 76.

Figure 2. Scatterplot predicting the relationship between observed and linked (from RPQ) SCAT scores. Note. This scatterplot demonstrates the relationship between observed (y-axis) and linked (x-axis) SCAT scores. The observed scores are the scores measured directly using the SCAT. The linked scores are SCAT scores that are predicted from the RPQ. The black line is the regression line representing the relationship between the observed and linked SCAT scores, the dotted lines are the confidence intervals, and the dots are the data points. There was a high correlation (r = .92) between observed and linked SCAT scores, supporting the validity of the cross-walk table provided in Table 2.

Figure 3. Scatterplot demonstrating the standard error of model-estimated latent symptom severity (y-axis) across levels of severity (x-axis) for linked SCAT and RPQ scores, stratified by group (sport-related mTBI, civilian mTBI, orthopedic injury civilian control, non-injured athlete control). Note. The data points represent the relationship between standard error and latent severity. The top figure contains more data points due to more participants completing the SCAT than the RPQ.
