Traumatic brain injury (TBI) causes considerable morbidity throughout the world, with a global incidence of 369 per 100,000 (Badhiwala et al., 2019). Worldwide, an estimated 55 million people are living with TBI-related disability (James et al., 2019). Despite the universal reach of this injury in civilian, military, and other (e.g., athlete) populations, TBI research has historically been conducted in silos focused on specific subpopulations. Variability in study methodology across subpopulations of TBI, including differing outcome assessment batteries, is a barrier to comparing findings across studies and to combining datasets to accelerate discovery. Although TBI common data elements (CDEs) are now available (Hicks et al., 2013; Thurmond et al., 2010), they are neither universally required nor applicable to all TBI subpopulations. Further, in some domains, little is known about how two separate CDE measures intended to quantify the same or very similar constructs (e.g., TBI symptom scales) compare to each other. Thus, variability in study methods limits interpretation of findings across studies, particularly across studies of different subpopulations of TBI (e.g., sport-related concussion, civilian mild traumatic brain injury [mTBI]), which have different CDEs (Broglio et al., 2018; Hicks et al., 2013).
Just as the need for consistent data collection practices is being recognized, so is the potential utility of sharing and combining data across investigators. This is evidenced by international collaborative TBI research initiatives (e.g., the International Initiative for TBI Research [InTBIR]) as well as programs that enable or require data sharing, such as the Federal Interagency Traumatic Brain Injury Research (FITBIR) informatics system and the FAIR (findable, accessible, interoperable, and reusable) data principles being adopted by the National Institutes of Health and others (see also Chou et al., 2021; NIH, 2018; Thompson et al., 2015). Advancing the goals of these initiatives will require developing methods to harmonize data collected from different instruments evaluating the same constructs. The present study used a modern psychometric approach to link the most commonly used TBI symptom inventories in civilian and sport-related TBI research: the Rivermead Post Concussion Symptoms Questionnaire (RPQ; King et al., 1995) and the Sport Concussion Assessment Tool (SCAT; McCrory et al., 2013, 2017).
TBI symptom inventories are frequently used to assess TBI outcomes in both clinical and research settings (McCrory et al., 2017). TBI symptoms measured soon after injury are robust predictors of later clinical outcomes (Mikolić et al., 2021; Silverberg et al., 2015) and, in the large mTBI population, are more prevalent and persistent than other TBI sequelae (e.g., functional limitations, cognitive impairment; Dikmen et al., 2017). Furthermore, symptoms associated with TBI (e.g., depression, anxiety) appear to directly drive long-term disability (Zahniser et al., 2019). Despite considerable overlap in item content, the RPQ and SCAT are typically used in the distinct subpopulations of TBI for which they were developed: civilian and athlete populations, respectively. Therefore, linking RPQ and SCAT scores in a mixed civilian and sport population will enable researchers to combine data across these diverse populations to fuel studies requiring larger sample sizes and to improve understanding of how TBI operates in these two subpopulations.
A variety of statistical methods exist for linking measures, from simpler approaches such as linear equating with observed scores, to more complex approaches such as item response theory (IRT; Fayers & Hays, 2014). Unlike regression-based approaches, which do not account for measurement error and may result in biased mappings (Lu et al., 2013), IRT approaches calibrate each scale onto the latent variable, thus accounting for measurement error and reducing potential bias (e.g., see Kaat et al., 2019). Therefore, the present study used fixed-parameter calibration IRT to allow for a single item calibration for all items, offering a more rigorously established cross-walk between the RPQ and SCAT (Choi et al., 2014).
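For readers less familiar with the simpler end of this spectrum, the following minimal sketch illustrates linear equating of observed scores, which maps a score from one scale to another by matching means and standard deviations. It is not part of the present analysis; the function name and the paired scores are hypothetical and were made up for illustration.

```python
from statistics import mean, pstdev


def linear_equate(x, scores_x, scores_y):
    """Map score x on scale X onto scale Y by matching means and SDs."""
    mu_x, mu_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)


# Hypothetical summed scores (illustration only, not study data).
rpq_scores = [0, 4, 8, 12, 16]
scat_scores = [0, 10, 20, 30, 40]

print(linear_equate(12, rpq_scores, scat_scores))
```

Because the mapping is a single straight line on observed scores, it carries no model of measurement error, which is the limitation that motivates latent-variable approaches.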
Method
Participants/study design
Data (n = 397) were obtained through two prior prospective studies of mTBI conducted in Wisconsin. The research was completed in accordance with the Helsinki Declaration and all testing procedures were approved by the Medical College of Wisconsin Institutional Review Board. Of the n = 397 participants, 198 completed the RPQ and SCAT, 198 participants completed only the SCAT, and 1 participant with sport-related mTBI did not complete either. The 396 participants with at least one inventory completed were included in analyses.
Civilian trauma sample
The civilian trauma sample (n = 154; n = 75 mTBI and n = 79 orthopedic controls) was recruited from our institution’s level 1 trauma center inpatient trauma unit between April 2015 and March 2016 (details have been reported previously in Guzowski et al., 2021). Participants completed a bedside assessment at enrollment (median 2 days post-injury), which included the SCAT and RPQ. The study used the American Congress of Rehabilitation Medicine’s (ACRM) definition of mTBI: “A traumatically-induced physiological disruption of brain function, manifested by at least one of the following: loss of consciousness (LOC; <30 min), memory loss for the events before or after (post-traumatic amnesia, PTA) injury (<24 hr PTA), other evidence of alteration of mental state immediately post-injury; or documentation of focal neurologic deficit after trauma; as well as initial Glasgow Coma Scale (GCS) score > 13.” All civilian mTBI participants met the athlete study definition of mTBI; civilian acute characteristics (e.g., posttraumatic amnesia, LOC) leaned toward more severe mTBI than experienced by athletes. Inclusion criteria for all participants were 18 years of age or older, English speaking, and admitted to the trauma service within the past 10 days for an eligible mTBI or other traumatic injury. Exclusion criteria for all participants were being in police custody or unable to independently provide informed consent. Orthopedic controls were required to have no evidence or report of head trauma.
Sport-related mTBI sample
The current project includes data from a longitudinal study of sport-related mTBI (concussion) in high school and collegiate athletes (details have been reported previously in Guzowski et al., 2021). Male football players with concussion (n = 105; 1 dropped due to missing outcome data) and non-injured athlete controls (n = 138) were assessed during the sport season with the SCAT symptom checklist and RPQ. (Post-concussion assessments were performed at 24–48 hr post-injury.) Adult athletes and parents of minor athletes completed written informed consent prior to assessment.
mTBI was diagnosed by licensed athletic trainers; injuries met the definition of concussion adopted from the Centers for Disease Control and Prevention HEADS UP educational initiative: “An injury resulting from a forceful bump, blow, or jolt to the head that results in rapid movement of the head and causes a change in the athlete’s behavior, thinking, physical functioning, or the following symptoms: headache, nausea, blurred vision, memory difficulty, and difficulty concentrating.” Inclusion criteria at pre-season baseline enrollment encompassed participating in football, 14 years of age or older, English speaking, and capable of granting informed consent or assent. Exclusion criteria for post-injury follow-up (or follow-up as a non-injured control) encompassed contraindications to completing additional procedures for the parent study (i.e., neuroimaging, blood draws), current primary psychiatric disorder, current use of prescribed narcotics, history of or suspicion for significant neurological conditions (i.e., epilepsy, stroke, dementia), history of moderate or severe TBI, and history of concussion within the 6 months prior to the pre-season baseline exam.
Measures
Rivermead post concussion symptoms questionnaire (RPQ)
The RPQ is a self-report measure comprising 16 mTBI-related symptoms rated on a 5-point Likert scale (0 = not experienced at all, 4 = a severe problem). Participants are asked to report the degree to which symptoms have been problematic over the past 24 hr in comparison to their pre-injury symptom levels. As recommended by the test authors, scores of “1” (no more of a problem than pre-injury) were treated as “0” responses when computing total scores (range 0–64; e.g., see King et al., 1995).
(Full RPQ: http://www.tbi-impact.org/cde/mod_templates/12_F_06_Rivermead.pdf)
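The scoring rule described above can be sketched in a few lines. This is a minimal illustration, assuming the 16 ratings are supplied as integers from 0 to 4; the function name and the example ratings are hypothetical.

```python
def rpq_total(item_ratings):
    """Sum 16 RPQ ratings (0-4), counting ratings of 1 as 0."""
    if len(item_ratings) != 16:
        raise ValueError("expected 16 item ratings")
    if any(r not in (0, 1, 2, 3, 4) for r in item_ratings):
        raise ValueError("ratings must be integers from 0 to 4")
    # A rating of 1 means "no more of a problem than pre-injury",
    # so it contributes nothing to the total.
    return sum(0 if r == 1 else r for r in item_ratings)


# Hypothetical response pattern (illustration only).
example = [0, 1, 2, 3, 4, 1, 1, 0, 2, 2, 0, 0, 1, 3, 4, 0]
print(rpq_total(example))  # 20
```

Under this rule the total can take the values 0 and 2 through 64, but never 1.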
Sport Concussion Assessment Tool (SCAT) symptom checklist
The SCAT (version 3/5) symptom checklist comprises 22 symptoms rated based on their current severity on a 7-point Likert scale (0 = none, 6 = severe). Total symptom severity scores range from 0 to 132 points (e.g., see Echemendia et al., 2017).
(Full SCAT: https://bjsm.bmj.com/content/bjsports/early/2017/04/26/bjsports-2017-097506SCAT5.full.pdf)
Data analysis
Demographic information, descriptive statistics, and the internal consistency of the combined RPQ + SCAT were calculated using the psych package in R (Revelle & Revelle, 2015). In addition, we evaluated several linking assumptions. One major assumption is that the measures being linked inherently assess the same construct (construct congruence; Dorans, 2007). To check this assumption, we examined the RPQ and SCAT for similar item content and response format. If two instruments measure the same construct, their scores should be highly correlated, both at the summed score level and at the item level. Bivariate correlations are a useful first step for establishing relationships at the summed score level. At the item level, construct congruence can be evaluated using factor analysis, which provides evidence as to whether an instrument (in our case the combined set of RPQ and SCAT items) is sufficiently unidimensional to consider its content to measure the same construct. If there is a strong general factor of the combined items, then there is evidence for sufficient unidimensionality, which indicates that a unidimensional IRT model will adequately reflect the item parameters and latent trait estimates of the general factor underlying the item responses (Reise et al., 2015). Our previous factor modeling studies on the SCAT and the RPQ have found both instruments to be sufficiently unidimensional as standalone instruments (Agtarap et al., 2020; Brett et al., 2020; Nelson et al., 2018).
To investigate unidimensionality of the combined item set, we conducted exploratory factor analyses (EFA) and confirmatory factor analyses (CFA) of the 38 items in Mplus (7th edition; Muthén & Muthén, 1998–2015) using the weighted least squares mean and variance adjusted (WLSMV) estimator, which is appropriate for categorical data. EFA was conducted to confirm that the ratio of the first to second eigenvalue indicated sufficient unidimensionality (commonly defined as a ratio > 4; Reeve et al., 2007). Then, we evaluated the fit of a one-factor CFA model on the combined measure, considering adequate fit based on several conventions: comparative fit index (CFI) > .90, Tucker Lewis index (TLI) > .90, and root-mean-square error of approximation (RMSEA) < .10 (Hopwood & Donnellan, 2010; Lance et al., 2006). Though these cutoff scores are more lenient than some have recommended to declare good fit in factor modeling studies, they are generally considered adequate to establish sufficient unidimensionality for IRT modeling (e.g., see Choi et al., 2021). In factor modeling research, others have advocated for applying more lenient cut scores and considering other evidence of model adequacy beyond formal fit statistics (see Hopwood & Donnellan, 2010). Finally, we estimated two bifactor models to allow for potential multidimensionality of the combined measure and demonstrate evidence of an overarching general factor between the two measures.
The first bifactor model contained the RPQ and SCAT (RPQ-SCAT bifactor) as the specific factors, whereas the second (symptom subgroup bifactor) model grouped similar items from the RPQ and SCAT into established subgroups (emotional, cognitive, torpor, vision, sensory sensitivity, and headache) based on previously published bifactor analyses of the RPQ and SCAT, separately (e.g., see Agtarap et al., 2020; Nelson et al., 2018). We used hierarchical omega (OmegaH) as a measure of fit for the bifactor models, which estimates the proportion of variance in total scores that can be attributed to the general factor, TBI-related symptom burden, and the proportion of unique variance that can be attributed to the specific factors (Rodriguez et al., 2016). Typically, hierarchical omega values of .8 or greater suggest sufficient unidimensionality because the proportion of variance explained is highly attributable to the general factor. This essentially means the secondary factors have no meaningful influence on unidimensional IRT parameter estimates (Rodriguez et al., 2016). Similar values can also be calculated for specific factors, which are expected to be low in the case of sufficient unidimensionality.
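To make the hierarchical omega statistic concrete, the sketch below computes it from standardized bifactor loadings, assuming orthogonal general and specific factors. The function name and the loadings are hypothetical; they are not estimates from the present models.

```python
def omega_hierarchical(general, specific_groups, uniquenesses):
    """Share of total-score variance attributable to the general factor.

    general: general-factor loading of each item
    specific_groups: one list of specific-factor loadings per specific factor
    uniquenesses: residual (unique) variance of each item
    """
    general_var = sum(general) ** 2
    specific_var = sum(sum(group) ** 2 for group in specific_groups)
    total_var = general_var + specific_var + sum(uniquenesses)
    return general_var / total_var


# Hypothetical four-item example with two specific factors.
general = [0.7, 0.7, 0.7, 0.7]
specific = [[0.3, 0.3], [0.3, 0.3]]
unique = [0.42, 0.42, 0.42, 0.42]  # 1 - 0.7**2 - 0.3**2 for each item

print(round(omega_hierarchical(general, specific, unique), 3))
```

In this toy example the value falls below .8, so the specific factors would not be considered negligible; values of .8 or greater would point toward sufficient unidimensionality.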
IRT linking methods also assume population invariance, meaning score differences between subgroups for one measure are similar to the score differences between the same subgroups on the second measure. The standardized root-mean-square deviation (RMSD), which is a weighted difference between the standardized difference of subpopulations (e.g., mTBI and control) across two measures, provides one method of quantifying these differences (Dorans & Holland, 2000). According to Dorans and Holland (2000), population invariance can be assumed for RMSD values of less than .08. We computed the RMSD using the SEAsic package in R to evaluate population invariance by subgroups with and without mTBI.
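As an illustration of the RMSD idea, the sketch below uses one common formulation, which we assume here: at a given score, each subgroup's linked value is compared with the total-group linked value, the squared differences are weighted by subgroup proportion, and the result is standardized by the total-group standard deviation of the target measure. The function name and all numbers are hypothetical, and the code does not reproduce the SEAsic implementation.

```python
from math import sqrt


def rmsd_at_score(subgroup_links, total_link, weights, sd_target_total):
    """Standardized RMSD at a single score point.

    subgroup_links: linked score obtained within each subgroup
    total_link: linked score obtained in the total group
    weights: subgroup proportions (should sum to 1)
    sd_target_total: SD of the target measure in the total group
    """
    weighted_sq = sum(
        w * (link - total_link) ** 2
        for w, link in zip(weights, subgroup_links)
    )
    return sqrt(weighted_sq) / sd_target_total


# Hypothetical values: two equally sized subgroups (e.g., mTBI and control).
value = rmsd_at_score([21.0, 19.0], 20.0, [0.5, 0.5], 25.0)
print(round(value, 3))
```

Values near zero indicate that the linking function is essentially the same regardless of which subgroup is used to derive it.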
We used a fixed-parameter IRT graded response model (GRM) via the PROsetta package in R (Choi et al., 2021) to establish the parameters for the RPQ and SCAT. We used the GRM because it can accommodate scales with polytomous, ordinal responses. Via this approach, a set of parameters is generated for each item that estimates the extent to which the item indicates the latent variable (i.e., TBI symptom-related burden). Item parameters include item thresholds (also called b parameters or item difficulty parameters) and item discriminations (also called a parameters; Thomas, 2011). In the GRM, an item threshold indicates the level of the latent trait needed to have a 50% probability of responding above a given response option on an ordinal scale (see b1–b6 parameters in Supplemental Table 1; Thomas, 2011). An item discrimination parameter reflects the strength of the relationship between an item and the latent trait. Taken together, these item parameters define the likelihood an item is endorsed at all levels of the latent continuum. Finally, by charting these likelihoods as described in our previous work (Balsis et al., 2015), we were able to link the raw scores on these two important instruments.
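The roles of the threshold and discrimination parameters can be seen in a short sketch of the GRM response function under the usual logistic parameterization. The function name and the item parameters are hypothetical and are not taken from Supplemental Table 1.

```python
from math import exp


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one graded-response item.

    theta: latent trait level
    a: item discrimination
    thresholds: ordered b parameters (one fewer than the response options)
    """
    # Cumulative probabilities P(X >= k), bracketed by P(X >= 0) = 1
    # and P(X >= K) = 0 for an item scored 0..K-1.
    cum = [1.0]
    cum += [1.0 / (1.0 + exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is a difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


# Hypothetical 5-option item evaluated at a trait level equal to b2.
probs = grm_category_probs(theta=0.0, a=2.0, thresholds=[-1.0, 0.0, 1.0, 2.0])
print([round(p, 3) for p in probs])
```

At a trait level equal to the second threshold, the probability of responding in category 2 or higher is exactly 50%, which is the interpretation of the b parameters given above.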
Due to our data collection method in which a substantial proportion of participants completed both the RPQ and SCAT, we employed fixed-parameter IRT to link the RPQ and SCAT, which is a conceptually and computationally simple method that involves an initial calibration of one measure (considered the “anchor” measure) before combining all items into a single measure. In a second calibration, the item parameters for the anchor are fixed at their initial calibration values, while the parameters for the other measure are freely estimated (Choi et al., 2014). This yields parameters for the non-anchor (linked) measure that are on the same metric as the anchor measure. We treated the SCAT as the anchor, which allows us to use responses to the RPQ items to determine what a person would have scored on the SCAT. As a sensitivity analysis, we ran a second fixed-parameter IRT using the RPQ as the anchor and confirmed that it produced comparable results. Finally, after linking the instruments, we compared the precision of IRT-based estimates of symptom severity for SCAT and RPQ linked scores by plotting the standard error of IRT estimates across the latent continuum of symptom severity, stratified by subgroup. Finding discrepancies in the accuracy of linked score estimates for sport and civilian populations may have implications for best practices when linking instruments to combine sport and civilian datasets.
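Once both instruments share a metric, a raw score on one can be mapped to the other through the latent trait. The sketch below illustrates this idea in a deliberately simplified form: it inverts the expected-score (test characteristic) curve of one item set to find a trait level, then reads off the expected score on the other item set. This is not the procedure implemented in PROsetta, it ignores the RPQ rule that treats ratings of 1 as 0, and all item parameters and function names are hypothetical.

```python
from math import exp


def expected_item_score(theta, a, thresholds):
    # For an item scored 0..K-1, the expected score equals the sum of the
    # cumulative probabilities P(X >= k) for k = 1..K-1.
    return sum(1.0 / (1.0 + exp(-a * (theta - b))) for b in thresholds)


def expected_total(theta, items):
    """Expected summed score at theta for a list of (a, thresholds) items."""
    return sum(expected_item_score(theta, a, bs) for a, bs in items)


def theta_for_total(target, items, lo=-6.0, hi=6.0, tol=1e-8):
    """Invert the monotone expected-score curve by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_total(mid, items) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical parameters on a common metric (illustration only):
# two 5-option "RPQ-like" items and two 7-option "SCAT-like" items.
rpq_like = [(1.8, [-0.5, 0.3, 1.1, 2.0]), (2.2, [0.0, 0.8, 1.5, 2.4])]
scat_like = [
    (2.0, [-0.8, -0.2, 0.4, 1.0, 1.6, 2.3]),
    (1.5, [-0.3, 0.4, 1.0, 1.7, 2.2, 3.0]),
]


def link_rpq_to_scat(rpq_total):
    theta = theta_for_total(rpq_total, rpq_like)
    return expected_total(theta, scat_like)


for raw in (2, 4, 6):
    print(raw, round(link_rpq_to_scat(raw), 1))
```

The mapping is only meaningful because both sets of parameters are assumed to sit on the same metric, which is what the fixed-parameter calibration is designed to achieve.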
Results
Demographic information for the civilian and sport samples is presented in Table 1.
Note. mTBI, mild traumatic brain injury.
a Yes and suspected categories collapsed; no and unknown categories collapsed.
Overall, the sample was predominantly male (83.8%) and white (74.7%), and ranged widely in age (14–90, M [SD] = 30.2 [19.2]). As a measure of population invariance, the RMSD was calculated for head-injured (combining civilian and athlete mTBI) versus control groups (combining orthopedic injury and non-injured controls). The RMSD value was 8%, suggesting population invariance with respect to injury status.
An overview of item content (see Supplemental Tables) revealed significant overlap, including many identical or nearly identical items spanning somatic (e.g., headaches, photosensitivity, phonosensitivity), cognitive (e.g., concentration and memory difficulties), and emotional complaints (e.g., irritability, low mood). The results of several analyses supported sufficient unidimensionality to treat the RPQ + SCAT combined item set as measuring a single construct. First, internal consistency reliability (coefficient alpha) was high (.94) for the RPQ + SCAT combined item set. Second, the first:second eigenvalue ratio from an EFA of the combined item set was consistent with sufficient unidimensionality (eigenvalues 1–4 were 24.46, 2.65, 1.65, and 1.38, respectively). The one-factor EFA and CFA demonstrated good fit (χ²[665] = 2,441, p < .001, CFI = .94, TLI = .93, RMSEA = .082). The RPQ-SCAT bifactor CFA and symptom subgroup bifactor CFA offered modestly improved fit over the unidimensional model (RPQ-SCAT bifactor model χ²[627] = 2,055.15, p < .001, CFI = .95, TLI = .94, RMSEA = .076; symptom subgroup bifactor model χ²[638] = 1,399.98, p < .001, CFI = .97, TLI = .97, RMSEA = .054). However, omega values for the bifactor models indicated that the general factor explains the vast majority of variance in total scores for the combined instrument (99% and 95% for the RPQ-SCAT and symptom subgroup bifactor models, respectively). These results corroborate our previous work that found sufficient unidimensionality of the construct as indexed by these inventories (see Agtarap et al., 2021 [RPQ]; Brett et al., 2020 [SCAT]; and Nelson et al., 2018 [SCAT]), providing further evidence that the RPQ + SCAT combined instrument meets criteria for sufficient unidimensionality.
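The reported values can be checked against the criteria stated in the Method with simple arithmetic, sketched below. The share of variance is an added illustration that assumes the eigenvalues come from a 38 × 38 correlation matrix, so that they sum to 38; the variable names are hypothetical.

```python
# First four eigenvalues and one-factor fit indices as reported in the text.
eigenvalues = [24.46, 2.65, 1.65, 1.38]
n_items = 38  # 16 RPQ items + 22 SCAT items

ratio = eigenvalues[0] / eigenvalues[1]
# Assumption: eigenvalues of a correlation matrix sum to the number of items.
share_first = eigenvalues[0] / n_items

fit = {"CFI": 0.94, "TLI": 0.93, "RMSEA": 0.082}
meets_criteria = fit["CFI"] > 0.90 and fit["TLI"] > 0.90 and fit["RMSEA"] < 0.10

print(round(ratio, 2), round(share_first, 2), meets_criteria)  # 9.23 0.64 True
```

The ratio of roughly 9.2 is well above the conventional threshold of 4.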
Table 2 presents the cross-walk table of SCAT and RPQ scores derived from the fixed-parameter calibration. Figure 1 illustrates how the model enables linking the SCAT and RPQ according to where their scores fall on the latent continuum of overall symptom severity. Item parameters for the IRT model of the combined SCAT + RPQ item set can be found in the Supplement.
Note. Scores linked through the latent symptom severity dimension (theta) using item response theory. The SCAT score that most closely corresponds with the RPQ score is presented first. Because the SCAT encompasses a larger range than the RPQ, the SCAT scores which correspond to one RPQ score are then presented in parentheses as a range.
Figure 2 depicts a scatterplot showing the relationship between SCAT scores estimated by the RPQ and the sample’s observed SCAT scores. Overall, observed SCAT scores and linked (from RPQ) SCAT scores were highly correlated (r = .92, p < .001), which supports the use of the crosswalk table. The mean difference between linked and observed SCAT scores was -0.74 with a standard deviation of 10.5.
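For readers who wish to run the same agreement checks on their own data, the following sketch computes the Pearson correlation between observed and linked scores along with the mean and standard deviation of their differences. The scores are hypothetical and the function name is illustrative; they do not reproduce the values reported above.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical observed and linked SCAT scores (illustration only).
observed = [5, 12, 20, 33, 47, 60]
linked = [7, 10, 24, 30, 45, 63]

# Linked minus observed, matching the direction of the difference in the text.
diffs = [l - o for l, o in zip(linked, observed)]

print(round(pearson_r(observed, linked), 3))
print(round(mean(diffs), 2), round(stdev(diffs), 2))
```

A mean difference near zero indicates little systematic over- or under-prediction, while the standard deviation of the differences conveys how far individual linked scores may fall from observed scores.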
Lastly, Figure 3 shows the relationship between IRT scale scores (interpreted on a Z-score metric with a mean of 0 and standard deviation of 1) and standard errors for the linked versions of the SCAT and RPQ, respectively, for each subgroup. For both measures, civilian mTBI and sport-related concussion patients show the widest range of scores (x-axis); non-injured athlete controls show the narrowest range, with most scores falling at or below average. Standard errors tend to be slightly higher for civilian mTBI patients and orthopedic controls, particularly for RPQ scores linked from the SCAT.
Discussion
The RPQ and SCAT are among the most widely used TBI symptom inventories in the civilian and sport mTBI populations, respectively. We demonstrated that the RPQ and SCAT measure the same construct and that their total scores can be linked through latent variable modeling techniques. The cross-walk table offered in this manuscript enables users of either inventory to convert scores from one scale to the other and thereby compare the myriad published findings reported from the two scales or combine datasets that used either RPQ or SCAT. There was a strong (r = .92) correlation between observed and linked SCAT scores, supporting the use of the IRT-based crosswalk table for obtaining scores on each measure. Future researchers can utilize these results to create larger and more diverse samples from existing datasets and harmonize future research on sport-related and civilian TBI, as well as allow for direct clinical comparison across groups.
Previous research reveals that civilians experience a high base rate of mTBI-like symptoms in the absence of injury (Iverson & Lange, 2003). Although the SCAT and RPQ have very similar item content, they are distinguished by different rating scales that theoretically make the RPQ more accurate for measuring mTBI-related symptoms in persons with more pre-injury symptoms. In particular, the SCAT solicits ratings of current symptom severity (presumably reflecting both pre-injury and injury-related symptoms), while the RPQ solicits ratings of injury-related symptoms. It is perhaps not surprising that standard errors of RPQ IRT scores linked from the SCAT appeared somewhat higher in the civilian subgroups within our sample, whereas SCAT IRT scores linked from the RPQ showed less variability in precision (standard error) across groups. Practically, this finding supports the preferable practice of using RPQ scores to predict SCAT scores rather than the reverse, particularly when pre-injury symptoms are expected to be prevalent. However, overall the differences in measurement precision across the groups were minimal. Regarding outcomes, a clinician working with an athlete who completed the SCAT can consider the client’s score relative to the findings from the SCAT literature, but also now relative to the RPQ civilian literature. Taken together, this research has further clinical implications, as it provides strong evidence of construct congruence and allows for the direct comparison of mTBI symptoms across athlete, civilian, and control groups who completed a combination of these two symptom inventories.
A strength of the current study was our use of IRT, which linked RPQ and SCAT total scores based on their relationship to the latent dimension (TBI-related symptom burden) that drives scores on both measures. Other methods of linking, such as linear equating, assume scores are linearly related and reflect the underlying construct without measurement error. These assumptions may lead to biased mapping. Accordingly, IRT linking methods have demonstrated greater reliability and precision than other methods of linking, such as Deming regression (e.g., see Kaat et al., 2019) and equipercentile linking (e.g., see Choi et al., 2014). While several IRT-based linking methods are available, we chose fixed-parameter calibration, a method made possible by the fact that many participants in the current sample completed both the RPQ and SCAT (Dorans, 2007). To our knowledge, one other study has linked the RPQ and the SCAT. Langer et al. (2021) used linear equating methods to create an equation that allows for summed score conversion between the RPQ and SCAT. Linear equating may be biased by large score distribution differences (Muraki et al., 2000), such as the distribution difference seen between the RPQ and SCAT. Therefore, our IRT analysis may offer a more accurate conversion. Furthermore, the current study provides a simple-to-use score conversion table, requiring no further computations for researchers or clinicians.
Some limitations of the current study must be addressed. One potential limitation is a relatively modest sample size for IRT analysis, which may have reduced the accuracy of parameter estimates and thus affected the linking table. On balance, however, this sample is one of the largest of its type, making it perhaps one of the best available from which to conduct a linking analysis. Second, although our population invariance analysis supported the combination of injured and non-injured groups, invariance among other subgroups cannot be confirmed in this sample. Future studies using even larger datasets will need to investigate this issue. Furthermore, our sample may not be perfectly representative of the entire TBI population, although it was more diverse than many TBI samples (e.g., those restricted to sport-related mTBI). While our predominantly white (75%), male (84%) sample may limit the generalizability of this research to other groups, there is no strong evidence to indicate that IRT parameters would be biased by sample demographics (e.g., we have previously reported strict measurement invariance of the RPQ factor structure across age, gender, race, and other groups; Agtarap et al., 2021). Lastly, as is required for this linking method, rounding decisions were made to produce the crosswalk table. However, our sensitivity analysis utilizing the RPQ as the anchor (instead of the reported results using the SCAT as the anchor) produced a nearly identical crosswalk table, indicating minimal error introduced by such methodological decisions.
The main purpose of this analysis was to link the total scores of the RPQ and SCAT. We focused on the total scores due to their clinical and research relevance, as total scores are most defensible given the psychometric properties of these two instruments (e.g., see Agtarap et al., 2020; and Nelson et al., 2018) and are the most frequently used in both clinical and research settings. Given the similarity in item content across the two scales, future research could consider linking scores from comparable domains evaluated across both inventories.
In conclusion, we used fixed-parameter IRT calibration in order to produce a cross-walk table linking the summed scores on the RPQ and SCAT. This research provides the opportunity for future researchers to compare findings across the published studies that used these popular mTBI symptom checklists and combine datasets spanning populations to further our understanding of TBI symptom overlap across subpopulations.
Funding and Disclosure Statement
This secondary data analysis project was funded by the National Institute of Neurological Disorders and Stroke grant # R01 NS110856. The original studies were supported by the Defense Health Program under the Department of Defense Broad Agency Announcement for Extramural Medical Research through Award No. W81XWH-14-1-0561, and the Research and Education Program Fund, a component of the Advancing a Healthier Wisconsin (AHW) endowment of the Medical College of Wisconsin. The study REDCap databases were supported by the National Center for Advancing Translational Sciences, National Institutes of Health (NIH), through Grant Numbers 8UL1TR000055 and 1UL1-RR031973 (-01). Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the NIH, Department of Defense, or AHW. The authors report no disclosures relevant to the manuscript.