Introduction
Palliative care is a holistic model of care that focuses on the alleviation of suffering and improving quality of life by meeting the physical, social, psychological, cultural, and spiritual needs of people with chronic and life-limiting illnesses and their families (World Health Organization 2015). Palliative care involves multiple aspects including communication, end-of-life (EOL) decisions, care before and after death, information about social services, and cultural competency (Cha et al. 2020; Hahn and Cadogan 2011). Self-efficacy is defined as “people’s judgments of their capabilities to organize and execute courses of action required to attain designated types of performances” (Bandura 1986, 391). Since self-efficacy is a domain-specific concept, palliative care self-efficacy refers to an individual’s belief in his or her own ability to be competent in providing palliative care (Adriaansen et al. 2005).
Individuals with intellectual and developmental disability (IDD) demonstrate limitations in cognitive functioning and practical life functioning skills (including activities of daily living, occupational skills, money management, and social skills) relatively early in life (American Association of Intellectual and Developmental Disabilities 2023). As individuals with IDD are living longer (Coppus 2013; World Health Organization and The World Bank 2011), more are experiencing chronic health conditions, and the potential need for palliative care services increases (García-Domínguez et al. 2020; Kinnear et al. 2018).
Staff serving people with IDD have demonstrated gaps in knowledge and efficacy in providing palliative care (Cartlidge and Read 2010; Fahey-McCarthy et al. 2009; Ng and Li 2003; Ryan et al. 2010). There is a need for staff to develop a wide range of skill sets, encompassing communication, recognizing spiritual, social, and emotional needs and cultural competence, understanding legal issues, and bereavement care (Friedman et al. 2008; Lord et al. 2017; Reynolds et al. 2008; Stein 2008; Watters et al. 2012; Wiese et al. 2014). To our knowledge, no instruments exist that measure IDD staff’s self-efficacy in provision of palliative care. A variety of instruments have been used to measure the self-efficacy and knowledge of health professionals providing palliative care (Frey et al. 2011; Friesen and Andersen 2019; Karacsony et al. 2015). These include instruments for physicians (Fox 2007; Nakazawa et al. 2009), medical students and residents (Billings et al. 2009; Buss et al. 2005; Mulder et al. 2009), nurses (Desbiens and Fillon 2011; Dobie et al. 2016; Edwards et al. 2020; Lange et al. 2011; Moura Minosso et al. 2017; Nakazawa et al. 2009; Phillips et al. 2011; Shipman et al. 2008; Wilkinson et al. 2008), care assistants (Dryden and Addicott 2009; Fox 2007; Jenkins et al. 2010; Phillips et al. 2011; Resnick 2008), and other health professionals (Jenkins et al. 2010; Lange et al. 2011).
Palliative care self-efficacy instrument for intellectual and developmental disability staff
We created a palliative care self-efficacy instrument to assess the level of confidence of IDD staff regarding palliative care provision. We call it the Palliative Care Self-Efficacy Instrument for Intellectual and Developmental Disability Staff (PCSE-IDD). The instrument was used in evaluating the effect of an online training program on palliative care self-efficacy of IDD staff but can be used separately as well to gauge workers’ palliative care self-efficacy cross-sectionally. The online palliative care training developed by the authors was delivered using Internet-connected tablets. The 2-hour training included modules on the overview of palliative care, legal and ethical issues, cultural diversity and competency, communication with people with IDD, symptom management, EOL care and logistics after death, bereavement and grief of people with IDD, and staff grief and coping strategies (Kim and Gray 2021). Training content and the palliative care self-efficacy instrument items were based on literature from the IDD and non-IDD fields (Adriaansen et al. 2005; Bekkema et al. 2015; Desbiens and Fillon 2011; Fahey-McCarthy et al. 2009; Fox 2007; Hahn and Cadogan 2011; Kirkendall and Waldrop 2013; McCarron et al. 2010; McEvoy et al. 2012; Nakazawa et al. 2009; Ng and Li 2003; Phillips et al. 2011; Ryan et al. 2011a, 2011b, 2010; Stein 2008; Tuffrey-Wijne et al. 2007; Wark et al. 2014; Wittenberg-Lyles et al. 2014) as well as the results of need assessments with IDD staff (Gray and Kim 2020; Kim and Gray 2018). The effect of the training was evaluated using a one-group pretest–posttest design (Kim and Gray 2021).
Palliative care self-efficacy was measured using the question “How confident are you regarding the following?” for 11 items, each with 5 Likert-style response categories: not at all (1), slightly (2), somewhat (3), moderately (4), and a lot (5). The self-efficacy questions reflect the subjective nature of the concept, which relies on the perception of the participants. The 11 items addressed responding to ethical problems, advance directives, communicating with people with different backgrounds, delivering bad news to people with IDD, recognizing pain, nonmedical pain management, patient care before death, post-death tasks, identifying grieving behavior of people with IDD, helping people with IDD recover after a loss, and managing one’s own grief (Table 1).
Rasch modeling
The Rasch model is a psychometric model that shows what should be expected in response to test or questionnaire items if the outcome measure from the test or questionnaire is an interval measure (Tennant and Conaghan 2007). The Rasch model incorporates a method for ordering persons (e.g., from a sample of IDD staff) according to their ability (e.g., where they stand in terms of palliative care self-efficacy) and ordering items according to their difficulty (e.g., which level of palliative care self-efficacy each item represents) (Bond et al. 2021, 11).
Rasch modeling supplements classical test theory and overcomes limitations of classical test theory approaches (Tennant and Conaghan 2007). First, using Rasch models, individual persons and items can be examined, while classical test theory focuses on the summary of items (e.g., sum or average of response scores) (Anselmi et al. 2015). For example, when a paired t-test or effect size statistics are used to measure change due to an intervention, only the overall change of the sample can be assessed. Information at the individual level, such as who responded to the intervention better, is not available in classical test theory (Anselmi et al. 2015). Second, Rasch analysis enables the transformation of a raw ordinal score into a linear, interval-level variable if the ordinal data fit Rasch model expectations (Chang et al. 2014). It is problematic to treat raw scores and values assigned to ordinal response categories as interval data without validating that the measure is interval (Grimby et al. 2012). The linear transformation in the Rasch model allows for valid use of mathematical operations and parametric analysis and provides a truer depiction of a trait (Chang et al. 2014). Third, in measuring change over time, Rasch models ensure the invariance of the instrument across time points (Bond et al. 2021, 203). Additionally, Rasch models provide a variety of psychometric tools and information that enable examination of key properties of composite measures including dimensionality, rating scale structure, differential item functioning (DIF), person and item separation and reliability, and targeting (Tennant and Conaghan 2007).
Rasch models to measure change
Rasch models provide solutions to challenges in measuring change. The primary interest of evaluation research is to determine whether an intervention improved persons’ ability. Persons are expected to show changes between Time 1 and Time 2 because of an intervention. The challenge is how to measure persons and items in a common frame of reference encompassing different time points so that the measurement of change has an unambiguous numerical representation and a substantive meaning (Anselmi et al. 2015). This can be accomplished by estimating a Rasch model on stacked data. Stacked analysis produces a set of item measures that are consistent at both time points and a person measure for each individual at each time point (Anselmi et al. 2015). Data are stacked by appending the postintervention observations below the baseline (preintervention) observations (Figure 1). As a result, the data file contains twice as many cases as there are persons being measured. In addition to measuring persons and items in a common frame of reference encompassing 2 time points, stacked analysis allows for tracking differential impacts of an intervention on persons (Wright 2003).
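The stacking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`id`, `t1`, `t2`), the function name `stack`, and the response values are hypothetical and are not taken from the study data or from the WINSTEPS input format.

```python
# Hypothetical wide-format records: one row per person, holding the 11 item
# responses at baseline (t1) and at follow-up (t2). Values are made up.
wide = [
    {"id": 1, "t1": [3, 4, 2, 3, 4, 3, 2, 3, 3, 4, 3],
              "t2": [4, 4, 3, 4, 5, 4, 3, 4, 4, 5, 4]},
    {"id": 2, "t1": [2, 3, 2, 2, 3, 3, 2, 2, 3, 3, 2],
              "t2": [3, 3, 2, 3, 4, 3, 3, 3, 3, 4, 3]},
]

def stack(records):
    """Append the Time 2 rows below the Time 1 rows.

    Each person contributes two cases, so the stacked file has twice as
    many cases as persons, and all cases share one set of item columns.
    """
    baseline = [{"id": r["id"], "time": 1, "items": list(r["t1"])} for r in records]
    followup = [{"id": r["id"], "time": 2, "items": list(r["t2"])} for r in records]
    return baseline + followup

stacked = stack(wide)
for case in stacked:
    print(case["id"], case["time"], case["items"])
```

Because every case is scored against the same item columns, a single calibration of the stacked file yields one set of item measures and two person measures per individual.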
This study aimed to evaluate the psychometric properties of a newly created palliative care self-efficacy instrument developed for staff who care for people with IDD using Rasch analysis and assess the change in palliative care self-efficacy between 2 time points using Rasch analysis of stacked data.
Methods
Participants and procedures
Participants were recruited from 4 nonprofit community-based organizations serving people with IDD in a US Midwestern state. Informed consent was obtained before conducting the training and associated evaluations. The university institutional review board reviewed the research protocol, including the consent process (ORC #HS17-0189). The majority of the participants were female (89%) and White (58%) or African American (35%). Approximately 37% were in rural areas. On average, participants were 39 years old (range 20–67) and had worked in the field for 10 years (range 1–30).
Before taking the training, participants answered questions about palliative care knowledge and palliative care self-efficacy as well as demographic and work-related information. Participants took a posttest about palliative care knowledge upon completing the training. One month after the training, participants were invited via email to complete an online follow-up survey that included items on palliative care self-efficacy. Palliative care self-efficacy was measured 1 month post training, rather than immediately upon completing the training, to give participants time to apply what they had learned to their work. Among 132 training participants, 98 answered questions on palliative care self-efficacy at both baseline and 1-month follow-up. Participants who completed both the baseline and follow-up surveys (n = 98) had a higher education level than those who completed the baseline survey only (n = 34) (p < 0.005, Fisher’s exact test), although the 2 groups were similar in other characteristics.
Analysis
This study used baseline (n = 98) and follow-up data (n = 98) on palliative care self-efficacy stacked together. The sample size of 98 was sufficient for Rasch modeling. For stable results, the sample size needs to be at least 6 times the number of items (Mundfrom et al. 2005), which gives a minimum of 66 persons for our 11-item instrument.
Data were analyzed using WINSTEPS computer software version 4.8.0.0 (Linacre 2021b). The rating scale model was used because all items had the same number of response categories and category values (Bond et al. 2021, 97). Rasch analysis was performed to examine rating scale structure, unidimensionality, local independence, overall model fit, person and item reliability and separation, targeting, individual item and person fit, DIF, and change in palliative care self-efficacy between 2 time points.
Before the main analysis, local dependency across time points was investigated to determine whether use of stacked analysis was appropriate. If the person measures estimated in the stacked analysis with anchors are not significantly different from those estimated in the stacked analysis without anchors, the effect of local dependency is considered negligible (Anselmi et al. 2015). In the stacked analysis with anchors, the data from Time 2 were analyzed first to obtain Rasch measures. Then, the data from Time 1 were analyzed by anchoring the item measures to the estimates from the analysis with Time 2 data. The t statistic comparing the person measures obtained in the stacked analysis with and without anchors (df = 2k – 2 = 20, where k = 11 is the number of items) was not significant for any person, so there was no evidence of significant local dependency. When there is no evidence of significant local dependency, measures from either approach (i.e., anchored or nonanchored) can be used (Anselmi et al. 2015). We used the stacked analysis without anchors in this study.
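As a rough illustration of the person-level comparison described above, the sketch below computes a standardized difference between two estimates of the same person's measure. The text does not give the exact formula used, so the statistic here (difference divided by the root sum of squared standard errors) is an assumption, as are the function name `anchor_t` and the example measures and standard errors. The critical value 2.086 is the standard two-tailed 5% point of the t distribution with 20 degrees of freedom.

```python
import math

K_ITEMS = 11
DF = 2 * K_ITEMS - 2      # df = 2k - 2 = 20, as stated in the text
T_CRIT_DF20 = 2.086       # two-tailed critical t, alpha = 0.05, df = 20

def anchor_t(measure_anchored, se_anchored, measure_unanchored, se_unanchored):
    """Standardized difference between two estimates of one person's measure.

    Assumed form: (m1 - m2) / sqrt(se1^2 + se2^2). The study's exact
    statistic may differ; this is a common way to compare two measures.
    """
    pooled_se = math.sqrt(se_anchored ** 2 + se_unanchored ** 2)
    return (measure_anchored - measure_unanchored) / pooled_se

# Hypothetical person: measures (logits) and standard errors are made up.
t_value = anchor_t(0.52, 0.45, 0.47, 0.44)
is_significant = abs(t_value) > T_CRIT_DF20
print(DF, round(t_value, 3), is_significant)
```

Under this sketch, local dependency would be treated as negligible when `is_significant` is false for every person.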
Results
Rating scale structure
Analysis of the instrument with the original 5 response categories indicated that the response category structure was not optimal. The instrument did not meet the criterion of having at least 10 observations per response category (Linacre 2002). There were fewer than 10 observations per response category in 10 of 11 items, specifically in category 1 or 2. As presented in the left section of Table 2, the spacing between the Andrich threshold of categories 1 and 2 (first threshold) and that of categories 2 and 3 (second threshold) was only 0.45 logits, which was less than the minimum of 1.40 logits (Bond et al. 2021, 226). This indicated that categories 2 and 3 were too similar or close to be separate categories. The observed average measures and thresholds were in linear order, and fit statistics were within the acceptable range between 0.60 and 1.40.
MNSQ, mean square residual statistic.
Visual examination suggested a similar picture. Figure 2 displays the category characteristic curves for the instrument with 5 response categories (Figure 2a) and 3 response categories (Figure 2b). The curves represent the probability of respondents selecting any particular category for the difference between any person ability and item difficulty estimates (Bond et al. 2021, 226). The x axis represents person ability levels relative to the item’s difficulty in logits (Bond et al. 2021, 112). The y axis represents the expected probability of selecting any given category. For the original 5 response categories (Figure 2a), the curves for category 2 and category 3 did not display a distinct peak. The highest probability of category 2 and category 3 was less than 0.5, which is the minimum probability peak value for a category to be considered functional (Bond et al. 2021, 112). This indicated that 2 adjacent categories (category 2 and category 3) represented too narrow intervals on the latent trait, and one of the response categories was rarely chosen by respondents (Linacre 2002). These results together pointed to the need to collapse categories 1 to 3 into one category.
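To make the link between threshold spacing and curve peaks concrete, the sketch below evaluates category probabilities under the Andrich rating scale model, in which the probability of category x is proportional to the exponential of the sum of (θ − δ − τ_k) over k = 1..x. The function name and the threshold values are hypothetical and are not the calibrations reported in Table 2; categories are indexed from 0 in the code, whereas the instrument numbers them from 1.

```python
import math

def rsm_category_probs(theta, delta, thresholds):
    """Category probabilities under the Andrich rating scale model.

    theta: person measure (logits); delta: item difficulty (logits);
    thresholds: Andrich thresholds tau_1..tau_m shared by all items.
    Returns probabilities for categories 0..m.
    """
    cumulative = [0.0]  # category 0 has an empty sum
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta - tau))
    shift = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - shift) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical 3-category scales evaluated where theta equals delta, which is
# where the middle category peaks when the two thresholds are symmetric.
well_spaced = rsm_category_probs(0.0, 0.0, [-1.5, 1.5])    # spacing 3.0 logits
too_close = rsm_category_probs(0.0, 0.0, [-0.2, 0.2])      # spacing 0.4 logits
print([round(p, 3) for p in well_spaced])
print([round(p, 3) for p in too_close])
```

With widely spaced thresholds the middle category's peak exceeds 0.5, while with closely spaced thresholds it stays below 0.5, which mirrors the pattern described for categories 2 and 3 of the original scale.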
The revised instrument with 3 response categories (minimally, moderately, and a lot) showed improvement over the original instrument. All 3 criteria for rating scale functioning (Linacre 2002) were met: (1) there were more than 10 observations per rating category, (2) measures increased linearly with each category, and (3) mean square residual statistics were between 0.60 and 1.40. The right section of Table 2 presents the structure calibrations for the revised instrument with 3 response categories. Both observed average measures and thresholds were in linear order, and the infit and outfit mean square residual statistics were between 0.60 and 1.40. The spacing between thresholds was greater than 1.40 logits. Figure 2b shows the alternative in which original response categories 1, 2, and 3 are collapsed into one, resulting in 3 response categories. The category characteristic curves (Figure 2b) look like a range of hills with visible peaks for each response category, which is expected in a scale with ordered thresholds and adequate spacing between thresholds (Bond et al. 2021, 226; Linacre 2021a, 375). Since the instrument with 3 response categories showed an optimal response category structure, the results from this point on refer to the instrument with 3 response categories.
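The recoding from 5 to 3 response categories can be expressed as a simple lookup. The sketch below assumes the mapping implied by the text (original categories 1 to 3 become "minimally", 4 stays "moderately", 5 stays "a lot"); the names `COLLAPSE`, `collapse`, and `category_counts` and the sample responses are hypothetical.

```python
from collections import Counter

# Mapping implied by the text: original categories 1-3 collapse into one.
COLLAPSE = {1: 1, 2: 1, 3: 1, 4: 2, 5: 3}
LABELS = {1: "minimally", 2: "moderately", 3: "a lot"}

def collapse(responses):
    """Recode original 1-5 responses into the 3 revised categories."""
    return [COLLAPSE[r] for r in responses]

def category_counts(recoded):
    """Count observations per revised category (to check the >= 10 criterion)."""
    counts = Counter(recoded)
    return {LABELS[c]: counts.get(c, 0) for c in sorted(LABELS)}

# Hypothetical responses to one item from 12 cases.
original = [1, 2, 3, 3, 4, 4, 4, 5, 5, 2, 3, 4]
recoded = collapse(original)
print(recoded)
print(category_counts(recoded))
```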
Unidimensionality, local independence, and model fit
Principal components analysis of Rasch residuals indicated that the Rasch dimension explained 42.3% of the total variance with an eigenvalue of 8.06. The variance in the first contrast had an eigenvalue of 1.97, suggesting that the residuals were random noise, given it was less than 2.0 (Bond et al. 2021, 255; Linacre 2021a, 421; Raîche 2005). There was no meaningful pattern in the residual contrast plot either. The variance of the first contrast accounted for 10.3% of the total variance, which was much smaller than the variance explained by item difficulties (19.7%). Standardized residual correlations between item pairs were small, ranging from 0.18 to 0.41, indicating negligible local dependency (Linacre 2021a, 436). These results supported the assumptions of unidimensionality and local independence. Model fit statistics indicated an excellent fit of the data to the Rasch model. Infit and outfit mean square residuals were 1.00 and 1.01, with a small model standard error of 0.13. Standardized Z scores were −0.10 and 0.00 for infit and outfit, respectively.
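The reported percentages can be reproduced from the eigenvalues with a short calculation. The sketch assumes the convention that the unexplained (residual) variance in eigenvalue units equals the number of items, here 11, so that total variance is the Rasch eigenvalue plus 11; that convention is an assumption about how the output was scaled, not something stated in the text.

```python
N_ITEMS = 11                     # assumed: residual variance in eigenvalue units
RASCH_EIGENVALUE = 8.06          # variance explained by the Rasch dimension
FIRST_CONTRAST_EIGENVALUE = 1.97

total_variance = RASCH_EIGENVALUE + N_ITEMS
pct_rasch = 100 * RASCH_EIGENVALUE / total_variance
pct_first_contrast = 100 * FIRST_CONTRAST_EIGENVALUE / total_variance

print(round(pct_rasch, 1))           # share explained by the Rasch dimension
print(round(pct_first_contrast, 1))  # share attributable to the first contrast
```

Under this assumption the two shares round to 42.3% and 10.3%, matching the values reported above.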
Person and item reliability and separation
The instrument demonstrated good person and item reliability and separation. The Rasch person reliability (equivalent to Cronbach’s alpha) was 0.83, and the item reliability statistic 0.95 indicated a wide range of item difficulties (Bond et al. 2021, 41). Person and item separation indices suggested adequate spread of persons and items across the trait continuum. The person separation index was 2.17, and the item separation index was 4.49. The number of statistically significant strata was 2.23 for persons and 6.32 for items, indicating that persons in the sample could be separated into approximately 2 distinct groups, and items in the instrument into approximately 6 distinct groups of trait levels (Wright and Masters 1982, 92).
Targeting
The mean person location was zero logits, indicating a well-targeted measure in terms of mean (Tennant and Conaghan 2007). The person-item threshold map, however, suggested ceiling and floor effects. Figure 3 is the person-item threshold map that illustrates the distribution of person locations and items’ Andrich threshold locations. The upper chart in blue shows the distribution of person locations along the construct (x axis). The lower chart in red shows Andrich thresholds of the items along the same x axis. Though person locations extended from −5.0 to 5.0 logits, there were gaps at the top and bottom of the construct hierarchy in terms of item threshold locations. The ceiling effect was more prominent than the floor effect as the gap between the highest person location (the positive end of the x axis) and the highest threshold location was larger than the gap between the lowest person location (the negative end of the x axis) and the lowest threshold location.
Item and person fit
All individual items and most persons showed good fit. Table 3 provides information on item locations in misfit order. Infit and outfit mean square residual statistics of all 11 items were within fit criteria (0.60 < MNSQ < 1.40) (Wright and Linacre 1994). For person fit, 5 and 4 out of 98 persons had infit and/or outfit mean square residual statistics greater than 2 at Time 1 and Time 2, respectively (results not shown). Mean square residual statistics greater than 2 for a particular individual may indicate that the person is not a typical member of the population or that the person answered the questionnaire inaccurately (Anselmi et al. 2015; Wright and Linacre 1994). Summary fit statistics of 184 non-extreme persons were in acceptable ranges (infit MNSQ 1.00 and outfit MNSQ 1.01) (results not shown).
a Item location in logits.
MNSQ, mean square residual statistic; ZSTD, standardized Z-score.
Differential item functioning
One item (managing your own experience of grief) had DIF contrast 0.93 (p < 0.05) between genders. It suggested that women tended to endorse 0.93 logit higher on the item compared to men, when their overall palliative care self-efficacy ability levels were taken into account. Three items (communicating with people with different background, recognizing different types of pain, and reducing pain with nonmedical techniques) had DIF contrast 0.84 (p < 0.05), 0.54 (p < 0.05), and 0.73 (p < 0.005), respectively, between tenure groups. It indicated that IDD staff who worked longer in the field tended to endorse higher on the respective items compared to those who worked shorter in the field when their overall palliative care self-efficacy ability levels were considered.
Change over time
There was a significant improvement in palliative care self-efficacy among 98 training participants overall. The mean Rasch measure increased from −0.40 to 0.39 logits between Time 1 and Time 2 (p < 0.005 in a t-test). The effect size was between medium and large indicated by Cohen’s d 6.15 (Maher et al. 2013). The scatterplot comparing item difficulties between 2 time points indicated 1 item far from the identity line x = y (Figure 4). Item 3 was significantly more difficult at Time 2 (−1.09) than at Time 1 (−1.89), as item 3 lies above the solid line signifying the upper limit of a 95% confidence interval. According to the scatterplot comparing person abilities between 2 time points (indicated by id numbers), 12 out of 98 individuals demonstrated a significant improvement in palliative care self-efficacy, while palliative care self-efficacy of 9 individuals decreased (Figure 5). A moderate correlation between the measures at 2 time points (r = 0.64) indicated variation in the intervention’s effect on participants. Individuals above the solid curved line over the identity line (the upper limit of a 95% confidence interval) are those who showed a significant improvement, while the individuals below the solid curved line under the identity line (the lower limit of a 95% confidence interval) are those whose palliative care self-efficacy decreased between Time 1 and Time 2.
The solid curved lines next to the dotted line indicate 95% confidence intervals.
Discussion
This study examined the psychometric properties of a palliative care self-efficacy instrument developed for staff who care for people with IDD and the change in palliative care self-efficacy between 2 time points using Rasch analysis. The analysis provided detailed information on how well the instrument items could measure palliative care self-efficacy, the positive and less desirable elements of the instrument, and the change in palliative care self-efficacy in terms of items and persons.
The analysis demonstrated that overall, the instrument performed adequately as a measure of palliative care self-efficacy of IDD staff. Principal components analysis of the Rasch residuals indicated unidimensionality of the instrument, and standardized residual correlations of item pairs demonstrated negligible local dependency. Overall fit statistics indicated an excellent fit of the data. Person and item reliability was good, and person and item separation indices suggested adequate spread of persons and items across the trait continuum. All individual items and most persons in the sample demonstrated good fit. Fit statistics of all 11 items were within the adequate fit range. Approximately 5% of the sample had person mean square residual statistics greater than 2, indicating that they answered the questionnaire inaccurately or might not be a typical member of the population (Anselmi et al. 2015; Wright and Linacre 1994).
The results, however, indicated issues in rating scale structure, targeting, and DIF. Response patterns to the original 5 response categories were not optimal. Although the observed average measures and thresholds were in linear order and fit statistics were within the acceptable range, some response categories had fewer than 10 observations and the distance between 2 thresholds was less than the acceptable range. Once the response categories were collapsed to 3 categories, the instrument became effective as a rating scale measure.
The ceiling and floor effects indicate the need for improvement. Adding both more difficult and less difficult items would allow the instrument to cover the whole range of person abilities shown in this sample. More specific items regarding “reducing pain with nonmedical techniques,” “caring for patients before they die,” and “what to do when a client dies” may fill the gap at the top of the construct hierarchy, and more specific items regarding “communicating with people with different background” and “recognizing different kinds of pain” may fill the gap at the bottom of the construct hierarchy.
Gender-related DIF was found in 1 item and work tenure–related DIF was found in 3 items. Women outperforming men with a comparable level of palliative care self-efficacy in managing their own experience of grief may be related to gender differences in social support and coping styles, which may affect how men and women approach grief and bereavement (Stroebe et al. 2001). IDD staff who worked longer in the field may have outperformed those with a comparable level of palliative care self-efficacy but with shorter tenure in communicating with people with different background, recognizing different types of pain, and reducing pain with nonmedical techniques, as the former group has gained more work experience regarding these topics. The DIF found in this sample, however informative, should not determine whether the items should be modified or eliminated from the instrument at this point (Du and Yates 1995). Since a significance test with 1 sample cannot differentiate between real DIF and an accident, whether the items continue to exhibit DIF in the same way should be investigated with additional samples (Du and Yates 1995), as is standard statistical analysis protocol.
By estimating a Rasch model on stacked data, persons and items were measured in a common frame of reference encompassing different time points. The stacked analysis provided the information not only on the overall change in palliative care self-efficacy but also on the changes in individual item difficulties and person abilities between the 2 time points. The fact that item 3 was significantly more difficult at Time 2 than at Time 1 may suggest the unexpected impact of the training. The participants may have overestimated their ability to communicate with people from different backgrounds at baseline, then reevaluated it after learning more about communication and cultural diversity in the training. There was variation in the effect of the intervention on different participants, among whom palliative care self-efficacy improved, stayed the same, and worsened through the training.
A few limitations of this study must be considered. Participants were recruited from nonprofit IDD service agencies in a US Midwestern state, which prevents us from generalizing our results to all IDD staff in the US. Our sample, however, was diverse in terms of age, race, length of work experience, and rural–urban/suburban location. While a sample size of 98 participants at both time points was sufficient for Rasch analysis, a larger sample size would have been preferable.
Conclusions
Rasch analysis was a useful tool for measuring the psychometric properties of the PCSE-IDD. The use of Rasch analysis has many advantages over other analysis methods, including generalizability across samples and items, true transformation of ordinal to interval-level variables, distinguishing unusual respondents and items that are not working, ensuring the invariance of the instrument across time points, and enabling examination of dimensionality, rating scale structure, and DIF (Anselmi et al. 2015; Bond et al. 2021, 203; Tennant and Conaghan 2007). This instrument is unique in that it is targeted for IDD staff and it covers self-efficacy about palliative care provision for people with IDD as well as about self-care of IDD staff. The results of this study point to the need to improve the instrument by using 3 response categories (such as minimally, moderately, and a lot), adding more items of higher- and lower-level difficulties, and testing the modified instrument with other samples.
Acknowledgments
The authors thank study participants for their contributions.
Conflicts of interest
The authors declare none.