The majority of patients with mental health (MH) conditions are assessed and treated in primary care (Hunter et al., 2009; Petterson et al., 2014). With the emergence of integrated behavioral health (BH), brief screening measures are increasingly used to identify patients with MH problems and assist in interdisciplinary clinical decisions to improve patient care. Such screening tools typically focus on MH symptomatology associated with one disorder. Given that depression is the most prevalent MH condition, with the majority of depressed adults receiving treatment in primary care (Bland, 2004; Edlund et al., 2004), routine depression screening has been recommended by the US Preventive Services Task Force (US Preventive Services Task Force, 2002; 2009; Siu and US Preventive Services Task Force, 2016). The US Department of Health and Human Services, Health Resources and Services Administration has also recently required depression screening for patients seen in Federally Qualified Health Centers (FQHCs).
The first adult depression screen designed for primary care was the Patient Health Questionnaire-9 (PHQ-9) (Kroenke et al., 2001), which was followed by a briefer PHQ-2 (Kroenke et al., 2003). Both the PHQ-2 and PHQ-9 have demonstrated high sensitivity and specificity for detecting major depression in primary care (Arroll et al., 2010), and a two-step approach to primary care depression screening is often utilized to improve diagnostic accuracy and assess severity. However, in a retrospective chart review at a university hospital-based family medicine clinic with integrated behavioral health providers (BHPs), only 5% of 200 family medicine patients with a positive PHQ-2 score were administered a PHQ-9, with physicians reporting competing demands, time constraints, and prior knowledge of their patient’s depression status as reasons for not administering a follow-up PHQ-9 (Fuchs et al., 2015). Thus, despite the validity of the PHQ-9 and its wide use in research trials, real-world workflow demands challenge the extent to which it is being routinely used in primary care settings (Blasinsky et al., 2006).
Besides the feasibility concerns of a stepped-method screening approach, symptom-based screening tools like the PHQ-9 are potentially limited by their diagnostic focus. They tend not to identify the broad array of MH distress (eg, relational, social) that brings patients to primary care and that likely influences both emotional and physical health, nor can they detect common MH comorbidities. Thus, relying on a single traditional screening measure might miss a number of patients suffering from other MH symptoms or other life problems who could benefit from BH consultation. Although the use of multiple screening measures might solve the under-identification problem, comprehensive assessment is not practical given the workflow demands of primary care.
A brief global distress measure with a broad focus on life functioning may offer an alternative. The Outcome Rating Scale (ORS) (Miller and Duncan, 2000) is one of two measures comprising the Partners for Change Outcome Management System (PCOMS) (Duncan, 2012; Duncan and Reese, 2015). The ORS is an ultra-brief, validated, visual analogue, self-report measure of a patient’s perceived level of global distress and functioning across four life domains: individual, interpersonal, social, and overall. PCOMS was originally developed and researched as a feasible clinical and outcome system for specialty MH settings and is included in the Substance Abuse and Mental Health Services Administration’s (SAMHSA) National Registry of Evidence-Based Programs and Practices (NREPP). Although the ORS measures and tracks patient change in MH and substance abuse services, it has never been investigated as a primary care BH screener.
The principal aim of this preliminary study was to investigate whether a single measure of global distress in four life functioning domains could serve as a universal primary care screener. To this end, this exploratory study compared the ORS with the PHQ-9 and PHQ-2, evaluating their correlations, reliability coefficients, and the number of patients who screened positive for potential BH consultation. We hypothesized that the ORS, which takes a more comprehensive picture of functioning beyond symptoms, would classify a higher percentage of patients who may benefit from consultation.
Method
Setting
This study was conducted at three small rural family practice health centers, associated with Peak Vista Community Health Centers, a large FQHC in Colorado. Two integrated BHPs provide BH services to almost 4000 patients empaneled to this study’s three rural clinics.
Participants
Of the 3962 patients registered at the three rural health centers, ~90% were Caucasian, 8% Hispanic, and 46% of all patients were at or below 200% of the federal poverty level ($11 880 for individuals, $16 020 for a family of two, $24 300 for a family of four, etc.); 2879 patients were 18 years of age or older and formed the pool of potential participants for this study. A total of 426 adults (14.8% of this pool) completed the PHQ-9 and ORS on presentation to their medical providers. There were 297 women and 129 men with an average age of 46 (age range: 18–82 years, SD=14.78).
Measures
The PHQ-9 is a nine-item depression scale from the Primary Care Evaluation of Mental Disorders (PRIME-MD) diagnostic instrument for common mental disorders, frequently used in primary care (Kroenke et al., 2001). Internal reliability of the PHQ-9 is strong (α=0.89), and a recommended score of 10 or higher has an 88% sensitivity and 88% specificity for major depression (Kroenke and Spitzer, 2002). The PHQ-2 consists of the first two items of the PHQ-9; for our investigation, we used a PHQ-2 clinical cut-off score of 3 or greater, which has an 83% sensitivity and 90% specificity for major depression (Kroenke and Spitzer, 2002).
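The cut-off rules described above can be illustrated with a minimal sketch. It assumes the standard published item format of the PHQ-9 (each item rated 0 to 3), which is not restated in this report; the function and constant names are hypothetical and are not part of the study's procedures.

```python
from typing import Dict, Sequence, Tuple

PHQ9_CUTOFF = 10  # score of 10 or higher is a positive PHQ-9 screen (from the text)
PHQ2_CUTOFF = 3   # score of 3 or greater is a positive PHQ-2 screen (from the text)


def phq_scores(items: Sequence[int]) -> Tuple[int, int]:
    """Return (PHQ-9 total, PHQ-2 total) from nine item responses.

    Assumption: each item is rated 0-3, as in the published instrument.
    The PHQ-2 is the sum of the first two PHQ-9 items.
    """
    if len(items) != 9:
        raise ValueError("the PHQ-9 has exactly nine items")
    if any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("each item must be rated 0, 1, 2, or 3")
    return sum(items), sum(items[:2])


def phq_flags(items: Sequence[int]) -> Dict[str, bool]:
    """Apply the two clinical cut-offs used in this study."""
    phq9, phq2 = phq_scores(items)
    return {
        "phq9_positive": phq9 >= PHQ9_CUTOFF,
        "phq2_positive": phq2 >= PHQ2_CUTOFF,
    }
```

A response pattern can be positive on the PHQ-2 but not on the PHQ-9 (or the reverse), which is one reason the two screens are compared separately against the ORS.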
The ORS (Miller and Duncan, 2000) assesses four dimensions: (1) individual – personal or symptomatic distress or well-being, (2) interpersonal – relational or family distress, (3) social – the patient’s social role functioning, that is, work/school and non-familial relationships, and (4) overall – a big picture perspective or general sense of well-being (see Figure 1). These four dimensions are translated into a visual analogue format of four 10-cm lines where patients place a mark on each line, with low scores to the left and high to the right. The score is the sum of the marks made by the patient, measured to the nearest millimeter on each of the four lines with a centimeter ruler, template, or web system, giving a total between 0 and 40. On the basis of over 400 000 administrations of the ORS and confirming earlier calculations (Miller et al., 2003), Duncan (2014) reported the clinical cut-off for adults as a total score of 25. Adults scoring under 25 are reporting distress typical of individuals receiving psychotherapy, psychotropic medication, or both, and those scoring above 25 are reporting levels typical of persons who are not receiving treatment. Rated at a Flesch–Kincaid Grade Level 5 and translated into 24 languages, the ORS is easily understood by patients from a variety of different cultures and has immediate connectivity to a patient’s life functioning (Duncan, 2012).
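As an illustration of the scoring rule just described, the sketch below sums four marks, each read in centimeters to the nearest millimeter, and applies the adult cut-off of 25. The function names are hypothetical, and treating a total of exactly 25 as outside the clinical range is an assumption based on the "total score <25" rule used in this study's procedure.

```python
from typing import Sequence

ORS_ADULT_CUTOFF = 25.0  # adult clinical cut-off reported in the text


def ors_total(marks_cm: Sequence[float]) -> float:
    """Sum four visual-analogue marks, each measured in cm on a 10-cm line."""
    if len(marks_cm) != 4:
        raise ValueError("the ORS has exactly four items")
    if any(not 0.0 <= m <= 10.0 for m in marks_cm):
        raise ValueError("each mark must lie on the 10-cm line")
    # Nearest millimeter corresponds to one decimal place in centimeters.
    return round(sum(round(m, 1) for m in marks_cm), 1)


def ors_in_clinical_range(total: float) -> bool:
    """Totals under the cut-off are treated as a positive screen."""
    return total < ORS_ADULT_CUTOFF
```

For example, marks of 6.2, 5.1, 7.0, and 6.0 cm give a total of 24.3, which falls just inside the clinical range.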
Multiple validation studies of the ORS (Miller et al., 2003; Bringhurst et al., 2006; Campbell and Hemsley, 2009; Reese et al., 2012) as well as efficacy studies have found that the ORS generates reliable scores. Coefficient αs have ranged from 0.87 to 0.91 in validation studies and from 0.82 (individual therapy) (Reese et al., 2009) to 0.92 (group therapy) (Slone et al., 2015) in clinical studies. Concurrent validity studies of the ORS have found moderately strong correlations with other validated measures (Miller et al., 2003; Bringhurst et al., 2006; Campbell and Hemsley, 2009; Gillaspy and Murphy, 2011).
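For readers less familiar with coefficient α, the following minimal sketch shows the standard Cronbach's α computation from item-level responses. It is illustrative only: the function name is hypothetical, sample variances are assumed, and no study data are reproduced here.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")
    columns = list(zip(*responses))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

When all items rise and fall together across respondents, the total-score variance greatly exceeds the sum of the item variances and α approaches 1, which is why very high values can signal item redundancy.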
Procedure
The PHQ-9 and the ORS were completed by adult patients on a double-sided sheet when they presented for their primary care health appointment. The measures were introduced, administered, and scored by either medical assistants upon rooming the patient for their medical vitals or front desk staff in the waiting rooms. The measures were only given when an integrated BHP was in clinic so that such screening would provide appropriate BHP back-up and not interfere with workflow demands. Positive screens on the PHQ-9 (total score of 10 or greater) and ORS (total score <25) were reported by the medical assistants to the medical provider, who then had the option of consulting the BHP.
Results
Mean scores for all patients on the ORS (M=26.79, SD=10.02), PHQ-9 (M=6.66, SD=6.19), and PHQ-2 (M=1.43, SD=1.69) fell in the non-clinical range of each measure (above the ORS cut-off of 25 and below the PHQ-9 and PHQ-2 cut-offs of 10 and 3, respectively). Coefficient αs for scores on the ORS, PHQ-9, and PHQ-2 were 0.92, 0.89, and 0.81, respectively. Bivariate correlations between the ORS and the PHQ-9 and PHQ-2 were 0.72 and 0.70, respectively. Both coefficients offer evidence of concurrent validity for the ORS. We evaluated the number of patients who were classified in the clinical range on each of the instruments. There was moderate agreement between the ORS and PHQ-9, κ=0.56 (P<0.001), 95% confidence interval (CI) (0.48, 0.64); the percentage of agreement was 78.64%. There was also moderate agreement between the ORS and PHQ-2, κ=0.48 (P<0.001), 95% CI (0.40, 0.56); the percentage of agreement was 77%. Because the data were paired and nominal, we also conducted McNemar tests to determine whether the proportion of patients who scored in the clinical range differed between measures. The ORS categorized significantly more patients in the clinical range than either the PHQ-9, χ²(df=1, n=426)=19.78, P<0.001, or the PHQ-2, χ²(df=1, n=426)=47.18, P<0.001 (see Table 1).
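The agreement and paired-proportion statistics reported above can both be computed from a 2×2 cross-classification of the two screens. The sketch below is illustrative only: the function names are hypothetical, the counts in the usage note are invented rather than taken from this study, and the McNemar statistic is shown without a continuity correction, which may or may not match the analysis actually run.

```python
import math
from typing import Tuple


def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 2x2 table.

    a = positive on both screens, b = positive on the first only,
    c = positive on the second only, d = negative on both.
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from the marginal proportions of each screen.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (p_observed - p_expected) / (1 - p_expected)


def mcnemar_chi2(b: int, c: int) -> Tuple[float, float]:
    """McNemar chi-square (df=1, no continuity correction) and its P value.

    Only the discordant cells b and c enter the statistic.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    statistic = (b - c) ** 2 / (b + c)
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value
```

With hypothetical counts a=40, b=20, c=10, d=30, `cohen_kappa(40, 20, 10, 30)` gives κ=0.40 and `mcnemar_chi2(20, 10)` gives χ²≈3.33. A significant McNemar result alongside moderate κ means the two screens largely agree but one flags systematically more patients than the other.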
Discussion
This preliminary study compared well-validated primary care depression screens (PHQ-9; PHQ-2) with an ultra-brief, four-item global measure of distress across major life domains (ORS) within three family practice clinics of an FQHC. The ORS had never before been investigated in primary care as a universal screener, and this investigation explored its capacity to serve as one in comparison with the PHQ-9. The ORS had robust correlations with the PHQ-9 and PHQ-2, comparable internal consistency, and categorized patients similarly overall. In addition, the ORS classified significantly more patients in the clinical range for potential BH consultation. Although preliminary, these results suggest that an ultra-brief measure of distress across life functioning that also covers the whole developmental age spectrum (Duncan et al., 2006) may cast a wider net and offer a viable alternative to the limitations of traditional symptom-based and diagnostic-specific primary care BH screeners. While we believe engaging more patients in BH intervention to be a positive step to improve patient outcomes, there may be drawbacks, including more demand for BHPs and additional workflow concerns.
A possible concern is the internal consistency estimate of 0.92 for the ORS, indicating potential redundancy (Steiner, 2003). This is likely, in part, due to the high correlation between the last item ‘overall’ and the first item ‘individually’ (Campbell and Hemsley, 2009). Although this indicates psychometric redundancy, we believe this concern is mitigated given that the last item was included for clinical purposes (Duncan, 2012) and reflects a balance between being psychometrically sound and clinically useful.
There are several limitations to this exploratory investigation. Although the ORS demonstrated initial evidence of concurrent validity as indicated by the strong correlation coefficients, other aspects of validity compared with the PHQ-9 were not measured, nor were the ORS’s sensitivity and specificity tested. This will be addressed in a follow-up study. Another weakness of this study was that it did not systematically address feasibility (ie, number of patients not screened, number refused, reasons for refusal, impact on clinical schedule or staff workload), nor did we collect data on the number of BH consults triggered by a positive ORS or PHQ-9 and their follow-up outcomes. This too will be addressed in a follow-up study. We believe, however, that the four-item ORS strikes a feasibility balance between the PHQ-2 alone and either a stepped assessment or the PHQ-9 alone. A third weakness was that the patient sample was composed primarily of rural, white, female, low-income adults, and our findings may not generalize to other populations. Lastly, the screening measures were administered neither universally nor randomly, possibly affecting the study’s results.
The ORS, part of the SAMHSA designated evidence-based practice for psychotherapy, PCOMS, may also offer integrated BH care a feasible outcome measure for short-term BH treatment. While identifying patients with psychosocial distress impacting their health and well-being is an important function of primary care screening tools, their use as a quality improvement intervention has not been demonstrated. The PHQ-9, for example, has been shown to be a valid tool for monitoring clinical change over time (Löwe et al., 2004), but has not been empirically demonstrated to improve patient outcomes (Gilbody et al., 2008; Fuchs et al., 2015). The PCOMS feedback intervention has been demonstrated to improve patient outcomes in five randomized clinical trials (Anker et al., 2009; Reese et al., 2009; 2010; Shuman et al., 2015; Slone et al., 2015). Only future research can determine whether these benefits extend to primary care BH intervention.
Acknowledgements
The authors would like to express their appreciation to Peak Vista Community Health Centers, its medical leadership, and the clinic support team of the Health Centers involved in this pilot study.