Introduction
In 2020, approximately 50 million people worldwide were living with dementia (Livingston et al., 2020). This number is expected to increase to 82 million in 2030 and to 152 million in 2050 (Patterson, 2018). Up to 36% of people living with dementia suffer from sleep disturbances (Garcia-Alberca et al., 2013; Webster et al., 2020a; Wilfling et al., 2019). Many health conditions are associated with sleep disturbances (Fung et al., 2016), such as depression, disinhibition and aberrant motor behavior (Garcia-Alberca et al., 2013). Wakefulness at night and longer rapid-eye-movement sleep latencies are associated with poorer cognitive performance (Moe et al., 1995), physical complaints, respiratory disabilities, poor self-reported health (Foley et al., 1995) and mortality (Gehrman et al., 2004). People living with dementia in nursing homes have reported that disturbed sleep is often associated with restlessness and pondering (Dörner et al., 2023).
This finding is consistent with the experience of nurses, who characterize disturbed sleep of people living with dementia mainly by behavioral and psychological symptoms (Dörner et al., 2023; Webster et al., 2020b). The day after disturbed sleep, people living with dementia experience poor wellbeing, confusion, reduced cognitive and physical performance and exhaustion. In contrast, the day after good sleep is characterized by feeling well, improved cognition, full physical ability, increased interaction and being in balance (Dörner et al., 2023). Approximately four out of five nurses have reported regularly observing sleep disturbances among people living with dementia in nursing homes (Wilfling et al., 2020a).
Previous systematic reviews examined pharmacological interventions against sleep disturbances in people with Alzheimer’s disease across all settings (McCleery et al., 2014) and nonpharmacological interventions in nursing home residents (Wilfling et al., 2020b). The primary studies included in those reviews used a variety of sleep-related outcome measurements to assess sleep variables, such as length of sleep, or to detect sleep disturbances. These measures included self-reported measures (n = 4) (Gattinger et al., 2017; Kuck et al., 2014; Serfaty et al., 2002), proxy-reported measures (n = 5) (Alessi et al., 2005; Alessi et al., 1999; Gattinger et al., 2017; Kuck et al., 2014; Schnelle et al., 1999; Serfaty et al., 2002; Singer et al., 2003) and technological devices (n = 3) (Alessi et al., 2005; Alessi et al., 1999; Camargos et al., 2014; Dowling et al., 2008; Gattinger et al., 2017; Kuck et al., 2014; Li et al., 2017; NCT00325728, 2008; Richards et al., 2011; Schnelle et al., 1999; Serfaty et al., 2002; Singer et al., 2003). None of the measures used in those primary studies were developed specifically for people living with dementia or for application in the nursing home setting. Therefore, it is unclear how appropriate these sleep-related measurements are for measuring sleep disturbances among people with dementia in the nursing home setting. To date, no systematic review has used established consensus-based guidelines to examine the psychometric properties of sleep-related measurements for assessing sleep disturbances in people living with dementia in nursing homes.
Therefore, the aims of this systematic review were as follows:
1. to identify sleep-related measurements to assess sleep disturbances that were developed for people living with dementia or that have been applied in this population;
2. to describe the theoretical basis, scope, domains, and extent of user involvement during the development process of sleep-related measurements to assess sleep disturbances;
3. to evaluate the reliability, validity and feasibility of the identified sleep-related measurements to assess sleep disturbances; and
4. to recommend sleep-related measurements to assess sleep disturbances among people living with dementia in nursing homes.
Methods
Design
This review is based on the methodology of the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) initiative for systematic reviews of patient-reported outcome measures (Mokkink et al., 2018; Prinsen et al., 2018). It is reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Page et al., 2021).
Literature search
A systematic search was performed in September 2019 and updated in March 2024 without any restrictions regarding publication date. The search strategy (Appendix A1) was developed iteratively based on the Population, Intervention, Comparison, Outcome (PICO) framework (Straus et al., 2018). We initially performed an open search in Google Scholar, which helped us to develop the search syntax. Then, we systematically searched the PubMed, CINAHL and PsycINFO databases. During the analysis, the first author conducted backward citation tracking of the included studies to obtain additional eligible studies. If a reference was not available, the authors and journals were contacted to ask for access. If sleep-related measurements were not sufficiently described in the identified studies, the authors were contacted for further information.
Study selection
Included studies had to focus primarily on the development or psychometric evaluation of sleep-related measurements. The target group in our review was people diagnosed with dementia or possible dementia. Studies were also included if only part of the study population had possible dementia or had been diagnosed with dementia. Studies without a dementia population were also included via backward citation tracking if they described the theoretical basis and the development of a sleep-related measurement. There were no restrictions regarding the care setting, which enabled us to examine a wide range of sleep-related measurements among people living with dementia. Only studies published in English or German were included. We excluded studies that examined sleep-related measurements that cannot be applied in nursing home care because they impose extraordinarily high requirements, for instance in terms of space, personnel or specific technical equipment not usually available (e.g., polysomnography). Appendix A2 provides an overview of the eligibility criteria. Two reviewers (JD, MND, KW) independently performed the study selection.
Data extraction
The data extraction was performed in two steps: (1) full-text analyses and data extraction were performed by one reviewer (JD), and (2) an independent cross-check of all extracted data and their accuracy was performed by a second reviewer (KW, MND). Any discrepancies were resolved by discussion between the reviewers or by consulting a third reviewer.
Synthesis and methodological quality of the extracted data
All data regarding the theoretical background and development of measurements, the characteristics of the measurements, the application of technological devices and the psychometric properties of the measurements were entered into standardized tables. Feasibility was analyzed based on recommendations from the literature in the following eight domains: acceptability, demand, implementation, practicality, adaptation, integration, expansion and limited-efficacy testing (Bowen et al., 2009). Guidelines from the COSMIN initiative (Prinsen et al., 2018) were used to assess internal consistency, test-retest reliability, construct validity and criterion validity. Additionally, the interrater reliability was assessed with the QAREL tool (Lucas et al., 2010). One reviewer (JD) rated the quality of the studies (see Table 1), and the results were then cross-checked and discussed with a second reviewer (MND). Actigraphy results were descriptively analyzed and later discussed based on recommendations from the literature (Camargos et al., 2013).
Results
Description of included studies
The systematic search conducted in 2019 identified n = 3552 records from the three databases. After removing duplicates, n = 2642 studies remained. Our updated search was performed at the end of March 2024 and yielded n = 1617 additional studies (of which n = 370 were duplicates). In total, n = 3889 studies were subjected to title and abstract screening. In the second step, n = 53 full-text articles were checked for eligibility. No measurement was excluded for imposing extraordinarily high requirements that are not usually available in nursing homes. Ultimately, n = 11 records were included for data extraction. Three additional records were retrieved through backward citation tracking, and n = 1 paper was retrieved by contacting the author of an included paper, who subsequently coauthored an additional psychometric manuscript that has since been published. Finally, n = 15 studies were identified that investigated n = 8 different measurements (Figure 1).
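The record flow reported above can be checked with simple arithmetic. The following is a minimal sketch that only uses the counts stated in this paragraph; the variable names are illustrative and not part of the review.

```python
# Record counts as reported in the text (variable names are illustrative).
initial_records = 3552        # records identified in the 2019 search
initial_after_dedup = 2642    # remaining after duplicate removal
update_records = 1617         # additional records from the March 2024 update
update_duplicates = 370       # duplicates among the update records

# Records from the update that entered screening.
update_after_dedup = update_records - update_duplicates

# Total number of records subjected to title and abstract screening.
screened = initial_after_dedup + update_after_dedup

# Studies included via the three routes described in the text.
included_from_search = 11
from_citation_tracking = 3
from_author_contact = 1
total_included = included_from_search + from_citation_tracking + from_author_contact

print(screened)        # 3889
print(total_included)  # 15
```

Both totals agree with the reported n = 3889 screened records and n = 15 included studies.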
Sleep-related measurements assessing sleep disturbances
The n = 15 studies assessed three self-administered measurements: the Epworth Sleepiness Scale (ESS) (Johns, 1991), the Alternative Version of the Epworth Sleepiness Scale (ESS-ALT) (Gronewold et al., 2021) and the Pittsburgh Sleep Quality Index (PSQI) (Buysse et al., 1989). Moreover, three proxy-administered measurements were investigated in the included studies: the Observational Sleep Assessment Instrument (OSAI) (Cohen-Mansfield et al., 1989), the Sleep Continuity Scale in Alzheimer’s Disease (SCADS) (Manni et al., 2013) and the Sleep Disorders Inventory (SDI) (Tractenberg et al., 2003). In addition, three studies investigated actigraphy (Ancoli-Israel et al., 1997; Gibson and Gander, 2019; Van Someren, 2007), and one study investigated a wrist monitoring system (Nijhof et al., 2012). The theoretical basis and characteristics of each measurement are presented in Tables 1 and 2. A description of the actigraphy characteristics is presented in Table 3. Detailed results of the psychometric characteristics and the reasons for the evaluation of methodological quality are shown in Table 4.
*For the detailed items of each measurement, see Appendix A4.
§The summed product score was not calculated within the measurement development paper but in several later publications, e.g.: Tewary, S., Cook, N., Pandya, N., & McCurry, S. M. (2018). Pilot test of a six-week group delivery caregiver training program to reduce sleep disturbances among older adults with dementia (Innovative practice). Dementia (London), 17(2), 234-243. doi:10.1177/1471301216643191; Wilfling, D., Dichter, M. N., Trutschel, D., & Köpke, S. (2019). Prevalence of Sleep Disturbances in German Nursing Home Residents with Dementia: A Multicenter Cross-Sectional Study. J Alzheimers Dis, 69(1), 227-236. doi:10.3233/jad-180784.
*Cole RJ, Kripke DF, Gruen W, Mullaney DJ, Gillin JC. Automatic sleep/wake identification from wrist activity. Sleep. 1992 Oct;15(5):461-9. doi: 10.1093/sleep/15.5.461. PMID: 1455130.
*PLWD = people living with dementia.
** = Not applicable.
*** = Area under curve.
Theoretical basis of sleep-related measurements
Three out of six studies that examined the development of measurements used clinical experts as a source during the development process (Buysse et al., 1989; Cohen-Mansfield et al., 1989; Gronewold et al., 2021). Other studies used previous literature as a source (Buysse et al., 1989; Gronewold et al., 2021; Johns, 1991; Tractenberg et al., 2003). Moreover, one study included nurses, patients and relatives in the development process (Gronewold et al., 2021). Two studies did not report the sources they used during the measurement development process (Cohen-Mansfield et al., 1989; Sinforiani et al., 2007). Two measurements were developed to assess symptoms of sleep disturbances during both nighttime and daytime (Buysse et al., 1989; Tractenberg et al., 2003); one measurement solely assessed sleep disturbances at night (Cohen-Mansfield et al., 1989); two measurements solely assessed daytime sleepiness (Gronewold et al., 2021; Johns, 1991); and for one measurement, the assessed time of day remains unclear (Sinforiani et al., 2007).
Risk of bias
Among the studies that included people living with dementia, the risk of bias was assessed for each self- and proxy-administered measurement by investigating the validity or reliability (Appendix A3).
The SCADS (Manni et al., 2013) was rated as adequate in the risk of bias assessment for structural validity because only an exploratory factor analysis was performed.
The ESS-ALT (Gronewold et al., 2021), SCADS (Manni et al., 2013) and SDI (Hjetland et al., 2020) were rated as very good in the risk of bias assessment for internal consistency, as Cronbach’s alpha coefficients were calculated for each unidimensional scale. The PSQI (Curcio et al., 2013) was rated as inadequate for internal consistency, as Cronbach’s alpha coefficient was not calculated for each subdomain.
The PSQI (Curcio et al., 2013) was rated as inadequate for cross-cultural validity because the samples were not similar for relevant characteristics and because the subsamples were small.
The PSQI (Curcio et al., 2013) was rated as very good in the risk of bias assessment for criterion validity because the area under the curve, sensitivity and specificity were calculated. However, the SDI (Hjetland et al., 2020; Tractenberg et al., 2003) and OSAI (Cohen-Mansfield et al., 1989) were rated as inadequate for criterion validity because their reference standards were not consistent with international diagnostic criteria. Moreover, one study that assessed the criterion validity of the SDI (Tractenberg et al., 2003) did not report the area under the curve, sensitivity or specificity, which led to the inadequate rating. The degree of hypothesis testing was rated as very good for the PSQI because the construct described by the comparator measurement is clear. The OSAI (Cohen-Mansfield et al., 1989) and SDI (Hjetland et al., 2020; Tractenberg et al., 2003) received inadequate ratings because the construct of the comparator measurement was not clear; furthermore, unsatisfactory information about the measurement properties of the comparator measurement was presented in each study.
The SDI (Hjetland et al., 2020; Tractenberg et al., 2003) was rated as inadequate for responsiveness in two studies because the quality of the comparator measurement was insufficient.
Description of measurements
Epworth Sleepiness Scale (ESS). The ESS is a self-administered measurement of daytime sleepiness; in particular, it assesses the nature and occurrence of daytime sleep. It was developed for adult patients in the hospital setting and was constructed based on the literature and clinical expertise. The scale includes eight items with no subdomains (Johns, 1991). Its feasibility was tested in n = 433 geriatric patients, of whom n = 192 were people living with dementia. In total, only 36% of the patients were able to complete the measurement (Frohnhofen et al., 2009). The internal consistency of the ESS was found to be insufficient in a study of geriatric patients (n = 52, including n = 14 people living with dementia) in a hospital setting (Gronewold et al., 2021).
Epworth Sleepiness Scale – Alternative Version (ESS-ALT). The ESS-ALT is a self-administered modified version of the ESS for people with physical or mental disorders. This measurement can also be completed by relatives and was developed in a hospital with geriatric patients. Clinical sleep experts, nurses, researchers and patients were included in the development process, and the literature was used as well. In total, the scale has eight items with no subdomains, and five of the eight items were adapted from the ESS (Appendix A4). The feasibility, reliability and validity of this measurement were tested in n = 52 participants (including n = 14 people living with dementia) in the geriatric department of a hospital (Gronewold et al., 2021). Patients or relatives needed no support to answer the items. The internal consistency of the ESS-ALT was judged as insufficient.
Pittsburgh Sleep Quality Index (PSQI). The PSQI is a self-administered measurement that assesses a wide range of sleep disturbances and was developed for inpatients and outpatients of a psychiatric clinic. It was constructed based on clinical expertise, a literature review and 18 months of field testing. The questionnaire contains 19 items subdivided into seven subdomains (Buysse et al., 1989). One study (Curcio et al., 2013) was conducted to assess the criterion validity of the PSQI, which was found to be indeterminate based on the criteria for good measurement properties. The reliability was also rated as indeterminate.
Observational Sleep Assessment Instrument (OSAI). The OSAI is a proxy-administered measurement for sleep disturbances (Cohen-Mansfield et al., 1989). It was developed specifically for residents in the nursing home setting. Sources for the development process were not reported. The exact proportion of people living with dementia in the sample was not reported. The OSAI includes 17 items across two subdomains: sleep and sleep patterns. The interrater reliability of the OSAI was unclear based on the QAREL. The criterion validity of this measurement was rated as sufficient.
Sleep Continuity Scale in Alzheimer’s Disease (SCADS). The SCADS (Manni et al., 2013) is a proxy-administered measurement that is completed by relatives or caregivers who usually share the bedroom with the person whose sleep is to be assessed. It consists of selected items from the “Questionnaire for Hallucinations and Sleep-Wake Cycle in Alzheimer’s Disease” (Sinforiani et al., 2007), which was developed for outpatients living with dementia in a hospital. This original measurement consists of 46 items and aims to investigate the relationship between hallucinations and the sleep-wake cycle of people living with Alzheimer’s disease. The theoretical assumption is that a relation exists between the physiopathogenesis and sleep disturbances when visual hallucinations are present in neurodegenerative diseases. Descriptions of the involved professionals, patients or other sources were not reported (Sinforiani et al., 2007). The exact proportion of people living with dementia in the sample was also not reported. The SCADS (Manni et al., 2013) includes nine items. The validity was rated as insufficient and the reliability as indeterminate. In another investigation, the authors reported that the SCADS is a feasible measurement that is rapid, easy to complete and suitable for people living with dementia (Manni et al., 2015). However, this publication noted as a limitation that the SCADS cannot be administered to people living with severe dementia or to people living with dementia without any relative or caregiver.
Sleep Disorders Inventory (SDI). The SDI is a proxy-rated measurement (Tractenberg et al., 2003). It was developed based on the Neuropsychiatric Inventory (NPI) (Cummings et al., 1994) in research centers for Alzheimer’s disease with people living with Alzheimer’s dementia. The SDI aims to assess and quantify sleep disturbances and sleep disorders in people living with Alzheimer’s disease. Sleep disturbances were defined as less than 6 hours of total sleep time at night. The NPI (Cummings et al., 1994) was used as a source to develop the SDI, but no further information was reported. The measurement consists of eight items that are not divided into subdomains (Tractenberg et al., 2003). The psychometric properties of the SDI were judged as insufficient. In another study (Hjetland et al., 2020), the validity of the SDI was rated as sufficient, and its reliability was rated as indeterminate.
Technological devices. In total, four studies (Ancoli-Israel et al., 1997; Gibson and Gander, 2019; Nijhof et al., 2012; Van Someren, 2007) investigated the use of technological devices for the measurement of sleep disturbances. Actigraphs were all worn on the wrist. Ancoli-Israel and colleagues (1997) compared actigraphy and polysomnography, and the interrater reliability was 0.87. The criterion validity parameters with observations as a comparison measure were 87% (sensitivity) and 90% (specificity). Gibson and Gander (2019) examined the agreement (interrater reliability) between actigraphy and diaries in detecting sleep epochs and found an overall agreement of 82% for sleep epochs and an overall agreement of 67% for wake epochs. In the same study, a sensitivity of 87% for sleep and 77% for wake and a specificity of 80% for sleep and 61% for wake were obtained. Another study examined the feasibility of wrist-worn measurements (Nijhof et al., 2012). The main advantages were that nurses in a nursing home were able to see the data in the monitoring system to determine whether the sleep behavior of participants was good or not; furthermore, the computer system that generated the data was easy to understand, and the data were easy to interpret. A high amount of time to implement the system and user unfriendliness were reported as disadvantages (Nijhof et al., 2012). Van Someren (2007) investigated the number of nights needed to obtain a reliable measure of sleep disturbances with actigraphy. The author recommended more than 7 days of recording for an acceptable reliability of interdaily stability.
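To illustrate how epoch-based agreement figures of this kind are typically derived, the following is a minimal sketch that compares device-scored epochs with a reference scoring. It assumes that "sleep" is treated as the positive class; the function name, labels and example data are hypothetical and are not taken from the included studies, which may have used different definitions.

```python
def epoch_agreement(device, reference):
    """Compare two epoch-by-epoch scorings ('sleep' or 'wake').

    'sleep' is treated as the positive class (an assumption of this sketch).
    Returns sensitivity, specificity and overall agreement as fractions.
    """
    if len(device) != len(reference) or not device:
        raise ValueError("need two non-empty sequences of equal length")
    pairs = list(zip(device, reference))
    tp = sum(d == "sleep" and r == "sleep" for d, r in pairs)  # both sleep
    tn = sum(d == "wake" and r == "wake" for d, r in pairs)    # both wake
    fp = sum(d == "sleep" and r == "wake" for d, r in pairs)   # device-only sleep
    fn = sum(d == "wake" and r == "sleep" for d, r in pairs)   # device missed sleep
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "agreement": (tp + tn) / len(pairs),
    }


# Hypothetical six-epoch example, for illustration only.
device_epochs = ["sleep", "sleep", "sleep", "wake", "wake", "sleep"]
reference_epochs = ["sleep", "sleep", "wake", "wake", "wake", "sleep"]
result = epoch_agreement(device_epochs, reference_epochs)
print(result)
```

In this toy example the device detects all three reference sleep epochs (sensitivity 1.0) but scores one of three wake epochs as sleep (specificity of about 0.67), which mirrors the general pattern of lower wake detection reported above.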
Synthesis of results of sleep-related measurements
Feasibility was assessed for the ESS (Frohnhofen et al., 2009; Gronewold et al., 2021), ESS-ALT (Gronewold et al., 2021), SCADS (Manni et al., 2015) and wrist monitoring (Nijhof et al., 2012). The completion rate was the most commonly reported criterion for feasibility among self- and proxy-administered measurements (Frohnhofen et al., 2009; Gronewold et al., 2021; Manni et al., 2015). Reasons for unsuccessful ratings or missing values included items that were inadequate for the participants’ life circumstances (Gronewold et al., 2021), health status and age (Frohnhofen et al., 2009) and cognitive status or missing relatives for ratings (Manni et al., 2015). Of the eight feasibility domains described by Bowen et al. (2009), only three were evaluated across all included feasibility studies.
Three aspects of reliability were assessed: internal consistency, interrater reliability and test-retest reliability. In total, four studies (Curcio et al., 2013; Gronewold et al., 2021; Hjetland et al., 2020; Manni et al., 2013) investigated internal consistency. The Cronbach’s alpha coefficients for the ESS and ESS-ALT were insufficient (Gronewold et al., 2021). In other studies that investigated the internal consistency of the PSQI, SDI and SCADS, the results were found to be indeterminate. In these studies, small sample sizes (Curcio et al., 2013; Hjetland et al., 2020), the lack of sample size calculations (Curcio et al., 2013; Hjetland et al., 2020; Manni et al., 2013) and the lack of information about confidence intervals (Curcio et al., 2013; Manni et al., 2013) were reasons for concern.
Interrater reliability and test-retest reliability were only investigated for the OSAI (Cohen-Mansfield et al., 1989). The interrater reliability was found to be indeterminate due to a small sample size and the lack of information about confidence intervals.
Two criteria of validity were assessed. Structural validity was only assessed for the SCADS (Manni et al., 2013). The risk of bias was adequate, and measurement properties were considered insufficient. Criterion validity was assessed for the PSQI, OSAI and SDI (Cohen-Mansfield et al., 1989; Curcio et al., 2013; Hjetland et al., 2020; Tractenberg et al., 2003). The criterion validity was found to be indeterminate for the PSQI because the sample included for the analysis of criterion validity was unclear (Curcio et al., 2013). The criterion validity was sufficient for the OSAI (Cohen-Mansfield et al., 1989). In one study, the criterion validity of the SDI was insufficient (Tractenberg et al., 2003) due to missing sensitivity, specificity and AUC values. In another study, the criterion validity of the SDI was found to be sufficient (Hjetland et al., 2020).
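The sensitivity, specificity and AUC values referred to above can be derived from questionnaire scores and a binary reference standard. The following Python sketch shows one way to compute them. The function names, the cut-off and the example data are hypothetical, and the rule that a score at or above the cut-off counts as a positive screening result is an assumption made for illustration only.

```python
def sensitivity_specificity(scores, reference, cutoff):
    """Return (sensitivity, specificity) for a given cut-off.

    scores: questionnaire scores; reference: True if the reference
    standard classifies the person as having a sleep disturbance.
    Assumption: a score >= cutoff counts as a positive result.
    """
    tp = fn = tn = fp = 0
    for score, has_condition in zip(scores, reference):
        positive = score >= cutoff
        if has_condition and positive:
            tp += 1
        elif has_condition:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference standard must contain both classes")
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, reference):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen person with the
    condition scores higher than one without it (ties count as 0.5).
    """
    with_condition = [s for s, r in zip(scores, reference) if r]
    without_condition = [s for s, r in zip(scores, reference) if not r]
    if not with_condition or not without_condition:
        raise ValueError("reference standard must contain both classes")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in with_condition
        for n in without_condition
    )
    return wins / (len(with_condition) * len(without_condition))


# Hypothetical data: four residents, reference standard in the second list.
example_scores = [1, 3, 2, 4]
example_reference = [False, False, True, True]
sens, spec = sensitivity_specificity(example_scores, example_reference, cutoff=3)
print(sens, spec, auc(example_scores, example_reference))
```

The pairwise formulation of the AUC is chosen here because it needs no plotting or external libraries; for larger samples a rank-based implementation would be more efficient.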
Discussion
This systematic review included 15 studies. The measurements are heterogeneous in terms of operationalization (items, subscales, scoring), rater perspective and psychometric properties. The theoretical frameworks or definitions of sleep disturbances in the measurements are also heterogeneous.
Identified sleep-related measurements and theoretical backgrounds
Three self-administered measurements, three proxy-administered measurements and two technological devices were identified. Among the papers that described measurements specifically designed for people living with dementia (Sinforiani et al., 2007; Tractenberg et al., 2003), none provided definitions or theoretical backgrounds for sleep disturbances, which could be derived from specific national guidelines such as the NICE guideline for people living with dementia (National Institute for Health and Care Excellence, 2018), the AWMF guideline for dementia (Deuschl and Maier, 2016) or other similar guidelines that existed when the measurements were developed. Additionally, some of the studies specifically examined nursing home residents (Cohen-Mansfield et al., 1989), adults (Johns, 1991), geriatric patients (Gronewold et al., 2021) and patients with mental disorders (Buysse et al., 1989); these studies did not use any important diagnostic criteria for sleep disturbances, such as the ICSD-3-TR (American Academy of Sleep Medicine, 2023), DSM-V-TR (American Psychiatric Association, 2022) or ICD-11 (World Health Organization, 2022) (or earlier versions), as sources when developing measurements. Using literature as a source during measurement construction was mentioned in three out of six development studies (Buysse et al., 1989; Gronewold et al., 2021; Johns, 1991).
In two cases, it remains unclear which type of literature was used (Buysse et al., 1989; Gronewold et al., 2021). In particular, the specific literature on perspectives on sleep among people living with dementia in nursing homes, which presents psychosocial factors and individual aspects in detail, is important to consider in future measurement development or adaptation studies (Dörner et al., 2023). Moreover, the knowledge of healthcare professionals (Dörner et al., 2023; Nunez et al., 2018; Webster et al., 2022; Webster et al., 2020b) and family carers (Nunez et al., 2018) could be an important source. At the same time, only one development study (Gronewold et al., 2021) involved people living with dementia as stakeholders in the process. For research on complex interventions, the UK Medical Research Council requires stakeholder engagement depending on the context and phase of research and underlines its crucial importance for the selection of outcome measurements or evidence of change (Skivington et al., 2021a, 2021b).
For actigraphy, as a technological device, it is recommended to always describe the algorithm employed, the output procedure and the scoring within the study (Camargos et al., 2013). None of the included studies reported all of these data. Only one study transparently reported the algorithm (Ancoli-Israel et al., 1997). Another study of wrist monitoring did not sufficiently report the algorithm that was developed (Nijhof et al., 2012), and two studies used manufacturer algorithms that were not described in detail (Gibson and Gander, 2019; Van Someren, 2007). Among the included studies, the algorithm for actigraphy described by Cole et al. (1992) can be recommended for future studies because, in contrast to the other identified algorithms, it is openly accessible and transparently reported. Furthermore, this algorithm has already been applied in several other actigraphy studies (Biegański et al., 2021; Gao et al., 2022; Hanowski et al., 2007; Kikuchi et al., 2011; Kim et al., 2013; Quante et al., 2018). The reported recording time within the included studies varied between one and 24 weeks. The recommended minimum recording time of one week (Camargos et al., 2013) was met in all studies that reported the recording time.
In relation to sleep parameters, the following variables should be recorded in actigraphy studies: night sleep time, number of nighttime awakenings, wake after sleep onset and sleep efficiency (Camargos et al., 2013). None of the included studies assessed all of these parameters. The total sleep time was most often reported (Ancoli-Israel et al., 1997; Nijhof et al., 2012; Van Someren, 2007). Wake after sleep onset was reported in two studies (Ancoli-Israel et al., 1997; Gibson and Gander, 2019), and sleep efficiency was reported in one study (Van Someren, 2007). The number of nighttime awakenings was not reported in any study. Furthermore, no study used validated questionnaires or scales to assess subjective sleep complaints, although this is recommended as a comparison measure when actigraphy is used to measure sleep (Camargos et al., 2013) and is even described in diagnostic criteria as the primary measure, in preference to technological devices (American Psychiatric Association, 2022).
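The four recommended parameters can all be derived from the same epoch-by-epoch sleep/wake classification, which is one reason why reporting only a subset is a missed opportunity. The following Python sketch illustrates this under stated assumptions: the function name `sleep_parameters`, the "S"/"W" coding and the example night are hypothetical, the input is assumed to cover the time in bed only, and wake after sleep onset is counted between the first and the last sleep epoch. Definitions differ between studies and devices, so this is not the scoring procedure of any included study.

```python
def sleep_parameters(epochs, epoch_minutes=1.0):
    """Summarise one night of epoch-wise sleep/wake scores.

    epochs: sequence of "S" (sleep) or "W" (wake), covering time in bed.
    epoch_minutes: length of one epoch in minutes (assumed constant).
    """
    epochs = list(epochs)
    if not epochs:
        raise ValueError("no epochs provided")

    time_in_bed = len(epochs) * epoch_minutes
    total_sleep_time = epochs.count("S") * epoch_minutes

    if "S" in epochs:
        onset = epochs.index("S")
        last_sleep = len(epochs) - 1 - epochs[::-1].index("S")
        # Window from sleep onset to the final sleep epoch.
        window = epochs[onset:last_sleep + 1]
        waso = window.count("W") * epoch_minutes
        # An awakening is a transition from sleep to wake in the window.
        awakenings = sum(
            1 for a, b in zip(window, window[1:]) if a == "S" and b == "W"
        )
    else:
        waso = 0.0
        awakenings = 0

    return {
        "time_in_bed_min": time_in_bed,
        "total_sleep_time_min": total_sleep_time,
        "wake_after_sleep_onset_min": waso,
        "awakenings": awakenings,
        "sleep_efficiency_pct": 100.0 * total_sleep_time / time_in_bed,
    }


# Hypothetical night with 1-minute epochs (shortened for readability).
print(sleep_parameters("WWSSSWWSSSWSSW"))
```

In this sketch, wake epochs after the final sleep epoch lower sleep efficiency but are not counted as wake after sleep onset; a study that defines the parameters differently would need to adjust the window accordingly and report that choice.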
In summary, the heterogeneous and mostly insufficiently reported theoretical basis indicates that concept clarification and strict reference to the evidence are vital for the further development of the measurements.
Psychometric properties of the included measurements
Regarding feasibility, the completion rate, which belongs to the domain of practicality according to Bowen et al. (2009), was most often reported within the included studies (Frohnhofen et al., 2009; Gronewold et al., 2021; Manni et al., 2015). Missing values, or at least a need for help with rating, are commonly reported in studies that measure health outcomes in people living with dementia (Jansen et al., 2008; Khobragade et al., 2022; Novella et al., 2001a; Novella et al., 2001b), and these phenomena should be considered when planning future studies with measurements. Measured against recommendations from the literature regarding feasibility studies (Bowen et al., 2009), the self-reported ESS-ALT (Gronewold et al., 2021) should be preferred because it has a better completion rate (domain of practicality) and seems to perform better in the domain of acceptability than the ESS (Frohnhofen et al., 2009; Gronewold et al., 2021).
Furthermore, after the adaptation process, the ESS-ALT (Gronewold et al., 2021) seemed to have a better fit for people living with dementia and people in the nursing home setting, but it still needs to be further adapted to the nursing home setting. If the ESS-ALT is used in the nursing home setting, an implication for subsequent feasibility studies would be the adaptation to the setting (domain of adaptation). This goes hand in hand with examining whether the content of the items is fully appropriate. For example, item 4 (“as a passenger in the car”) (Gronewold et al., 2021) may be inappropriate because this activity may no longer occur regularly for the majority of nursing home residents. Also important in this context are, among others, the user’s satisfaction with the measurement, the successful application (depending on the success of the adaptation) and the costs (Bowen et al., 2009).
The SCADS was the only proxy-administered measurement that was tested in terms of feasibility. The authors described two limitations in their study (Manni et al., 2015) that are important to consider when assessing feasibility (Bowen et al., 2009): the administration of this measurement is not possible among people with severe dementia, and bedpartners/caregivers were not always available for the rating. Therefore, this measurement should be adapted. In addition to the general adaptation to the setting (domain of adaptation) according to Bowen et al. (2009), one of the main implications is the adaptation to dementia severity. A key requirement regarding measurements is always to clarify for which concern and by whom they are used (Sheehan, 2012). This implies that a measurement that was developed or adapted for people living with dementia should cover all degrees of dementia severity or that it should be explicitly developed, for example, for the early stages of dementia. Moreover, the rating should not be determined only from the proxy-rating perspective. In general, it may be preferable to assess sleep disturbances by the person concerned. If this is not possible, e.g. due to dementia severity, a combined rating of the person concerned and relatives, caregivers or other healthcare providers could be a suitable solution. Therefore, an adaptation option would be to adjust the SCADS rating for self- and proxy ratings.
One study (Nijhof et al., 2012) assessed the feasibility of wrist monitoring. First, the costs of the measurement were reported for the technological installation and for each watch. Second, the administration seemed to be difficult because implementing the measurement for caregivers took more time than expected, and there was initial skepticism regarding whether actigraphy would help caregivers with their tasks. Third, regarding the use and usability of wrist monitors for daily monitoring, caregivers printed out nightly results and put information in the healthcare records, but this topic was not discussed in team meetings. Regarding usability, it was reported that the wrist monitor is not user-friendly and needs to be improved (e.g., it is too large for small arms, the hard strap irritates the skin, a normal clockface would be more familiar for elderly individuals). Caregivers were able to monitor data regarding a resident’s sleep quality, and the data were easy to interpret and understand (Nijhof et al., 2012). Successful monitoring of sleep data with sleep monitoring devices such as actigraphy was also mentioned in another study among elderly people (LeBlanc et al., 2022). However, it was also stated in the same publication that actigraphy has been used less over time. Trust regarding the performance of the wrist monitor was reported in this study as supportive, ambivalent or negative. With this in mind, the wrist monitoring system has several limitations and is currently not a feasible option to assess sleep disturbances.
In summary, only three of the eight domains according to Bowen et al. (2009) were reported within the included feasibility studies. For future feasibility studies, this implies that all feasibility criteria should be evaluated.
Reliability was assessed for internal consistency, interrater reliability and test-retest reliability. None of the included measurements that were tested for reliability (Cohen-Mansfield et al., 1989; Curcio et al., 2013; Gronewold et al., 2021; Hjetland et al., 2020; Manni et al., 2013) showed sufficient psychometric results for several domains of reliability.
A clear recommendation in terms of measurements with the best reliability for people living with dementia in the nursing home setting is therefore not possible for the proxy-administered measurements analyzed herein due to the limitations of each study. For self-administered measurements, the PSQI (Buysse et al., 1989) seems to be potentially eligible. It is the most frequently used self-measurement in different settings (Fabbri et al., 2021) but has various factor structures and needs to be adapted (Manzar et al., 2018). For the population of people with dementia in particular, it should first be adapted with regard to feasibility on the basis of different guidelines (Beaton et al., 2000; Bowen et al., 2009) and then psychometrically analyzed again.
Regarding validity, the structural validity for one measurement (Manni et al., 2013) and criterion validity for three measurements (Cohen-Mansfield et al., 1989; Curcio et al., 2013; Hjetland et al., 2020; Tractenberg et al., 2003) were assessed. Actigraphy was used as a reference standard for the SDI in two studies (Hjetland et al., 2020; Tractenberg et al., 2003) and for sleep variables compared to the OSAI (Cohen-Mansfield et al., 1989). The DSM-V-TR (American Psychiatric Association, 2022) states that for insomnia disorder diagnosis, the individual’s subjective perception or a caregiver report is needed. Additionally, symptoms can be quantified by sleep diaries or actigraphy (American Psychiatric Association, 2022). Therefore, further research should be conducted to determine whether actigraphy is the best measurement approach to use as a reference standard in the included studies. Moreover, quantitative criteria are often used in research designs but cannot reliably distinguish between individuals with insomnia and normal sleepers. Therefore, it is recommended that quantitative guidelines for measuring the frequency and duration of sleep should only be used for illustrative purposes (American Psychiatric Association, 2022).
In view of the insufficient results, the risk of bias analysis, the recommendations regarding actigraphy (Camargos et al., 2013) and the statements from the ICSD-3-TR and DSM-V-TR, the validity of the measurements examined herein is not sufficiently supported, and no recommendation can therefore be provided.
Recommendations for clinical practice
Sleep disturbances include disturbed sleep at night and daytime sleepiness (American Academy of Sleep Medicine, 2023). An important aspect when choosing a measurement for clinical practice is the consideration of both issues. With regard to combining different rating perspectives and covering both day and night, only the PSQI (Buysse et al., 1989) as a self-measurement and the SDI (Tractenberg et al., 2003) as a proxy-measurement are potentially eligible among the included measurements. Both measurements could be used in combination, with careful reflection on the results because of the lack of psychometric testing. The PSQI (Buysse et al., 1989) could be challenging for people living with dementia because of its length and complexity. A negative aspect of the SDI (Tractenberg et al., 2003) could be that only one item addresses daytime sleepiness.
Strengths and limitations
A strength of this publication is that this is the first review that investigated the psychometric properties of self- and proxy-administered sleep-related measurements to assess sleep disturbances among people living with dementia. Although the aim of this review was to identify and recommend measurements for the nursing home setting, measurements for all settings were included to detect a higher number of measurements that can be potentially adapted for people living with dementia in nursing homes. One limitation of this study is that no protocol for this review was externally registered. Second, the samples of the included studies did not exclusively comprise people living with dementia. However, it was important to analyze all studies of measurements to measure sleep among people living with dementia to obtain information about potentially usable measurements. Third, the actigraphy algorithms were not compared, because three out of four algorithms were provided by manufacturers of technological devices without transparent reporting. Thus, it was not possible to compare the different approaches in more detail.
Conclusion
This systematic review identified eight measurements that have undergone psychometric analysis. The theoretical definitions of sleep disturbances were often poorly described within included measurements. Therefore, it is difficult to determine whether the construct of sleep was comprehensively considered in the development process with respect to content validity and the specific aim of the measurement. Moreover, none of the measurements were evaluated across all psychometric properties. Furthermore, the large number of measurements with insufficient or unclear reliability and validity shows that further research is needed to accurately assess sleep disturbances among people living with dementia. Currently, none of the measurements identified here can be recommended for use without further development in intervention studies.
Criteria-based decision making (e.g., the COSMIN methodology) is necessary for the selection of the optimal measurement. The identified technological measurements can be used to obtain secondary outcomes but not primary outcomes. Previous studies used technological measurements to obtain primary outcomes, but this practice contradicts the recommendations of international diagnostic criteria. Future actigraphy studies should use open-access algorithms to increase transparency. Researchers should quantify sleep for illustrative purposes and not as a primary outcome for detecting sleep disturbances or as a reference standard in diagnostic accuracy studies. This review indicates that no currently available sleep-related measurement can be recommended without strong reservations for assessing sleep disturbances among people living with dementia in nursing homes. However, a combination of self- and proxy assessments seems to be the best option to achieve valid measurements of sleep disturbances among people living with dementia.
Abbreviations
COSMIN: COnsensus-based Standards for the Selection of Health Measurement INstruments; DSM-V-TR: Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, Text Revision; ESS: Epworth Sleepiness Scale; ESS-ALT: Epworth Sleepiness Scale Alternative Version; ICD-11: International Classification of Diseases 11th Revision; ICSD-3-TR: International Classification of Sleep Disorders – Third Edition, Text Revision; NPI: Neuropsychiatric Inventory; PSQI: Pittsburgh Sleep Quality Index; QAREL: Quality Appraisal Tool for Studies of Diagnostic Reliability; SCADS: Sleep Continuity Scale in Alzheimer’s Disease; SDI: Sleep Disorders Inventory.
Conflict of interests
None.
Funding
This review was undertaken at the DZNE, which receives basic funding from the Federal Ministry of Education and Research and the state of North Rhine-Westphalia.
Description of the authors’ roles
Study Design: JD, MND, MH
Literature Search: JD, MND
Data Analysis: JD, KW, MND
First Draft of the Manuscript: JD, KW, MH, MND
Manuscript Preparation: JD, KW, MH, MND
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S104161022400070X.