Introduction
Sleep is paramount to mental health, as poor sleep quality is associated with various mental health problems. Sleep problems are associated with stress and reduced quality of life [Reference Riemann, Nissen, Palagini, Otte, Perlis and Spiegelhalder1] and are considered a risk factor for mental health problems globally [2]. Sleep difficulties are part of the diagnostic criteria for major depressive disorder (MDD), anxiety disorder, and post-traumatic stress disorder (PTSD) [3]. As such, treatments enhancing sleep quality are key to improving mental health. In this paper, we use the terms mental health problems and mental health conditions interchangeably to cover both clinically diagnosed mental disorders and non-diagnosed mental health problems.
Besides playing an important role in various mental conditions, sleep quality is also strongly associated with general health and well-being. One recent meta-analysis showed that better sleep quality was associated with improvements in composite mental health, depression, anxiety, and rumination [Reference Scott, Webb, Martyn-St James, Rowse and Weich4]. Furthermore, sleep quality also has a positive impact on physical health and well-being. Studies have found that sleep improvements were linked to the improvement of self-reported physical health, work performance, and cognitive abilities [Reference Afolalu, Ramlee and Tang5–Reference Lim and Dinges7].
Due to the robust benefits of good sleep quality for both mental and physical health, different types of treatments for improving sleep quality have been investigated. Standard treatments for sleep problems include medication and cognitive behavioral therapy for insomnia (CBT-I). Medications such as eszopiclone and lemborexant have been shown to improve sleep quality, but their safety and potential adverse events, such as dependency and daytime drowsiness, should be considered [Reference De Crescenzo, D’Alò, Ostinelli, Ciabattini, Di Franco and Watanabe8]. Moreover, the recent European guidelines for insomnia treatment recommend pharmacotherapy only as a short-term solution [Reference Riemann, Espie, Altena, Arnardottir, Baglioni and Bassetti9]. An alternative option is CBT-I, which is recommended as a first-line treatment. Studies have found that CBT-I is an effective treatment for improving sleep quality in different populations [Reference Ma, Hall, Ngo, Liu, Bain and Yeh10, Reference Reynolds, Sweetman, Crowther, Paterson, Scott and Lechat11]. However, the use of CBT-I can be limited by the fact that it is demanding to complete, and not all patients achieve remission [Reference Krystal12, Reference Sateia, Buysse, Krystal, Neubauer and Heald13]. Furthermore, CBT-I is hard to access in many countries and may be expensive [Reference Wilson, Anderson, Baldwin, Dijk, Espie and Espie14]. Therefore, it is common to seek other treatment alternatives.
Many people do not seek clinical treatment for their sleep problems but instead engage in various alternative strategies to improve sleep [Reference Léger, Poursain, Neubauer and Uchiyama15]. One common strategy is to use music as a non-pharmacological sleep aid [Reference Morin, LeBlanc, Daley, Gregoire and Mérette16–Reference Brown, Qin and Esmail18]. In clinical research, music interventions for sleep have been examined in systematic reviews and meta-analyses showing a beneficial effect of music on sleep quality in different populations such as critically ill patients [Reference Kakar, Billar, van Rosmalen, Klimek, Takkenberg and Jeekel19, Reference Jespersen, Hansen and Vuust20], older adults [Reference Sella, Toffalini, Canini and Borella21, Reference Wang, Li, Zheng, Meng, Meng and Wang22], adults with insomnia [Reference Jespersen, Pando-Naude, Koenig, Jennum and Vuust23], and pregnant women [Reference Paulino, Borrelli, Faria-Schützer, Brito and Surita24, Reference Høgholt, Kraenge, Vuust, Kringelbach and Jespersen25]. Music is increasingly used in healthcare settings [Reference Jespersen, Gebauer and Vuust26], and several mechanisms may underlie the impact of music on sleep. For example, music can facilitate psychological/physical relaxation, regulate the emotional state of the listener, distract from negative thoughts, or mask noisy environments [Reference Dickson and Schubert27, Reference Jespersen28]. Biologically, studies have shown that music can impact part of the neurochemical system and down-regulate cortisol levels, which can facilitate relaxation and sleep [Reference Chanda and Levitin29, Reference Nilsson30]. The advantages of using music to improve sleep are that it is easy and safe to implement in individuals’ daily lives. So far, no systematic review has evaluated the evidence for the effect of music on sleep in adults with mental health problems.
Therefore, the main objective of this study was to evaluate whether music is effective as a sleep aid in adults with mental health problems, and the secondary aim was to explore the effect on related mental health outcomes.
Methods
Search strategy
This systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [Reference Page, Moher, Bossuyt, Boutron, Hoffmann and Mulrow31], and the review protocol was pre-registered in the PROSPERO database (registration number: CRD42023421382). We conducted a systematic search in the following electronic databases: PubMed, Scopus, Embase, PsycINFO, RILM, CINAHL, ClinicalTrial, Cochrane database, China National Knowledge Infrastructure, OSF database, and WHO trial register database. The search string included terms for music, sleep, and mental health problems. Specifically, search terms related to mental health problems were adapted from a previous systematic review on insomnia and mental disorders [Reference Hertenstein, Feige, Gmeiner, Kienzler, Spiegelhalder and Johann32] (see Supplementary Table S1).
Inclusion and exclusion criteria
For this systematic review, we followed the pre-specified inclusion and exclusion criteria. We included quantitative studies (randomized controlled trials, cohort studies, and case-control studies) investigating the effect of music interventions on sleep in adults (aged 18 years or older) with mental health problems. We included studies that employed music listening as an intervention. Other types of music interventions, such as songwriting, improvisation, or rhythmic auditory stimulation (in a non-music context), were excluded. In the case of combined interventions, studies were included when the music intervention was the main component. We excluded studies with participants who were diagnosed with neurodegenerative disorders or who were younger than 18 years. The primary outcome was subjective sleep quality. The secondary outcomes were objective sleep measurement (e.g., actigraphy and polysomnography sleep parameters) and psychological well-being (e.g., quality of life, severity of mental illness symptoms).
Data extraction and synthesis
Record selection and data extraction were independently conducted by two researchers (NZ and KVJ) using the online software Covidence [33]. After duplicate detection, the two researchers independently screened the titles and abstracts to remove non-relevant reports. The relevant full-text papers were retrieved and evaluated according to the pre-defined exclusion/inclusion criteria. Disagreements were resolved by discussion. For the included studies, the following information was extracted: publication year, study design, sample size, attrition rate, age, sex distribution, primary/secondary outcomes, and intervention characteristics. The extracted data were summarized in a narrative synthesis for each outcome. When enough studies reported a predefined outcome, we evaluated them and used the qualifying studies to conduct a meta-analysis. For the meta-analyses, we only included studies that compared the music intervention group with a control group, which was defined either as treatment-as-usual or no intervention. Studies that compared music intervention to an active intervention group (e.g., medication, acupuncture, or meditation) were excluded. Authors were contacted if additional data were needed for the meta-analyses.
We used random-effects models for both primary and secondary outcome meta-analyses. The outcomes were continuous, and we used standardized mean differences (SMD) when outcomes were measured on different scales. The direction of the scales was checked and reversed in case of discrepancy. To correct for potential small-sample size bias (n < 20), we applied Hedges’ g [Reference Hedges34]. We used a restricted maximum likelihood (REML) estimator in the model due to its robustness for continuous variables and the Knapp–Hartung adjustment to reduce the risk of committing a type I error. We used the I² statistic to test between-study heterogeneity. A group of outlier diagnostic tests was conducted to detect outliers or influential cases. Quantitatively identified influential cases were further evaluated by their traits, such as study design and intervention characteristics. If a study used multiple scales to measure the same psychophysiological construct (e.g., sleep quality, depression, anxiety), each scale was tested separately in the meta-analysis models.
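As an illustration of the small-sample correction described above, the following minimal Python sketch computes Hedges' g and its approximate variance for two independent groups. It is not the authors' code (their analyses were run in R); the function name `hedges_g` and the input values are hypothetical, and the correction factor uses the common approximation J = 1 − 3/(4·df − 1).

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance of g) for two independent groups."""
    df = n_t + n_c - 2
    # Pooled standard deviation across the two groups.
    sd_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / sd_pooled
    # Small-sample correction factor (common approximation).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample variance of d, scaled by j**2.
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return g, j**2 * var_d

# Hypothetical questionnaire scores (lower = better sleep); not data from any included study.
g, var_g = hedges_g(mean_t=8.0, sd_t=3.0, n_t=10, mean_c=11.0, sd_c=3.0, n_c=10)
print(round(g, 3), round(var_g, 3))
```

With small groups the correction shrinks the uncorrected standardized mean difference slightly toward zero, which is why it matters most for studies with few participants.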
All analyses were conducted through R version 4.2.0 [35], with “dmetar” [Reference Harrer, Cuijpers, Furukawa and Ebert36] and “meta” [Reference Balduzzi, Rücker and Schwarzer37] packages.
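For readers unfamiliar with the model, the sketch below shows in plain Python how a random-effects pooled estimate, the I² statistic, and a Knapp–Hartung standard error can be computed. It is an assumption-laden illustration, not the analysis pipeline used here: for brevity it swaps in the closed-form DerSimonian–Laird estimator of between-study variance instead of REML (which requires iteration), and the function name `random_effects` and all input values are hypothetical.

```python
import math

def random_effects(effects, variances):
    """Pool effect sizes; returns (pooled, se_hk, tau2, i2_percent)."""
    k = len(effects)
    w = [1 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # DerSimonian-Laird estimate (not REML)
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    # Knapp-Hartung variance: weighted residual variance, referred to a t distribution with k-1 df.
    var_hk = sum(wi * (y - pooled) ** 2 for wi, y in zip(w_re, effects)) / ((k - 1) * sum(w_re))
    return pooled, math.sqrt(var_hk), tau2, i2

# Hypothetical Hedges' g values and variances; not the included studies' data.
effects = [-0.2, -0.5, -0.8, -1.9]
variances = [0.10, 0.12, 0.09, 0.15]
pooled, se_hk, tau2, i2 = random_effects(effects, variances)
print(f"g = {pooled:.2f}, t = {pooled / se_hk:.2f} on {len(effects) - 1} df, I2 = {i2:.0f}%")
```

Because the Knapp–Hartung test uses a t distribution with k − 1 degrees of freedom, confidence intervals widen noticeably when only a handful of studies are pooled.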
Risk of bias assessment
To evaluate the risk of bias, we used the Cochrane risk-of-bias tool version I (RoB1) [Reference Higgins, Altman, Gøtzsche, Jüni, Moher and Oxman38]. Three researchers independently evaluated the studies (NZ, HNL, KVJ), and disagreements were resolved through discussion. Studies were rated as low, high, or unclear risk in the following domains: random sequence generation, allocation concealment, blinding of participants and researchers, blinding of outcome assessment, incomplete outcome data, selective reporting, and other biases. An unclear-risk rating was given when there was not enough information for the assessment. In addition, we used funnel plots to evaluate publication bias. When a researcher was an author of an included study, the risk of bias assessment was done by two authors not involved in that study.
Results
Search results
We conducted database searches in April 2023. After the duplication detection, we identified 1492 records. In the subsequent title and abstract screening phase, we excluded 1443 records. The remaining 49 records proceeded to the full-text retrieval stage. After the full-text assessment, 15 studies were included in the systematic review. Of these 15 studies, seven studies could be included in the meta-analysis for the primary outcome “subjective sleep quality”, five studies for a meta-analysis of the secondary outcome “depression”, and five studies for a meta-analysis of the secondary outcome “anxiety” (see Figure 1).
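The screening flow above can be checked with simple arithmetic. The short Python sketch below reproduces the reported counts; the number of records excluded at the full-text stage is derived here from the other figures and is not stated in the text.

```python
# Counts reported in the search results.
records_after_deduplication = 1492
excluded_at_title_abstract = 1443
included_in_review = 15

full_text_assessed = records_after_deduplication - excluded_at_title_abstract
# Derived value (not stated in the text): records excluded after full-text assessment.
excluded_at_full_text = full_text_assessed - included_in_review

print(full_text_assessed, excluded_at_full_text)
```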
Characteristics of included studies
The characteristics of the included studies are presented in Table 1. The publication year of the 15 included studies ranged from 2005 to 2022. The majority of the studies were conducted in China [Reference Liu, Yang, Wan, Tian, Wu, Xu and Zhang39–Reference JL, YH and LP41], Denmark [Reference Jespersen and Vuust42, Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43], Israel [Reference Blanaru, Bloch, Vadas, Arnon, Ziv and Kremer44, Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45], and the US [Reference Hernández-Ruiz46, Reference Wahbeh and Nelson47]; the other six studies were conducted in different countries [Reference Lu, Chen and Li48–Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53]. Nine of the included studies were randomized controlled trials [Reference Liu, Yang, Wan, Tian, Wu, Xu and Zhang39–Reference JL, YH and LP41, Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43, Reference Blanaru, Bloch, Vadas, Arnon, Ziv and Kremer44, Reference Wahbeh and Nelson47, Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52], and four studies were pre-registered before the experiments were conducted [Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43, Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53].
Abbreviations: AUD: Alcohol use disorder, BAI: Beck Anxiety Inventory, BDI: Beck Depression Inventory, CESD: Center for Epidemiologic Studies Depression Scale, HAMA: Hamilton Anxiety Scale, HAMD: Hamilton Depression Rating Scale, MADRS: Montgomery Asberg Depression Rating Scale, MSQ: mini-sleep questionnaire, PSQI: Pittsburgh Sleep Quality Index, PTSD: Post-traumatic stress disorder, RCSQ: Richards-Campbell Sleep Questionnaire, RCT: Randomized controlled trial, SAS: Zung Self-Rating Anxiety Scale, SDS: Zung Self-Rating Depression Scale, STAI: State-Trait Anxiety Inventory, TAU: Treatment as usual, *: Participants’ choice among pre-selected playlists.
a Studies qualified for sleep quality outcome meta-analysis.
b Studies qualified for depression outcome meta-analysis.
c Studies qualified for anxiety outcome meta-analysis.
The included studies had a total of 1,120 participants. The average sample size was 75, and individual study sample sizes ranged from 13 to 280 participants.
The largest group of studies (n = 7) focused on participants with depression [Reference Liu, Yang, Wan, Tian, Wu, Xu and Zhang39–Reference JL, YH and LP41, Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43, Reference Wahbeh and Nelson47, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52, Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53]. Other mental health problems included trauma and PTSD [Reference Jespersen and Vuust42, Reference Blanaru, Bloch, Vadas, Arnon, Ziv and Kremer44, Reference Hernández-Ruiz46], schizophrenia [Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45, Reference Lu, Chen and Li48], alcohol use disorder [Reference Ahlberg, Skårberg, Brus and Kjellin49], and stress [Reference Lee, Lee, Ahn, Hong and Yoon50]. Most of the studies used music described as soft and slow. Some studies used culturally unique music from specific countries [Reference Liu, Yang, Wan, Tian, Wu, Xu and Zhang39–Reference JL, YH and LP41, Reference Lu, Chen and Li48, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52]. The duration of the music intervention sessions ranged from 20 to 60 min with an average of 40 min across all studies. The intervention period ranged from 5 to 42 days. Five studies compared the music intervention group with other active intervention groups (e.g., acupuncture, medication, and cognitive behavioral therapy) [Reference Wahbeh and Nelson47, Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference Lee, Lee, Ahn, Hong and Yoon50–Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52], and the others used passive control or treatment-as-usual groups.
Outcomes
Twelve studies reported the primary outcome of subjective sleep quality using self-report questionnaires [Reference JL, YH and LP41–Reference Lu, Chen and Li48, Reference Lee, Lee, Ahn, Hong and Yoon50–Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53]. Among them, nine studies used The Pittsburgh Sleep Quality Index (PSQI) [Reference JL, YH and LP41–Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43, Reference Hernández-Ruiz46–Reference Lu, Chen and Li48, Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52, Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53]. The PSQI consists of 19 items with a total score ranging from 0 to 21 and higher scores indicating a more severe sleep problem [Reference Buysse, Reynolds, Monk, Berman and Kupfer54]. Two studies used the mini-sleep questionnaire (MSQ). This scale consists of 10 items. Each item ranges from 1 (never) to 7 (always), and higher scores indicate more severe sleep problems [Reference Zomer, Peled, Rubin, Lavie, Koella, Ruther and Schulz55]. One study applied the Richards-Campbell Sleep Questionnaire (RCSQ) [Reference de Niet, Tiemens and Hutschemaekers51, Reference Richards, O’Sullivan and Phillips56]. This scale consists of five items, with a higher score indicating better sleep quality.
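Because the questionnaires above are scored in opposite directions (higher PSQI and MSQ scores indicate worse sleep, whereas higher RCSQ scores indicate better sleep), effect sizes need a common sign before pooling. The Python sketch below illustrates one way to do this; the convention that negative values mean fewer sleep problems, the helper name `align_direction`, and the effect sizes are assumptions for illustration only.

```python
# True means a higher score indicates worse sleep, as described for each scale above.
HIGHER_IS_WORSE = {"PSQI": True, "MSQ": True, "RCSQ": False}

def align_direction(smd, scale):
    """Return the SMD signed so that negative values mean fewer sleep problems."""
    return smd if HIGHER_IS_WORSE[scale] else -smd

# Hypothetical effect sizes (music minus control); not results from the included studies.
raw = [("PSQI", -0.6), ("MSQ", -0.4), ("RCSQ", 0.5)]
aligned = [(scale, align_direction(smd, scale)) for scale, smd in raw]
print(aligned)
```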
The secondary outcomes included additional sleep and mental health measures. Five studies measured objective sleep parameters such as sleep efficiency, sleep latency, and total sleep time [Reference Liu, Yang, Wan, Tian, Wu, Xu and Zhang39, Reference Lu, Nie and Chen40, Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43–Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45]. In addition, two studies measured insomnia severity with the insomnia severity index (ISI) [Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Bastien, Vallières and Morin57]. ISI is a seven-item questionnaire with higher scores indicating more severe insomnia symptoms.
Regarding mental health outcomes, nine studies reported measurement of depression symptoms [Reference Liu, Yang, Wan, Tian, Wu, Xu and Zhang39–Reference JL, YH and LP41, Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43–Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45, Reference Wahbeh and Nelson47, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52, Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53], and the following measurements were employed: the Montgomery Asberg Depression Rating Scale (MADRS) [Reference Montgomery and Asberg58], the Beck Depression Inventory (BDI) [Reference Beck, Ward, Mendelson, Mock and Erbaugh59], the Zung Self-Rating Depression Scale (SDS) [Reference Zung60], the Center for Epidemiologic Studies Depression Scale (CES-D) [Reference Eaton, Muntaner, Smith, Tien, Ybarra and Maruish61], and Hamilton Depression Rating Scale (HAM-D) [Reference Hamilton62]. Eight studies measured anxiety symptoms [Reference Liu, Yang, Wan, Tian, Wu, Xu and Zhang39–Reference JL, YH and LP41, Reference Blanaru, Bloch, Vadas, Arnon, Ziv and Kremer44–Reference Hernández-Ruiz46, Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference Lee, Lee, Ahn, Hong and Yoon50]. The following anxiety scales were used: the Beck Anxiety Inventory (BAI) [Reference Beck, Epstein, Brown and Steer63], the State-Trait Anxiety Inventory (STAI) [Reference Spielberger, Gonzalez-Reigosa, Martinez-Urrutia, Natalicio and Natalicio64], the Zung Self-Rating Anxiety Scale (SAS) [Reference Zung65], and the Hamilton Anxiety Scale (HAMA) [Reference Hamilton66].
Risk of bias
We evaluated the risk of bias for all 15 studies using the Cochrane Risk of Bias assessment tool (RoB version I) (see Figure 2). The highest level of risk was in the domain of blinding of participants and researchers (see Supplementary Figure S1). Because music interventions are difficult to implement without the awareness of the participants, 14 out of 15 studies were rated as high risk of bias, as they did not include any procedures for double-blinding. One study reported the use of a double-blind method without providing additional details and was rated as an unclear risk of bias [Reference Lee, Lee, Ahn, Hong and Yoon50]. The second highest level of risk was in the domain of blinding of outcome assessment. Eleven studies did not provide information for this domain and were rated as unclear risk of bias. One study was rated as high risk of bias because only one researcher conducted the intervention; therefore, it was unlikely that the assessor was blinded [Reference Jespersen and Vuust42]. Three studies used blinding procedures for the outcome examiner and were evaluated as low risk of bias [Reference JL, YH and LP41, Reference Wahbeh and Nelson47, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52].
The results of the selective reporting domain were mostly unclear due to the low study pre-registration rate. Specifically, 10 studies did not pre-register their experiments. Four studies that had pre-registration were rated as low risk of bias after we compared their reported and pre-registration outcomes [Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43, Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53]. One study was rated at a high risk of bias due to the multiple comparisons on sub-items from the sleep quality questionnaire [Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45].
For the random sequence generation domain, five studies did not use the randomized controlled method and were rated as high risk of bias [Reference Jespersen and Vuust42, Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45, Reference Hernández-Ruiz46, Reference de Niet, Tiemens and Hutschemaekers51, Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53]. Five studies had randomized procedures, but the method was not further explained and therefore, was rated as an unclear risk of bias [Reference Lu, Nie and Chen40, Reference Blanaru, Bloch, Vadas, Arnon, Ziv and Kremer44, Reference Wahbeh and Nelson47, Reference Lu, Chen and Li48, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52]. Five studies used a randomized controlled method with a clear explanation of their method and were rated as low risk of bias [Reference JL, YH and LP41, Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43, Reference Lu, Chen and Li48–Reference Lee, Lee, Ahn, Hong and Yoon50]. These ratings were the same for the allocation concealment domain.
Most of the studies (n = 11) showed a low attrition rate and were rated as low risk of bias. Four studies were rated as high risk of bias in the domain of incomplete outcome data due to their high drop-out rate [Reference Jespersen and Vuust42, Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45, Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference de Niet, Tiemens and Hutschemaekers51]. For the other bias domain, one study did not report whether the adults with PTSD received other treatments during the intervention; it was therefore rated as an unclear risk [Reference Jespersen and Vuust42]. The other studies were rated as low risk of bias in this domain.
We used funnel plots to examine the risk of publication bias. Visual inspection revealed asymmetrical patterns in the funnel plots. After the removal of the outliers in all meta-analyses, the data points became more symmetrical than in the original models (see Supplementary Figures S2.1, S2.2, S2.3). However, due to the small number of studies (n < 10) and high heterogeneity (I² > 75%), the results could be biased.
Primary outcome
A meta-analysis was conducted on the subjective sleep quality outcome (n = 7). As mentioned above, this outcome was reported by 12 studies. Three studies were excluded from the meta-analysis due to comparing with active intervention groups [Reference Wahbeh and Nelson47, Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52], and two were excluded because there was no control group [Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45, Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53].
The results of the meta-analysis showed a large but statistically non-significant effect on subjective sleep quality between the music intervention and control groups in individuals with mental health problems (g = −1.09, 95% CI [−2.37, 0.19], t = −2.09, p = 0.0814). The model showed high heterogeneity (I² = 90.5%), further supporting a random-effects model. We ran multiple diagnostic tests and identified Lu et al. (2022) [Reference Lu, Chen and Li48] as an outlier (see Supplementary Figure S3.1). The 95% CI of the outlier case did not overlap with that of the pooled effect. After re-running the meta-analysis without the outlier, we found a moderate, statistically significant effect of the music intervention on sleep quality (g = −0.66, 95% CI [−1.19, −0.13], t = −3.21, p = 0.0236) (see Figure 3). We examined the qualitative characteristics of the outlier case, but the study design elements (e.g., intervention duration and music selection) and sample size did not stand out from the others. Similarly, the study did not stand out regarding the risk of bias assessment.
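As a consistency check on the reported statistics, the confidence intervals can be approximately recovered from the pooled effect and its t statistic. The Python sketch below assumes that the Knapp–Hartung test uses k − 1 degrees of freedom (six with all seven studies, five after removing the outlier) and hard-codes standard t-table critical values; the helper name `ci_from_t` is hypothetical, and because the inputs are rounded published values, agreement is only expected to two decimals.

```python
# Two-sided 95% critical values of Student's t (standard table values), keyed by degrees of freedom.
T_CRIT = {5: 2.571, 6: 2.447}

def ci_from_t(g, t_stat, k):
    """Recover the 95% CI from the pooled effect, its t statistic, and the number of studies."""
    se = abs(g / t_stat)  # standard error implied by the reported t statistic
    half_width = T_CRIT[k - 1] * se
    return round(g - half_width, 2), round(g + half_width, 2)

print(ci_from_t(-1.09, -2.09, k=7))  # all seven studies
print(ci_from_t(-0.66, -3.21, k=6))  # outlier removed
```

Both intervals agree with the reported ones, and the check shows how much of the interval width in the seven-study model comes from the large t critical value at low degrees of freedom.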
Similar to the results of the meta-analysis, four out of the five studies not included in the meta-analysis showed subjective sleep quality improvement within the music intervention group [Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45, Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52, Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53]; one study found no effect within the music group [Reference Wahbeh and Nelson47]. Music intervention groups were found to have no significant differences in sleep outcomes from other active intervention groups such as ASMR listening or medication [Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52].
Secondary outcomes
Depression
The second meta-analysis was conducted on depression symptoms (n = 5). Depression symptoms were reported in nine studies. Following the same exclusion/inclusion criteria as the sleep quality meta-analysis, four studies were excluded due to comparison with active intervention groups or no control group [Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45, Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52, Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53]. The results showed a large but statistically non-significant effect on depression symptoms between the music intervention and control groups (g = −1.01, 95% CI [−2.64, 0.63], t = −1.71, p = 0.1633). The random-effects model showed high heterogeneity (I² = 93.7%). We then ran the diagnostic analyses and identified Yang et al. (2021) [Reference JL, YH and LP41] as an influential case (see Supplementary Figure S3.2). After we re-ran the model without the outlier, the effect was moderate and still statistically non-significant (g = −0.46, 95% CI [−1.36, 0.43], t = −1.64, p = 0.1991) (see Figure 4). When examining the characteristics of the outlier study [Reference JL, YH and LP41], we found that the baseline depression symptoms of the participants were high (HAMD = 23). However, this single trait was not strong enough to explain the large intervention effect of the study. Therefore, the cause of the large effect size remained unclear.
All four studies that were not included in the meta-analysis showed improvement in depression symptoms within the music group [Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45, Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52, Reference Braun Janzen, Al Shirawi, Rotzinger, Kennedy and Bartel53]. Furthermore, the music groups had no statistically significant differences compared with other active intervention groups, including medication and ASMR sounds [Reference Lee, Lee, Ahn, Hong and Yoon50, Reference Deshmukh, Sarvaiya, Seethalakshmi and Nayak52].
Anxiety
The third meta-analysis was conducted on anxiety symptoms (n = 5). Eight of the included studies reported anxiety symptoms. Two were excluded due to comparison with active intervention groups [Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference Lee, Lee, Ahn, Hong and Yoon50], and one was excluded due to no control group [Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45]. The results showed a large but statistically non-significant effect on anxiety symptoms between the music intervention and control groups (g = −1.97, 95% CI [−4.59, 0.65], t = −2.09, p = 0.1052). The diagnostic analyses (see Supplementary Figure S3.3) identified the same influential case as in the depression meta-analysis, Yang et al. (2021) [Reference JL, YH and LP41]. After running the model without the outlier, the effect was still large and statistically non-significant (g = −1.12, 95% CI [−2.25, 0.01], t = −3.15, p = 0.0512) (see Figure 5).
Another sensitivity analysis was conducted because one of the included studies [Reference Blanaru, Bloch, Vadas, Arnon, Ziv and Kremer44] measured anxiety twice with two different scales in one experiment (STAI and HAMA). HAMA was used for the final analysis since there were no statistically significant differences between the two scales.
The studies not included in the meta-analysis showed mixed results on the anxiety outcomes. Two studies found improvement in anxiety symptoms within the music groups [Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference Lee, Lee, Ahn, Hong and Yoon50], but one did not [Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45]. In addition, no differences were found between the music group and other active intervention groups, including acupuncture and ASMR sounds [Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference Lee, Lee, Ahn, Hong and Yoon50].
Other sleep outcomes
Five studies reported objective sleep outcomes. Four studies used actigraphy as an objective sleep measure [Reference Liu, Yang, Wan, Tian, Wu, Xu and Zhang39, Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43–Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45], and one did not specify the method for objective measurement of sleep [Reference Lu, Nie and Chen40]. Within-group or between-group sleep improvement with the music intervention was found in sleep outcomes such as sleep efficiency [Reference Liu, Yang, Wan, Tian, Wu, Xu and Zhang39, Reference Lu, Nie and Chen40, Reference Blanaru, Bloch, Vadas, Arnon, Ziv and Kremer44, Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45], sleep latency [Reference Lu, Nie and Chen40, Reference Blanaru, Bloch, Vadas, Arnon, Ziv and Kremer44, Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45], and wake after sleep onset [Reference Liu, Yang, Wan, Tian, Wu, Xu and Zhang39, Reference Lu, Nie and Chen40, Reference Blanaru, Bloch, Vadas, Arnon, Ziv and Kremer44]. Two studies reported total sleep time with mixed results [Reference Lu, Nie and Chen40, Reference Bloch, Reshef, Vadas, Haliba, Ziv and Kremer45]. One additional study found no improvement or differences in all actigraphy sleep measures, possibly due to insufficient sleep log data [Reference Lund, Pedersen, Heymann-Szlachcinska, Tuszewska, Bizik and Larsen43].
Insomnia severity was reported by two studies, and they both found improvement within the music group [Reference Ahlberg, Skårberg, Brus and Kjellin49, Reference Lee, Lee, Ahn, Hong and Yoon50]; regarding the between-group comparison, neither acupuncture [Reference Ahlberg, Skårberg, Brus and Kjellin49] nor the ASMR intervention group [Reference Lee, Lee, Ahn, Hong and Yoon50] had statistically significant differences compared to the music groups.
Discussion
The results of this systematic review (n = 1,120 participants) and meta-analysis showed that listening to music can have a beneficial effect on sleep quality in individuals with mental health problems. Few studies reported other sleep outcomes, and there was no clear effect on depressive symptoms and anxiety. The quality of the studies was limited by the difficulty of blinding participants to the intervention and by the unclear blinding of the outcome assessors. Seven of the included studies could be pooled in a meta-analysis. One study was identified as an influential case with a much larger effect than the others. When excluding this case, the meta-analysis showed a moderate reduction in sleep problems with the music intervention compared to treatment as usual or no-intervention control groups.
Our results align with reviews of music and sleep in other populations. Recently, a number of reviews have been published showing the beneficial effects of music on sleep quality in people with poor sleep related to age [Reference Wang, Li, Zheng, Meng, Meng and Wang22], hospitalization [Reference Kakar, Billar, van Rosmalen, Klimek, Takkenberg and Jeekel19, Reference Jespersen, Hansen and Vuust20], and insomnia [Reference Jespersen, Pando-Naude, Koenig, Jennum and Vuust23]. These meta-analyses have found moderate to large positive effects of music on sleep quality. Our meta-analysis showed a moderate effect size, and the size of the effect in the included studies varied substantially, reflecting high heterogeneity. This could indicate that the effect may not be stable across studies. However, heterogeneity was substantially reduced when excluding one outlier study with an extreme effect [Reference Lu, Chen and Li48]. We could not identify any study design elements or intervention characteristics explaining why the sleep quality effect size was about four times larger than in the other studies. All participants from the outlier study were selected from the same center, and one possible explanation could be that there were unknown confounders that amplified the effect. Still, it remains unclear what caused this distinctly larger effect; therefore, we presented the pooled results with and without the outlier for transparency and completeness.
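To make the outlier sensitivity check described above concrete, the following is a minimal sketch of random-effects pooling with a leave-one-out comparison. It assumes the DerSimonian-Laird estimator, which may differ from the estimator used in the actual analysis, and the effect sizes and variances are hypothetical placeholders rather than data from the included studies. The function names `pool_random_effects` and `leave_one_out` are likewise illustrative.

```python
from typing import List, Tuple


def pool_random_effects(effects: List[float], variances: List[float]) -> Tuple[float, float]:
    """Return (pooled effect, I^2 in percent) using DerSimonian-Laird.

    Assumption: each study contributes one standardized effect size
    and its sampling variance.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the between-study variance tau^2.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, i2


def leave_one_out(effects: List[float], variances: List[float], index: int) -> Tuple[float, float]:
    """Re-pool after dropping the study at position `index`."""
    kept_effects = effects[:index] + effects[index + 1:]
    kept_variances = variances[:index] + variances[index + 1:]
    return pool_random_effects(kept_effects, kept_variances)


if __name__ == "__main__":
    # Hypothetical data: six similar studies plus one extreme study (last).
    demo_effects = [-0.5, -0.6, -0.4, -0.7, -0.5, -0.6, -2.4]
    demo_variances = [0.05] * 7
    print("all studies:", pool_random_effects(demo_effects, demo_variances))
    print("without outlier:", leave_one_out(demo_effects, demo_variances, 6))
```

With placeholder data of this shape, dropping the extreme study lowers both the pooled effect magnitude and I², which is the pattern the comparison with and without the outlier is meant to expose.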
The characteristics of the music intervention varied among the studies. Specifically, most of the studies used music playlists that were pre-selected by the researchers, and only a few studies allowed the participants to choose. Previous reviews have found no evidence that participants’ influence on the choice of music is crucial for the effect of music on sleep [Reference Jespersen, Pando-Naude, Koenig, Jennum and Vuust23], but music preference has been found essential in other domains, such as the analgesic effects of music [Reference Valevicius, Lépine Lopez, Diushekeeva, Lee and Roy67]. The role of the specific music selection for sleep has not yet been investigated in different populations. It could be that for sleep problems in adults with mental health problems, the effect of music on sleep is facilitated particularly by distraction and emotion regulation, making music preference more important. The intervention music was generally described as soothing and relaxing, and culturally unique music was also implemented by some researchers.
Still, no study reported how well the participants liked the intervention music, and only two studies [Reference Hernández-Ruiz46, Reference Wahbeh and Nelson47] allowed participants to completely select their own music. Another feature that varied substantially was the total intervention exposure, ranging from less than 5 h to up to 30 h. Interestingly, the differences in the intervention time did not always align well with the effect size of the studies. For instance, after we compared different units of music intervention duration (e.g., duration per session, intervention period (days), and total intervention exposure (time per session × days)) with both single-study effect size and within-group PSQI improvement, we found that longer intervention time was not always associated with a larger effect on sleep improvement (see Supplementary Figures S4.1, S4.2). This is in contrast to previous research indicating a dose–response relationship between music and sleep improvement [Reference Dickson and Schubert68]. A next step for future research would, therefore, be to explore the role of music preference and whether there should be a gold standard when designing music interventions for sleep in individuals with mental health problems. Based on the results of this systematic review, the recommendations for a music sleep intervention in adults with mental health problems would be 30–60 min of listening to soothing music every night at bedtime for a minimum of 5 days. In addition, the severity of mental health problems should be considered in relation to the intervention period.
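As a worked illustration of the exposure measure used above (time per session × days), the short sketch below converts a session length and an intervention period into total hours. It assumes one session per day, in line with nightly bedtime listening; the function name `total_exposure_hours` is hypothetical and not taken from the study's analysis code.

```python
def total_exposure_hours(minutes_per_session: float, days: int) -> float:
    """Total intervention exposure in hours.

    Assumption: exactly one listening session per day.
    """
    if minutes_per_session < 0 or days < 0:
        raise ValueError("session length and number of days must be non-negative")
    return minutes_per_session * days / 60.0


if __name__ == "__main__":
    # Lower and upper bounds of the recommended minimum schedule:
    # 30 to 60 min per night for 5 days.
    low = total_exposure_hours(30, 5)
    high = total_exposure_hours(60, 5)
    print(f"minimum recommended exposure: {low} to {high} h")
```

Under this assumption, the recommended minimum schedule corresponds to 2.5 to 5 h of total exposure, which sits at the lower end of the range observed across the included studies.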
One limitation of this study was the inclusion of both RCT and non-RCT studies in our meta-analyses. The reason for including both types of studies was to provide a general overview of this topic since we anticipated at the pre-registration stage that there would be limited numbers of qualified studies. Indeed, this was true, and as such, we were not able to quantitatively explore subgroup differences related to the type or severity of mental health problems, the intervention duration, or identify other confounders such as treatment-as-usual medication doses. Another limitation was the possibility that the small sample size could bias the results of the meta-analysis. To further test the robustness of the results, a follow-up Bayesian sensitivity meta-analysis was conducted. The results of the Bayesian approach were consistent with the frequentist approach and thereby support the results of the meta-analysis (see Supplementary Figure S5).
Conclusion
This systematic review and meta-analysis suggests that music interventions may improve sleep quality among individuals with mental health problems, although more high-quality studies are needed to fully establish the effect. Sleep problems are highly prevalent among people with mental health problems and should be addressed independently. In this regard, music may serve as a safe, low-cost, and easily accessible intervention option.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1192/j.eurpsy.2024.1773.
Data availability statement
The data that support the findings of this study are openly available in: https://github.com/Tuesday1234567/music_meta.
Acknowledgements
Center for Music in the Brain is funded by the Danish National Research Foundation (DNRF117).
Author contribution
NZ: Investigation, Data curation, Formal analysis, Visualization, Interpretation, Writing – original draft. HNL: Formal analysis, Interpretation, Writing – review and editing. KVJ: Conceptualization, Methodology, Investigation, Project administration, Supervision, Interpretation, Writing – review and editing.
Financial support
There was no specific funding for this project.
Competing interest
NZ and KVJ declare no conflicts of interest. HNL declares conflicting interests due to ownership and sales of the MusicStar app.