Measuring neuropsychiatric symptoms in patients with early cognitive decline using speech analysis

Alexandra König; Elisa Mallick; Johannes Tröger; Nicklas Linz; Radia Zeghari; Valeria Manera; Philippe Robert

doi:10.1192/j.eurpsy.2021.2236

Published online by Cambridge University Press: 13 October 2021

Alexandra König*: Affiliation:
Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France Clinical Research, ki:elements, Saarbrücken, Germany CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d’Azur, Nice, France
Elisa Mallick: Affiliation:
Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France Clinical Research, ki:elements, Saarbrücken, Germany CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d’Azur, Nice, France
Johannes Tröger: Affiliation:
Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France Clinical Research, ki:elements, Saarbrücken, Germany CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d’Azur, Nice, France
Nicklas Linz: Affiliation:
Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France Clinical Research, ki:elements, Saarbrücken, Germany CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d’Azur, Nice, France
Radia Zeghari: Affiliation:
Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France Clinical Research, ki:elements, Saarbrücken, Germany CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d’Azur, Nice, France
Valeria Manera: Affiliation:
Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France Clinical Research, ki:elements, Saarbrücken, Germany CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d’Azur, Nice, France
Philippe Robert: Affiliation:
Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France Clinical Research, ki:elements, Saarbrücken, Germany CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d’Azur, Nice, France
*: *Author for correspondence: Alexandra König, E-mail: alexandra.konig@inria.fr

Abstract

Background

Certain neuropsychiatric symptoms (NPS), namely apathy, depression, and anxiety, have shown great value in predicting dementia progression and may therefore offer a window of opportunity for timely diagnosis and treatment. However, sensitive and objective markers of these symptoms are still missing. The present study therefore investigates the association between automatically extracted speech features and NPS in patients with mild neurocognitive disorders.

Methods

Speech of 141 patients aged 65 or older with neurocognitive disorder was recorded while they performed two short narrative speech tasks. NPS were assessed with the neuropsychiatric inventory. Paralinguistic markers relating to prosodic, formant, source, and temporal qualities of speech were automatically extracted and correlated with NPS. Machine learning experiments were carried out to validate the diagnostic power of the extracted markers.

Results

Different speech variables are associated with specific NPS: apathy correlates with temporal aspects and anxiety with voice quality, and this was mostly consistent between males and females after correction for cognitive impairment. Machine learning regressors are able to extract information from speech features and perform above baseline in predicting anxiety, apathy, and depression scores.

Conclusions

Different NPS seem to be characterized by distinct speech features, which are easily extractable automatically from short vocal tasks. These findings support the use of speech analysis for detecting subtypes of NPS in patients with cognitive impairment. This could have great implications for the design of future clinical trials as this cost-effective method could allow more continuous and even remote monitoring of symptoms.

Keywords

apathy; depression; mild neurocognitive disorders; neuropsychiatric symptoms; speech analysis; vocal parameters

Type: Research Article
Information: European Psychiatry, Volume 64, Issue 1, 2021, e64

DOI: https://doi.org/10.1192/j.eurpsy.2021.2236
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © The Author(s), 2021. Published by Cambridge University Press on behalf of the European Psychiatric Association

Introduction

The number of patients with cognitive decline, and thus at risk of developing dementia, continues to rise as the population ages [Reference Satizabal, Beiser and Seshadri1]. Neuropsychiatric symptoms (NPS) can be defined as noncognitive behavioral disturbances such as depression or agitation and are extremely common in dementia, appearing in up to 98% of patients at some point during the disease [Reference Phan, Osae, Morgan, Inyang and Fagan2]. It has been argued that NPS can appear relatively early, even prior to dementia diagnosis, and may thus offer a window of opportunity for timely detection and treatment [Reference Gallagher, Fischer and Iaboni3]. Since pathological changes often seem to occur in the brain before the onset of clinical dementia, the damage due to neurodegeneration may subtly alter emotion regulation and behavior as cognition starts to decline [Reference Iaboni and Rapoport4]. This suggests that neurobiological mechanisms might underpin the appearance of these symptoms [Reference Livingston, Sommerlad, Orgeta, Costafreda, Huntley and Ames5], contributing to an increased vulnerability and risk of conversion [Reference Liew6].

NPS can place a great burden on patients and caregivers, severely affecting disease course and quality of life [Reference Radue, Walaszek and Asthana7]. Interestingly, even in cognitively normal subjects, certain types of NPS, especially affective ones such as depression, apathy, and anxiety, have been reported to be promising prognostic factors for predicting cognitive deterioration [Reference Hu, Shu, Wu, Chen, Hu and Zhang8–Reference Jang, Ho, Blanken, Dutt and Nation10]. Anxiety seems to be an important predictor of amyloid pathology. The presence of apathy may increase the risk of conversion from mild cognitive impairment (MCI) to dementia [Reference Padala, Padala, Lensing, Jackson, Hunter and Parkes11], and a similar association is found with depression [Reference Goukasian, Hwang, Romero, Grotts, Do, Groh, Bateman and Apostolova12]. The differential utility of NPS for predicting dementia subtypes is currently under intensive investigation [Reference Liew6], since it could lead to earlier, tailored treatment.

Despite the immense clinical impact of NPS, these clinical dimensions remain insufficiently identified and therefore untreated. Evaluation and management of NPS remain challenging because NPS are not systematically assessed in current clinical practice in patients with cognitive decline [Reference Gitlin, Marx, Stanley, Hansen and Van Haitsma9]; they are often assessed only when family members bring their observations to the attention of health care professionals. However, there is evidence that early treatment, particularly in the form of nonpharmacological intervention, can be effective in reducing the severity of NPS [Reference Radue, Walaszek and Asthana7] and even in slowing cognitive decline [Reference Padala, Padala, Lensing, Jackson, Hunter and Parkes11].

Current assessment methods rely mostly on clinical scales such as the neuropsychiatric inventory (NPI) [Reference Goukasian, Hwang, Romero, Grotts, Do, Groh, Bateman and Apostolova13], which in turn depends heavily on the informants’ objective reporting ability [Reference Cummings, Mega, Gray, Rosenberg-Thompson, Carusi and Gornbein14]. The clinician rating may be influenced as well by caregivers’ personal experiences, and by limited access to reliable information. This is why clinicians and researchers are looking for ways to complement classical assessment scales with more objective indexes, which are able to detect subtle variations in the presentation of NPS as well as its different subtypes [Reference Stella15, Reference König, Aalten, Verhey, Bensadoun, Petit and Robert16].

Mobile technologies may be a promising solution for objectively detecting and monitoring fine-grained changes in behavior, and thus the appearance of NPS, in a continuous manner [Reference Gros, Bensamoun, Manera, Fabre, Zacconi-Cauvin and Thummler17]. Digital biomarkers, which are measures of behavioral data collected by means of digital devices, are increasingly studied in patients with cognitive decline as a potential alternative assessment approach [Reference Kourtis, Regele, Wright and Jones18]. One promising avenue lies in recent advances in computational linguistics and language processing, which have led to the use of automatic speech analysis in the assessment of various clinical manifestations [Reference Piau, Wild, Mattek and Kaye19, Reference Albuquerque, Valente, Teixeira, Figueiredo, Sa-Couto and Oliveira20], making it potentially useful for assessing different NPS.

It has been clearly demonstrated that changes in speech and language patterns can be indicative of cognitive decline due to neuropathological processes. Different affective states, including depression, anxiety, and apathy, also alter mechanisms involved in speech production, namely through variations in muscle tension and tonality, which can affect prosody and the quality of speech. Reduced or increased muscle tension will influence vocal tract dynamics as well as articulation behavior [Reference Robin, Harrison, Kaufman, Rudzicz, Simpson and Yancheva21], which, with recent advances in speech analysis technologies, may now be easily detectable.

In depression, which is characterized by persistent sadness and anhedonia, accompanied by disturbed sleep/appetite, fatigue, and even suicidal ideation, it is audible that patients show a reduced speech rate and prosodic range and sound rather monotonous. These characteristics could serve as markers if objective measurements can quantify them [Reference Cummins, Scherer, Krajewski, Schnieder, Epps and Quatieri22]. Several groups have investigated the use of automatic speech analysis as an additional assessment tool, with an extensive review published by Cummins et al. [Reference Robin, Harrison, Kaufman, Rudzicz, Simpson and Yancheva21] outlining the interest of using speech as a key objective marker for disease progression.

Prosodic, articulatory, and acoustic features of speech seem to be affected by depression severity and can thus easily be identified and used for continuously monitoring patients. Given the considerable overlap of symptoms between depression and apathy, namely the lack of interest and goal-oriented behavior, we anticipate similar results when applying speech technology methods to apathy, with a slightly different pattern with regard to emotionally triggered speech.

In anxiety, which is characterized by excessive fear or worrying, many studies found a significant increase in mean fundamental frequency (f0) [Reference Lanctot, Amatniek, Ancoli-Israel, Arnold, Ballard and Cohen-Mansfield23]. Jitter and shimmer were also significantly higher in anxious patients [Reference Low, Bentley and Ghosh24, Reference Özseven, Düğenci, Doruk and Kahraman25]. In a previous study, we showed that certain speech characteristics extracted over the phone correlated with stress levels in both genders; mainly, spectral (i.e., formant) features, such as the mel-frequency cepstral coefficient (MFCC), and prosodic characteristics, such as the fundamental frequency, appeared to be sensitive to stress [Reference Silber-Varod, Kreiner, Lovett, Levi-Belz and Amir26].

In apathy, which is characterized by lack of motivation, decreased initiative, and emotional indifference, we demonstrated that certain paralinguistic features, namely temporal aspects of speech as well as prosodic characteristics correlate significantly with levels of symptom severity [Reference König, Riviere, Linz, Lindsay, Elbaum and Fabre27].

Hence, given the increasingly important role of NPS in early dementia diagnosis [Reference König, Linz, Zeghari, Klinge, Tröger and Alexandersson28], it seems worthwhile investigating the additional value of this method further for differential diagnosis of these most frequently found NPI symptoms. Therefore, the purpose of this study is to determine whether automatic speech analysis and the automatic extraction of speech features can be useful for the early detection of specific NPS—namely depression, apathy, and anxiety—in patients with mild neurocognitive disorders.

Materials and Methods

Participants

All participants were recruited through the Memory Clinic located at the Institut Claude Pompidou in the Nice University Hospital. A total of 141 patients aged 65 or older with mild neurocognitive disorder according to the Diagnostic and Statistical Manual of Mental Disorders, fifth edition (DSM-5) [Reference Cummings, Ritter and Rothenberg29] were included in this study. Inclusion required the presence of cognitive decline in memory and/or executive function, with or without interference with performance in activities of daily living, based on previously performed evaluations. Patients coming to the Memory Clinic underwent a clinical assessment including, among others, the Mini-Mental State Examination (MMSE) [30] and the NPI interview [Reference Goukasian, Hwang, Romero, Grotts, Do, Groh, Bateman and Apostolova13].

Study procedure

Speech features vary naturally between males and females. These differences have been leveraged in gender classification through speech analysis based on pitch and formant frequencies [Reference Folstein, Folstein and McHugh31], harmonic-to-noise ratio [Reference Childers and Wu32], and linear predictive components and MFCC [Reference Heffernan33]. Previous work found gender-dependent differences in the effects of apathy on speech [Reference Wu and Childers34], as well as in the effects of depression and in the effectiveness of classifiers for its detection [Reference Linz, Klinge, Troger, Alexandersson, Zeghari and Robert35]. This study therefore considers males and females separately, resulting in two experimental groups of 92 females and 49 males.

Participants were all native speakers of French and excluded if they had any major auditory or language problems, history of head trauma, loss of consciousness, psychotic or aberrant motor behavior, or history of drug abuse. Written informed consent was obtained from all subjects prior to the experiments. The study was approved by the Nice Ethics Committee (ELEMENT ID RCB 2017-A01896-45, MoTap ID RCB 2017-A01366–47) and was conducted according to the Declaration of Helsinki.

Speech task (positive and negative story)

Free and natural speech tasks are capable of eliciting emotional reactions (or a lack thereof) by asking participants to describe events that triggered recent affective arousal. Affective arousal is expected to impact speech in NPS, whereas simple vocal exercises or reading tasks do not capture the acoustic effects of alterations in affective states [Reference Robin, Harrison, Kaufman, Rudzicz, Simpson and Yancheva21]. Using emotionally induced free speech tasks, such as describing events that have aroused significant emotions, allows for a greater range of emotional effects [Reference Low, Maddage, Lech, Sheeber and Allen36]. Therefore, the participants were asked to (a) talk about a positive event in their life and (b) talk about a negative event in their life. Instructions for the speech tasks (“Can you tell me in one minute about a positive/negative event in your life?”) were prerecorded by one of the psychologists and played from a tablet computer, ensuring standardized instruction across both experiments. The answers were recorded with the tablet’s internal microphone.

Processing of speech data

Audio features were extracted directly and automatically from the speech signal. This form of speech analysis does not consider the semantic content of what a participant said, which increases the applicability of the results in a clinical scenario, as no prior processing, such as transcription, is required. For each speech task (positive and negative story), features were extracted separately from the following main areas. Temporal features include measures of speech proportion (e.g., length of pauses and length of speaking segments), the connectivity of speech segments, and general speaking rate. Prosodic features relate to long-time variations in perceived stress and rhythm in speech and also measure alterations in personal speaking style (e.g., perceived pitch and intonation of speech). Formant features represent the dominant components of the speech spectrum and carry information about the acoustic resonance of the vocal tract and its use; these markers are often indicative of problems with articulatory coordination in speech motor control disorders (ref Sapir). Source features relate to the source of voice production, the airflow through the glottal speech production system; they operationalize irregularities in vocal fold movement (e.g., measures of voice quality). Spectral features characterize the speech spectrum, that is, the frequency distribution of the speech signal at a specific time instance, in some high-dimensional representation. An overview and explanation of the extracted speech features can be found in Table 1.

Table 1. Overview and explanation of extracted speech features.

Abbreviation: MFCC, mel-frequency cepstral coefficient.
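As an illustration of the temporal feature family described above, the following is a minimal sketch of how measures such as speech ratio, total phonation time, and number of pauses could be derived once a recording has been reduced to a per-frame voice-activity mask. The function name, the frame length, and the minimum pause duration are hypothetical choices made for this example; the source does not specify how the study's feature extractor defines them.

```python
from itertools import groupby


def temporal_features(voiced, frame_s=0.01, min_pause_s=0.25):
    """Summarise a per-frame voice-activity mask (True = speech detected).

    frame_s and min_pause_s are illustrative assumptions, not study settings.
    """
    # Collapse the mask into runs of (is_speech, duration in seconds).
    runs = [(is_speech, sum(1 for _ in grp) * frame_s)
            for is_speech, grp in groupby(voiced)]
    phonation = sum(d for is_speech, d in runs if is_speech)
    # Only silences between two speech runs count as pauses;
    # leading and trailing silence is ignored.
    pauses = [d for is_speech, d in runs[1:-1]
              if not is_speech and d >= min_pause_s]
    total = len(voiced) * frame_s
    return {
        "total_phonation_time": phonation,
        "speech_ratio": phonation / total if total else 0.0,
        "number_of_pauses": len(pauses),
        "mean_pause_length": sum(pauses) / len(pauses) if pauses else 0.0,
    }


# Toy mask: 0.5 s silence, 2 s speech, 1 s pause, 1 s speech, 0.5 s silence.
mask = [False] * 5 + [True] * 20 + [False] * 10 + [True] * 10 + [False] * 5
print(temporal_features(mask, frame_s=0.1, min_pause_s=0.5))
```

In practice the mask would come from a voice-activity detector applied to the recording; that step is outside the scope of this sketch.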

Statistical analysis

Statistical analysis was run using R software (version 3.4.02). QQ-plots as well as Shapiro–Wilk tests indicated a violation of normality for the depression scores. Therefore, we used Spearman rank correlations to test whether speech as well as transcript data were correlated with NPI scores (only for the items apathy, depression, and anxiety). We applied the Benjamini–Hochberg procedure within each feature category to correct for multiple comparisons. Three variable groups were excluded from this analysis: MFCCs, Delta, and Delta Delta values were not considered in group comparisons, as they do not have a directly human-interpretable explanation and would therefore provide little insight. These variables were included in the machine learning experiments, where explainability is not required.
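The two statistical steps named above can be sketched in a few lines. The study ran them in R; the dependency-free Python below is only an illustration of the techniques, and all function names are hypothetical. It computes Spearman's rho as the Pearson correlation of tie-averaged ranks and applies the Benjamini–Hochberg step-up rule to a list of p-values. The p-values themselves are assumed to come from a standard test (for example, `scipy.stats.spearmanr` returns both the coefficient and a p-value).

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True where it is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha
    # and reject every hypothesis ranked at or below k.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

Applying the correction separately per feature category, as described in the text, would amount to calling `benjamini_hochberg` once on the p-values of each category rather than once on the pooled list.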

To predict NPI scores, we trained different regression models and evaluated their performance using the mean absolute error (MAE). The regression models included support vector regression (SVR) and Lasso (linear regression with L1 regularization). Implementations were provided by the scikit-learn Python framework. Features were normalized by subtracting their mean and dividing by their standard deviation. Because of the small data set size, a separate validation/test set could not be used. Instead, leave-one-out cross-validation was employed. In this method, N different models are evaluated: each model is trained on N − 1 observations and tested on the one held-out observation. The final result is then calculated by taking the mean of all the individual evaluations.
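The evaluation loop described above can be made concrete with a small, dependency-free sketch. It is not the study's pipeline: the SVR and Lasso models from scikit-learn are replaced here by a one-feature least-squares line so that the example stays self-contained, the mean-prediction baseline is an assumption (the source does not say how its baseline was defined), and estimating the normalization statistics on the training fold only is likewise a choice made for this example. All function names are hypothetical.

```python
from statistics import mean, pstdev


def zscore_params(column):
    """Mean and standard deviation estimated on the training fold only."""
    mu, sd = mean(column), pstdev(column)
    return mu, (sd if sd > 0 else 1.0)


def fit_line(x, y):
    """Ordinary least squares for a single feature: y ~ a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((v - mx) ** 2 for v in x)
    b = sum((v - mx) * (t - my) for v, t in zip(x, y)) / sxx if sxx else 0.0
    return my - b * mx, b


def loocv_mae(x, y, use_feature=True):
    """Leave-one-out mean absolute error.

    With use_feature=False the prediction is the training-fold mean of y,
    which serves as an assumed baseline.
    """
    errors = []
    for i in range(len(y)):
        # Hold out observation i and train on the remaining N - 1.
        x_train, y_train = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        if use_feature:
            mu, sd = zscore_params(x_train)
            a, b = fit_line([(v - mu) / sd for v in x_train], y_train)
            pred = a + b * (x[i] - mu) / sd
        else:
            pred = mean(y_train)
        errors.append(abs(pred - y[i]))
    # Final score: mean of the N individual absolute errors.
    return mean(errors)


# Toy data (not from the study): one speech feature and a score per person.
feature = [0.42, 0.55, 0.31, 0.60, 0.48, 0.25]
score = [2.0, 1.0, 3.0, 0.0, 2.0, 4.0]
print(loocv_mae(feature, score), loocv_mae(feature, score, use_feature=False))
```

Comparing the two returned values mirrors the comparison of a trained model against the baseline; with real data, the line fit would be swapped for the regressor of interest.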

Results

A total of 141 participants, of whom 92 were female and 49 were male, were included in the analysis. Demographic and clinical information split by gender is available in Table 2. Both groups had an average age of around 75 and showed, on average, some signs of MCI, with a mean MMSE of around 24. The male group scored significantly higher on the apathy domain of the NPI. Correlations between the NPI subscales and the MMSE were significant for both groups (see Table 3). MMSE was significantly correlated with the NPI apathy subscale for both groups, while only females also showed significant correlations between the anxiety subscale and the MMSE. Both normal and partial correlations, corrected for MMSE, of speech variables and NPI subscales by gender group and speech task are displayed in Figure 1 (a detailed listing of correlations is provided in Supplementary Tables S1 and S2).

Table 2. Demographic data for included participants, split by gender.

Note: Mean and standard deviation are reported.

Abbreviations: MMSE, Mini-Mental State Examination; NPI, neuropsychiatric inventory.

Table 3. Spearman rank correlations between MMSE and NPI subscales for females and males.

Abbreviations: MMSE, Mini-Mental State Examination; NPI, neuropsychiatric inventory.

Figure 1. (Left) Spearman rank correlations and (right) Spearman rank partial correlations corrected for Mini-Mental State Examination (MMSE), between audio features and neuropsychiatric inventory (NPI) subscales, separated by gender and voice task. Only significant correlations are reported. The absolute value of each correlation is reflected in the size and color of the dot (positive correlations in blue; negative correlations in red).
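For readers unfamiliar with partial correlations, the correction for MMSE can be illustrated with the standard first-order partial correlation formula applied to rank correlations. This is a sketch under the assumption that the partial Spearman correlation is obtained from the three pairwise Spearman coefficients; the source does not state which implementation was used, and the function and argument names are hypothetical.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the influence of z.

    r_xy: correlation between a speech feature and an NPI subscale
    r_xz: correlation between the speech feature and the covariate (MMSE)
    r_yz: correlation between the NPI subscale and the covariate (MMSE)
    """
    denom = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("covariate is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom


# Made-up coefficients: a feature/subscale correlation of 0.40 shrinks
# once both variables' shared association with the covariate is removed.
print(partial_correlation(0.40, 0.50, 0.50))
```

When a feature and a subscale are correlated mainly because both track the covariate, the partial correlation moves toward zero, which is the pattern described for effects that are not sustained after correcting for MMSE.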

The Sound to noise ratio shows significant correlations with the NPI anxiety subscale for both genders and tasks, suggesting that the higher the sound to noise ratio, the stronger the symptoms of anxiety. This effect is stable after correcting for the influence of MMSE. Before and after correcting for MMSE, the Speech Ratio shows significant correlations with the NPI apathy subscale for both genders and across both tasks, indicating that less speaking during the recording is associated with higher levels of apathy. The same is visible for the Total Phonation Time and Number of Pauses, but these effects are not sustained after correcting for MMSE using partial correlations. The Amplitude Kurtosis shows significant correlations with the anxiety and apathy NPI subscales, which are not retained in the partial correlation.

Correlations between all NPI scales and measures of power (Mean power and Total power) are visible for females in the negative story. When correcting for the influence of MMSE, only weaker correlations with the anxiety and depression subscales remain. Performance of trained machine learning models is listed in Table 4 as the MAE of predictions. The baseline for each subscale and gender is reported. For each subscale and gender, a model could be trained that outperforms this baseline by at least 0.3 points on the NPI scale. Features selected by the machine learning models are available in Supplementary Table S3.

Table 4. Mean absolute error of regression methods (linear regression L1 penalization and SVM) and of the baseline, for males and females separately.

Note: Results better than baseline are marked in bold.

Abbreviation: NPI, neuropsychiatric inventory.

Discussion

Identifying early signs of different NPS by means of objective measurement tools could ultimately lead to better management and treatment. This is particularly important in patients with mild neurocognitive disorders. Indeed, the assessment of NPS is not systematically performed in clinical practice, despite the fact that they represent risk factors for dementia conversion and are thus increasingly becoming targets of clinical trials [Reference Gupta, Malandrakis, Xiao, Guha, Van Segbroeck and Black37, Reference Roberto, Portella, Marquié, Alegret, Hernández and Mauleón38]. As NPS do not interfere substantially with activities of daily living in the early stages of neurocognitive disorders, patients and caregivers may not report them unless prompted, which suggests the need for noninvasive, objective, and reliable measures that can help detect subtle changes in NPS.

The results of the present study show that certain changes in spontaneous speech characteristics seem to be associated with specific NPS. Namely, apathy correlates with temporal aspects of speech such as speech ratio, which means that patients with more severe apathy speak slower and less. This is consistent with the diagnostic criteria for apathy [Reference Robert, Lanctôt, Agüera-Ortiz, Aalten, Bremond and Defrancesco39], which identified reduction of self-initiated verbal production as one of the core apathy dimensions. Similarly, we previously found strong correlations between this type of feature (sound duration, syllable count, etc.) and subdomains of the apathy inventory [Reference König, Riviere, Linz, Lindsay, Elbaum and Fabre27].

In turn, anxiety correlates with voice quality features such as Sound to noise ratio. These results are mostly consistent between males and females. Importantly, the correlations of apathy and anxiety with these distinct speech variables were still significant after correcting for cognitive impairment, meaning that these specific speech features may be relevant for assessing NPS in patients with different degrees of cognitive deterioration and do not simply capture cognitive decline. As the presence of NPS increases with the progression of cognitive decline [Reference König, Linz, Zeghari, Klinge, Tröger and Alexandersson28], it is important to control for the degree of cognitive decline to verify whether speech features are specifically relevant for detecting NPS. The fact that we did not find any strong correlation between speech features and depression might be due to the small variance on the depression NPI subscale in our sample.

Our findings are consistent with other research in which a significant increase in mean f0 (prosodic related feature) was detected in anxiety disorders [Reference Weeks, Lee, Reilly, Howell, France and Kowalsky40–Reference Weeks, Srivastav, Howell and Menatti42]; however, we found this effect only in females during the negative storytelling. In another study, in which emotional responses were induced by watching videos, females showed higher emotional expressivity for negative stimuli, whereas males showed overall more intense emotional experiences, meaning that the differences in gender seem to depend on the specific emotion type [Reference Weeks, Srivastav, Howell and Menatti43].

Moreover, greater F0 has been reported mostly in males with social anxiety disorder compared to controls and only in females during in vivo social exposure [Reference Galili, Amir and Gilboa-Schechtman42]. This could explain why we found a difference only in the negative task since it may elicit a more emotionally loaded reaction. Jitter and shimmer were also significantly higher in patients with an elevated score on the NPI anxiety subdomain, which is in line with previous studies [Reference Low, Bentley and Ghosh24, Reference Özseven, Düğenci, Doruk and Kahraman25].

Machine learning regressors were able to extract information from speech features and perform above baseline in predicting anxiety, apathy, and depression scores, even though NPS were generally of quite low intensity in our sample. When adding MMSE as a feature, only the model predicting apathy relied heavily on it, which may be partially due to the fact that apathy becomes more common with increased cognitive decline [Reference Deng, Chang, Yang, Huo and Zhou44]. These results may have important clinical implications, since for each subscale and gender a model could be trained that outperforms the baseline by at least 0.3 points on the NPI scale.

The positive and negative storytelling tasks were chosen as semistandardized free speech tasks that still allow for the production of natural speech while limiting the scope and time and inducing an emotional reaction in participants. This is in contrast to related work measuring cognitive function in similar populations, which relied on more standardized tasks, such as describing an image or completing a verbal cognitive task.

Similar studies have also shown that computed acoustic features can be used to track changes in mental health states [Reference Manera, Fabre, Stella, Loureiro, Agüera-Ortiz and López-Álvarez45]. Indeed, speech analytics has been tested successfully in psychiatry to predict treatment response in depressed patients [Reference Arevian, Bone, Malandrakis, Martinez, Wells and Miklowitz46] or changes in mood states in bipolar disorder [Reference Carrillo, Sigman, Fernández Slezak, Ashton, Fitzgerald and Stroud47]. As certain feature types seem to be associated with NPS, the outcome of this study may encourage a broader use of automatic speech analysis for NPS detection in other neurocognitive disorders. This could particularly improve the objectivity of front-line assessment procedures and help overcome barriers that prevent timely access to diagnosis and treatment.

In parallel to classical methods, such technologies can be used to establish objective, personalized baseline reference standards to design innovative clinical trials that assess the effectiveness of new treatments [Reference Gideon, Provost and McInnis48]. For instance, the use of wearable devices for additional measurements such as heart rate variability or locomotor activity is increasingly investigated for detecting behavioral abnormalities [Reference Teipel, König, Hoey, Kaye, Krüger and Robillard49]. Combined with speech, all these data could help reduce evaluation biases and provide a richer understanding of variations in NPS on a day-to-day basis, even before severity reaches a level requiring intervention [Reference Reinertsen and Clifford50, Reference Karow, Pajonk, Reimer, Hirdes, Osterwald and Naber51]. With the rise of decentralized clinical trials, digital technologies will play a major role in complementing traditional data acquisition, enabling the redesign and adaptation of trials while they are still ongoing [Reference Copeland, Zeber, Salloum, Pincus, Fine and Kilbourne52].

One limitation of our study is that patients in the sample showed considerable variability in MMSE score even though they all met the diagnostic criteria for mild neurocognitive disorder, which could have affected the results. In addition, patients had no severe NPS; they rarely scored above four, which is the original cut-off. However, given the predictive value of behavioral symptoms [Reference Chow53], it is important to investigate whether the technology is able to detect subtle signs of symptoms, since this may be of even greater interest for early prevention strategies.

Another limitation is that we employed only the NPI scale for the assessment of NPS. Although it is widely used and considered a gold standard for neuropsychiatric assessment, the NPI is probably not the ideal scale for early screening or for quantifying fine changes. Recently, the Mild Behavioral Inventory [Reference Taragano, Allegri, Heisecke, Martelli, Feldman and Sánchez54] has been proposed, which could better capture early signs of NPS and thus may show stronger correlations with speech features in patients with mild NPS, as is often the case for people with mild neurocognitive disorders. In addition, the study should be replicated in a larger sample that includes patients with more severe NPS in order to validate the speech markers found.

Supplementary Materials

To view supplementary material for this article, please visit http://dx.doi.org/10.1192/j.eurpsy.2021.2236.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors would like to thank all participants of this study. This research is part of the MNC3 program of the University Côte d’Azur IDEX Jedi.

Funding Statement

This research was partially funded by the MEPHESTO project (Grant DRI-0120105/291), the EIT Digital Well-being Activity (Grant 17074), ELEMENT, the University Côte d’Azur, and the IA Association, and supported by the Edmond & Lily Safra Foundation and Institute Claude Pompidou.

Conflicts of Interest

E.M., J.T., and N.L. are employees and shareholders of ki:elements UG.

Authorship Contributions

Conceptualization: J.T., P.R.; Formal analysis: A.K., E.M.; Funding acquisition: P.R.; Investigation: P.R.; Methodology: A.K., N.L.; Validation: A.K.; Visualization: N.L.; Writing—original draft: A.K., V.M.; Writing—review and editing: R.Z.

References

Satizabal, C, Beiser, AS, Seshadri, S. Incidence of dementia over three decades in the Framingham heart study. N Engl J Med. 2016;375:93–4.

Phan, SV, Osae, S, Morgan, JC, Inyang, M, Fagan, SC. Neuropsychiatric symptoms in dementia: considerations for pharmacotherapy in the USA. Drugs R D. 2019;19:93–115.

Gallagher, D, Fischer, CE, Iaboni, A. Neuropsychiatric symptoms in mild cognitive impairment. Can J Psychiatr. 2017;62:161–9.

Iaboni, A, Rapoport, MJ. Detecting and managing neuropsychiatric symptoms in dementia. Can J Psychiatr. 2017;62:158–60.

Livingston, G, Sommerlad, A, Orgeta, V, Costafreda, SG, Huntley, J, Ames, D, et al. Dementia prevention, intervention, and care. Lancet. 2017;390:2673–734.

Liew, TM. Neuropsychiatric symptoms in cognitively normal older persons, and the association with Alzheimer’s and non-Alzheimer’s dementia. Alzheimers Res Ther. 2020;12:35. doi:10.1186/s13195-020-00604-7.

Radue, R, Walaszek, A, Asthana, S. Neuropsychiatric symptoms in dementia. Handb Clin Neurol. 2019;167:437–54.

Hu, M, Shu, X, Wu, X, Chen, F, Hu, H, Zhang, J, et al. Neuropsychiatric symptoms as prognostic makers for the elderly with mild cognitive impairment: a meta-analysis. J Affect Disord. 2020;271:185–92.

Gitlin, LN, Marx, KA, Stanley, IH, Hansen, BR, Van Haitsma, KS. Assessing neuropsychiatric symptoms in people with dementia: a systematic review of measures. Int Psychogeriatr. 2014;26:1805–48.

Jang, JY, Ho, JK, Blanken, AE, Dutt, S, Nation, DA. Alzheimer’s disease neuroimaging initiative. Affective neuropsychiatric symptoms as early signs of dementia risk in older adults. J Alzheimers Dis. 2020;77(3):1195–207.

Padala, PR, Padala, KP, Lensing, SY, Jackson, AN, Hunter, CR, Parkes, CM, et al. Repetitive transcranial magnetic stimulation for apathy in mild cognitive impairment: a double-blind, randomized, sham-controlled, cross-over pilot study. Psychiatry Res. 2018;261:312–8.

Goukasian, N, Hwang, KS, Romero, T, Grotts, J, Do, TM, Groh, JR, Bateman, DR, Apostolova, LG. Association of brain amyloidosis with the incidence and frequency of neuropsychiatric symptoms in ADNI: a multisite observational cohort study. BMJ Open. 2019;9(12):e031947.

Cummings, JL, Mega, M, Gray, K, Rosenberg-Thompson, S, Carusi, DA, Gornbein, J. The neuropsychiatric inventory: comprehensive assessment of psychopathology in dementia. Neurology. 1994;44:2308–14.

Stella, F. Assessment of neuropsychiatric symptoms in dementia: toward improving accuracy. Dement Neuropsychol. 2013;7:244–51.

König, A, Aalten, P, Verhey, F, Bensadoun, G, Petit, P-D, Robert, P, et al. A review of current information and communication technologies: can they be used to assess apathy? Int J Geriatr Psychiatry. 2014;29:345–58.

Gros, A, Bensamoun, D, Manera, V, Fabre, R, Zacconi-Cauvin, A-M, Thummler, S, et al. Recommendations for the use of ICT in elderly populations with affective disorders. Front Aging Neurosci. 2016;8:269.

Kourtis, LC, Regele, OB, Wright, JM, Jones, GB. Digital biomarkers for Alzheimer’s disease: the mobile/wearable devices opportunity. NPJ Dig Med. 2019;2:9. doi:10.1038/s41746-019-0084-2.

Piau, A, Wild, K, Mattek, N, Kaye, J. Current state of digital biomarker technologies for real-life, home-based monitoring of cognitive function for mild cognitive impairment to mild Alzheimer disease and implications for clinical care: systematic review. J Med Internet Res. 2019;21:e12785.

Albuquerque, L, Valente, ARS, Teixeira, A, Figueiredo, D, Sa-Couto, P, Oliveira, C. Association between acoustic speech features and non-severe levels of anxiety and depression symptoms across lifespan. PLoS One. 2021;16:e0248842. doi:10.1371/journal.pone.0248842.

Robin, J, Harrison, JE, Kaufman, LD, Rudzicz, F, Simpson, W, Yancheva, M. Evaluation of speech-based digital biomarkers: review and recommendations. Digit Biomark. 2020;4:99–108.

Cummins, N, Scherer, S, Krajewski, J, Schnieder, S, Epps, J, Quatieri, TF. A review of depression and suicide risk assessment using speech analysis. Speech Comm. 2015;71:10–49. doi:10.1016/j.specom.2015.03.004.

Lanctot, KL, Amatniek, J, Ancoli-Israel, S, Arnold, SE, Ballard, C, Cohen-Mansfield, J, et al. Neuropsychiatric signs and symptoms of Alzheimer’s disease: new treatment paradigms. Alzheimers Dement (N Y). 2017;3(3):440–9.

Low, DM, Bentley, KH, Ghosh, SS. Automated assessment of psychiatric disorders using speech: a systematic review. Laryngoscope Investig Otolaryngol. 2020;5:96–116.

Özseven, T, Düğenci, M, Doruk, A, Kahraman, Hİ. Voice traces of anxiety: acoustic parameters affected by anxiety disorder. Arch Acoust. 2018;43(4):625–36.

Silber-Varod, V, Kreiner, H, Lovett, R, Levi-Belz, Y, Amir, N. Do social anxiety individuals hesitate more? The prosodic profile of hesitation disfluencies in social anxiety disorder individuals. Speech Pros. 2016;2016:1211–15. doi:10.21437/speechprosody.2016-249.

König, A, Riviere, K, Linz, N, Lindsay, H, Elbaum, J, Fabre, R, et al. Measuring stress in health professionals over the phone using automatic speech analysis during the COVID-19 pandemic: observational pilot study. J Med Internet Res. 2021;23:e24191.

König, A, Linz, N, Zeghari, R, Klinge, X, Tröger, J, Alexandersson, J, et al. Detecting apathy in older adults with cognitive disorders using automatic speech analysis. J Alzheimers Dis. 2019;69:1183–93. doi:10.3233/jad-181033.

Cummings, J, Ritter, A, Rothenberg, K. Advances in management of neuropsychiatric syndromes in neurodegenerative diseases. Curr Psychiatry Rep. 2019;21:79.

American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5®). Arlington, VA: American Psychiatric Association; 2013.

Folstein, MF, Folstein, SE, McHugh, PR. Mini-mental state: a practical method for grading the cognitive state of patients for the clinician; 1975.

Childers, DG, Wu, K. Gender recognition from speech. Part II: fine analysis. J Acoust Soc Am. 1991;90:1841–56.

Heffernan, K. Evidence from HNR that /s/ is a social marker of gender. Toronto Working Papers in Linguistics; 2004, p. 23.

Wu, K, Childers, DG. Gender recognition from speech. Part I: coarse analysis. J Acoust Soc Am. 1991;90:1828–40.

Linz, N, Klinge, X, Troger, J, Alexandersson, J, Zeghari, R, Robert, P, et al. Automatic detection of apathy using acoustic markers extracted from free emotional speech. Nd Workshop on AI for Ageing, Rehabilitation and Independent Assisted Living (ARIAL); 2018, p. 17–21.

Low, L-SA, Maddage, MC, Lech, M, Sheeber, LB, Allen, NB. Detection of clinical depression in adolescents’ speech during family interactions. IEEE Trans Biomed Eng. 2011;58:574–86. doi:10.1109/tbme.2010.2091640.

Gupta, R, Malandrakis, N, Xiao, B, Guha, T, Van Segbroeck, M, Black, M, et al. Multimodal prediction of affective dimensions and depression in human–computer interactions. In: Proceedings of the 4th ACM international workshop on audio/visual emotion challenge (AVEC’14). Orlando, FL: ACM; 2014, p. 33–40.

Roberto, N, Portella, MJ, Marquié, M, Alegret, M, Hernández, I, Mauleón, A, et al. Neuropsychiatric profiles and conversion to dementia in mild cognitive impairment, a latent class analysis. Sci Rep. 2021;11:6448.

Miller, DS, Robert, P, Ereshefsky, L, Adler, L, Bateman, D, Cummings, J, et al. Diagnostic criteria for apathy in neurocognitive disorders. Alzheimers Dement. 2021. doi:10.1002/alz.12358.

Robert, P, Lanctôt, KL, Agüera-Ortiz, L, Aalten, P, Bremond, F, Defrancesco, M, et al. Is it time to revise the diagnostic criteria for apathy in brain disorders? The 2018 international consensus group. Eur Psychiatry. 2018;54:71–6.

Weeks, JW, Lee, C-Y, Reilly, AR, Howell, AN, France, C, Kowalsky, JM, et al. “The sound of fear”: assessing vocal fundamental frequency as a physiological indicator of social anxiety disorder. J Anxiety Disord. 2012;26:811–22.

Galili, L, Amir, O, Gilboa-Schechtman, E. Acoustic properties of dominance and request utterances in social anxiety. J Soc Clin Psychol. 2013;32:651–73. doi:10.1521/jscp.2013.32.6.651.

Weeks, JW, Srivastav, A, Howell, AN, Menatti, AR. “Speaking more than words”: classifying men with social anxiety disorder via vocal acoustic analyses of diagnostic interviews. J Psychopathol Behav Assess. 2016;38:30–41. doi:10.1007/s10862-015-9495-9.

Deng, Y, Chang, L, Yang, M, Huo, M, Zhou, R. Gender differences in emotional response: inconsistency between experience and expressivity. PLoS One. 2016;11:e0158666.

Manera, V, Fabre, R, Stella, F, Loureiro, JC, Agüera-Ortiz, L, López-Álvarez, J, et al. A survey on the prevalence of apathy in elderly people referred to specialized memory centers. Int J Geriatr Psychiatry. 2019;34:1369–77.

Arevian, AC, Bone, D, Malandrakis, N, Martinez, VR, Wells, KB, Miklowitz, DJ, et al. Clinical state tracking in serious mental illness through computational analysis of speech. PLoS One. 2020;15:e0225695.

Carrillo, F, Sigman, M, Fernández Slezak, D, Ashton, P, Fitzgerald, L, Stroud, J, et al. Natural speech algorithm applied to baseline interview data can predict which patients will respond to psilocybin for treatment-resistant depression. J Affect Disord. 2018;230:84–6.

Gideon, J, Provost, EM, McInnis, M. Mood state prediction from speech of varying acoustic quality for individuals with bipolar disorder. Proc IEEE Int Conf Acoust Speech Signal Process. 2016;2016:2359–63.

Teipel, S, König, A, Hoey, J, Kaye, J, Krüger, F, Robillard, JM, et al. Use of nonintrusive sensor-based information and communication technology for real-world evidence for clinical trials in dementia. Alzheimers Dement. 2018;14:1216–31.

Reinertsen, E, Clifford, GD. A review of physiological and behavioral monitoring with digital sensors for neuropsychiatric illnesses. Physiol Meas. 2018;39:05TR01.

Karow, A, Pajonk, F-G, Reimer, J, Hirdes, F, Osterwald, C, Naber, D, et al. The dilemma of insight into illness in schizophrenia: self- and expert-rated insight and quality of life. Eur Arch Psychiatry Clin Neurosci. 2008;258:152–9.

Copeland, LA, Zeber, JE, Salloum, IM, Pincus, HA, Fine, MJ, Kilbourne, AM. Treatment adherence and illness insight in veterans with bipolar disorder. J Nerv Ment Dis. 2008;196:16–21.

Chow, S-C. Adaptive clinical trial design. Annu Rev Med. 2014;65:405–15. doi:10.1146/annurev-med-092012-112310.

van Dalen, JW, van Wanrooij, LL, Moll van Charante, EP, Brayne, C, van Gool, WA, Richard, E. Association of apathy with risk of incident dementia: a systematic review and meta-analysis. JAMA Psychiat. 2018;75:1012–21.

Taragano, FE, Allegri, RF, Heisecke, SL, Martelli, MI, Feldman, ML, Sánchez, V, et al. Risk of conversion to dementia in a mild behavioral impairment group compared to a psychiatric group and to a mild cognitive impairment group. J Alzheimers Dis. 2018;62:227–38.

Table 1. Overview and explanation of extracted speech features.

Table 2. Demographic data for included participants, split by gender.

Table 3. Spearman rank correlations between MMSE and NPI subscales for females and males.

Table 4. Mean absolute error of regression methods (linear regression L1 penalization and SVM) and of the baseline, for males and females separately.
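
As an illustration of the comparison summarized in the Table 4 caption, the following minimal Python sketch shows how a mean absolute error (MAE) can be computed for a model and for a baseline. It is not the authors' implementation: the baseline is assumed here to be a constant predictor returning the mean of the training scores (this section does not specify it), the function names are hypothetical, and all scores are invented for illustration only.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_baseline_predictions(train_scores, n_test):
    """Assumed baseline: always predict the mean of the training scores."""
    return [mean(train_scores)] * n_test


# Hypothetical NPI subscale scores, invented purely for illustration.
train_scores = [0, 1, 2, 1, 0, 3, 1, 0]
test_scores = [1, 0, 2, 4]

# Stand-in for the output of a fitted regressor (e.g., L1 regression or SVM).
model_predictions = [1.2, 0.4, 1.8, 3.0]

baseline_mae = mean_absolute_error(
    test_scores, mean_baseline_predictions(train_scores, len(test_scores))
)
model_mae = mean_absolute_error(test_scores, model_predictions)

print(f"baseline MAE = {baseline_mae:.3f}, model MAE = {model_mae:.3f}")
```

Under these assumptions, a model is informative only to the extent that its MAE falls below that of the baseline, which is why both values are reported for each gender.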

König et al. supplementary material

Tables S1-S3
