
Using machine learning with passive wearable sensors to pilot the detection of eating disorder behaviors in everyday life

Published online by Cambridge University Press:  20 October 2023

C. Ralph-Nearman*
Affiliation:
Department of Psychological & Brain Sciences, University of Louisville, Louisville, KY, USA
L. E. Sandoval-Araujo
Affiliation:
Department of Psychological & Brain Sciences, University of Louisville, Louisville, KY, USA
A. Karem
Affiliation:
Department of Computer Science and Engineering, University of Louisville, Louisville, KY, USA
C. E. Cusack
Affiliation:
Department of Psychological & Brain Sciences, University of Louisville, Louisville, KY, USA
S. Glatt
Affiliation:
Department of Psychological & Brain Sciences, University of Louisville, Louisville, KY, USA
M. A. Hooper
Affiliation:
Department of Psychological & Brain Sciences, University of Louisville, Louisville, KY, USA Department of Psychology, Vanderbilt University, Nashville, TN, USA
C. Rodriguez Pena
Affiliation:
Department of Computer Science and Engineering, University of Louisville, Louisville, KY, USA
D. Cohen
Affiliation:
Department of Psychological & Brain Sciences, University of Louisville, Louisville, KY, USA
S. Allen
Affiliation:
Department of Electrical and Computer Engineering, University of Louisville, Louisville, KY, USA
E. D. Cash
Affiliation:
Department of Otolaryngology-HNS and Communicative Disorders, University of Louisville School of Medicine, Louisville, KY, USA University of Louisville Healthcare-Brown Cancer Center, Louisville, KY, USA
K. Welch
Affiliation:
Department of Electrical and Computer Engineering, University of Louisville, Louisville, KY, USA
C. A. Levinson
Affiliation:
Department of Psychological & Brain Sciences, University of Louisville, Louisville, KY, USA Department of Pediatrics, Child and Adolescent Psychology and Psychiatry, University of Louisville, Louisville, KY, USA
*Corresponding author: C. Ralph-Nearman; Email: Christina.Ralph-Nearman@Louisville.edu; ChristinaRalphNearman@gmail.com

Abstract

Background

Eating disorders (EDs) are serious psychiatric disorders that take a life every 52 minutes and carry high relapse rates. There are currently no effective supports or intervention therapeutics available to individuals with an ED in their everyday life. The aim of this study is to build idiographic machine learning (ML) models to evaluate the performance of physiological recordings to detect individual ED behaviors in naturalistic settings.

Methods

From an ongoing study (Final N = 120), we piloted the ability of ML to detect an individual's ED behavioral episodes (e.g. purging) from physiological data in six individuals diagnosed with an ED, all of whom endorsed purging. Participants wore an ambulatory monitor for 30 days and tapped a button to denote ED behavioral episodes. We built idiographic (N = 1) logistic regression classifier (LRC) models to identify onset of episodes (~600 windows) v. baseline (~571 windows) physiology (Heart Rate, Electrodermal Activity, and Temperature).

Results

Using physiological data, ML LRC accurately classified on average 91% of cases, with 92% specificity and 90% sensitivity.

Conclusions

This evidence suggests the ability to build idiographic ML models that detect ED behaviors from physiological indices within everyday life with a high level of accuracy. The novel use of ML with wearable sensors to detect physiological patterns of ED behavior pre-onset can lead to just-in-time clinical interventions to disrupt problematic behaviors and promote ED recovery.

Type
Original Article
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

Introduction

Eating disorders (EDs) are serious psychiatric illnesses, taking a life every 52 minutes (Deloitte Access Economics, 2020). Only ~50% of adults with EDs respond to existing evidence-based treatments (Bulik, Berkman, Brownley, Sedway, & Lohr, 2007; Keel & Mitchell, 1997; Steinhausen, 2009; Steinhausen & Weber, 2009; van den Berg et al., 2019). Relapse is common (Walsh, Xu, Wang, Attia, & Kaplan, 2021), which contributes to the ‘revolving door phenomenon’ in which individuals are cyclically admitted to and discharged from ED treatment. Given the high mortality rate associated with EDs (Arcelus, Mitchell, Wales, & Nielsen, 2011) and the financial burden of accessing treatment (Deloitte Access Economics, 2020), improved treatment that is effective for more individuals seeking recovery from EDs is critical. Three factors contribute to the suboptimal treatment response and high relapse rates: (1) heterogeneity among those with EDs, even within the same diagnostic group (Levinson et al., 2022; Thompson, Berg, & Shatford, 1987); (2) lack of day-to-day and moment-to-moment support in an individual's naturalistic environment; and (3) the lack of knowledge of how individual heterogeneity in cognitions, emotions, and behaviors impacts recovery. Thus, there is a need for passive treatment support methods specifically available for use in an individual's everyday life.

Understanding psychological phenomena in naturalistic settings represents one approach that could lead to improved treatments and reduced relapse rates. Technological advancements have facilitated new areas of clinical research that better bridge empirical research to ‘everyday life’, including the use of ambulatory assessment (e.g. ecological momentary assessment [EMA]; Stone and Shiffman, 1994) to inform psychological interventions (Wright & Zimmermann, 2019). One of the most common methods of ambulatory assessment used in the past decade is time-intensive repeated survey measurements delivered through mobile application devices (e.g., Hasselhorn, Ottenstein, and Lischetzke, 2022). EMA designs have been increasingly implemented in investigating EDs, yet most EMA research has relied solely on self-report instruments (Presseller, Patarinski, Fan, Lampe, & Juarascio, 2022; Schaefer, Engel, & Wonderlich, 2020; Smith et al., 2019). Though EMA ostensibly captures dynamic symptom relations in everyday life, intensive self-report approaches have some limitations, such as participant burden from repeatedly completing surveys, subjectivity, social desirability, problems with self-reflection in EDs, and an incomplete understanding of how momentary changes in other relevant processes, such as neurocognition, location, and biology, may impact ED symptoms (Smith et al., 2019).
Collectively, EMA work suggests that ED behaviors and cognitions fluctuate as a function of time and person (Lavender et al., 2013; Levinson et al., 2022), highlighting the importance of repeated assessment and analysis at the individual level due to the high heterogeneity from person-to-person even within the same ED diagnostic category.

Along with these explicit momentary measures there are more implicit approaches, such as research examining physiological correlates of EDs, which has largely been conducted in laboratory settings (Christian, Cash, Cohen, Trombley, & Levinson, 2023; Presseller et al., 2022). However, such studies related to EDs are sparse, and laboratory-based designs lack the ecological validity needed for support and interventions in individuals' everyday lives. Wearable sensors, such as the E4 Empatica band, have demonstrated good reliability and validity for passively collecting common physiological data such as blood-volume pulse (heart rate [HR]), electrodermal activity ([EDA]; skin resistance and conductance variation, controlled by the sympathetic nervous system in response to arousal), and peripheral skin temperature (PST) in everyday life in a variety of populations (e.g., McCarthy, Pradhan, Redpath, and Adler, 2016; Park, Jeong, Park, and Lee, 2019; Ragot, Martin, Em, Pallamin, and Diverrez, 2018; Ravindran et al., 2022; Schuurmans et al., 2020; van Lier et al., 2020). Preliminary findings using wearable sensor devices reveal physiological recordings that correspond with self-reported ED symptoms. For example, continuous glucose monitoring demonstrates that glucose levels correspond with self-reported bulimic symptoms of fasting, binge eating, and purging (Presseller, Parker, Lin, Weimer, & Juarascio, 2020).
As individuals with EDs may be particularly limited in their ability to accurately know and report their emotions, cognitions, and behaviors in the moment, evidence suggests that physiological recordings may be used to accurately provide real-time assessment of loss of control eating (Ranzenhofer et al., 2016) without the burden or limitations of intensive self-report. Additionally, a new review (Ralph-Nearman et al., 2023, under review) demonstrates that physiological patterns often differ between different types of ED diagnoses and behaviors (e.g. restriction v. purging/compensation). These results specifically suggest that EDs characterized by purging behaviors (e.g., BN) have changes in HR reactivity, EDA responses, and PST associated with ED-related stimuli, which may differ from other EDs characterized by restriction or binge-eating (Krantz et al., 2020; Ortega-Roldan, Rodríguez-Ruiz, Perakakis, Fernandez-Santaella, & Vila, 2014; Papežová, Yamamotova, & Uher, 2005). Thus, physiological recordings, such as HR, EDA, and PST, obtained in individuals’ daily lives and detected with enough time pre-onset of ED behaviors may preemptively indicate ED symptom engagement and, subsequently, precise time points of intervention before an individual is aware of these changes (Juarascio, Parker, Lagacey, & Godfrey, 2018; Levinson, Christian, Shankar-Ram, Brosof, & Williams, 2019; Smith & Juarascio, 2019).

More recent implicit approaches have also leveraged passive data collection (e.g. wearable sensors, smartphones) with machine learning (ML) to detect, predict, or intervene on clinical phenomena in other types of disorders outside EDs. For example, passive physiological recordings, such as those utilized in the present pilot study (HR, EDA, PST), have been shown to be able to detect or predict depressive symptoms (Zarate, Stavropoulos, Ball, de Sena Collier, & Jacobson, 2022), variability in anxiety and avoidance symptoms (Jacobson & Bhattacharya, 2022), changes in perceived safety and discomfort (Welch et al., 2022), and behavioral outbursts in real-time in samples with major depressive disorder, anxiety disorders, and autism spectrum disorder (Alban et al., 2021; Goodwin, Mazefsky, Ioannidis, Erdogmus, & Siegel, 2019; Northrup et al., 2022; Welch et al., 2023). Such research has demonstrated robust detective and predictive performance, which has led to providing preventative support and real-time intervention for unhelpful behaviors across a range of mental and physical issues (e.g. Clifton, Clifton, Pimentel, Watkinson, and Tarassenko, 2014; Regalia, Onorati, Lai, Caborni, and Picard, 2019).

Importantly, knowledge on EDs predominantly rests upon scholarship that uses explanatory methods (e.g. statistical inference) (Wang, 2021). However, to advance the ED field from modeling existing data to forecasting unseen data (i.e. future behavior), ML approaches are well-suited for detecting the onset of ED behavior and, potentially, in turn, allowing for the delivery of evidence-based intervention before relapse may occur (Wang, 2021).

In this pilot, we aim to use a minimal set of physiological features, none requiring extensive data cleaning, with machine learning so that the approach can be easily implemented in real-world contexts in the future. With these future aims in mind, we chose to start with three measures (HR, EDA, and PST) that are commonly collected by wearables and have been shown to detect or predict depressive symptoms, changes in perceived safety and discomfort, and behavioral outbursts in real-time.

Thus, the current pilot study uniquely examines the detective performance of ML models using physiological recordings collected via a wearable sensor device to classify ED behaviors among six individuals diagnosed with an ED who endorsed purging behaviors. As research on physiological recordings from wearable sensors (Levinson et al., 2019; Presseller et al., 2022) and ML is limited in ED literature (Wang, 2021), the current study was an exploratory pilot. Our principal aim was to build idiographic (N = 1) detective models using passively collected physiological recordings commonly detected from wearables (HR, EDA, and PST, e.g. Christian et al., 2023; Welch et al., 2023) to evaluate ML model performance in detecting individual ED behaviors in their everyday life.

Method

Participants

A convenience sample of the first six participants with adequate data who were diagnosed with an ED and endorsed purging, ranging from 20 to 38 years old (Mage = 29.5; s.d. = 7.3), was selected from an ongoing study (N = 120) for the present pilot. The pilot sample was recruited, and its data collected, in the same manner as for participants in the ongoing study. The ongoing study recruited individuals with EDs residing in the United States to participate in a study that uses mobile and sensor technology to predict ED relapse and recovery. Participants were recruited via advertisements on social media, alumni lists from treatment centers, and research participants who had consented to be contacted about research studies. Participants were compensated with an Amazon or Target gift card valued between $25 and $110 based on the number of days the sensor band was worn and the number of EMA surveys completed.

Procedure

All study procedures were approved by the University of Louisville's Institutional Review Board (22.0503), and all participants provided verbal informed consent. Participants completed an initial phone screening to determine eligibility. To participate, individuals had to meet criteria for an active or partial-remission diagnosis of anorexia nervosa (AN), bulimia nervosa (BN), or other specified feeding or eating disorder-atypical AN (OSFED-AAN). Exclusion criteria were active suicidality, mania, psychosis, or medical instability (i.e., endorsing current chest pain, dizziness, shortness of breath, blurred vision, seeing dark spots within the past 24 hours, purging more than three times within the past 24 hours, or consuming fewer than 500 calories within the past 48 hours). Eligible participants completed an online baseline questionnaire where they reported ED and comorbid psychiatric symptoms. The research team mailed participants an Empatica E4 wristband, which participants were instructed to wear for 30 days, to use to indicate ED behaviors, and to charge at night.

Measures

Demographics and diagnoses

Participants self-reported their gender, race, age, comorbidities, and socioeconomic status, and all participants endorsed using medication (7 psychiatric and 2 non-psychiatric medications; see Tables 1 and 2). The Structured Clinical Interview for DSM-5 (SCID-5; First, Williams, Karg, and Spitzer, 2015) ED modules and the ED Diagnostic Scale (EDDS; Stice, Telch, and Rizvi, 2000; Stice, Fisher, and Martinez, 2004) were used to determine eligibility criteria and current ED diagnosis.

Table 1. Sociodemographic characteristics of participants

Note. AN, anorexia nervosa; BN, bulimia nervosa; A, Atypical; BP, Binge-Purge; p, Purge; OSFED, other specified feeding or eating disorder; OCD, obsessive compulsive disorder; PTSD, post-traumatic stress disorder; GAD, generalized anxiety disorder. ABNP (OSFED) meets all the criteria for BN, with less frequency/duration (<1 × per week/<3 months).

Table 2. Descriptives of frequency of ED behaviors, days and time band was worn, age, and medications by participant

Physiological recordings

Over 30 days, we passively and continuously collected physiological measurements across participants' waking hours using the Empatica E4 wristband. Of the indices the wristband collects, the present pilot uses those common in the literature on detecting the onset of disorders outside EDs (HR, EDA, and PST). Immediately after the screening for eligibility, a researcher instructed participants on how to use the wearable and gave them contact information in case they had questions or problems. These instructions covered how to put the sensor band on in the morning when waking, when to take it off (when showering or swimming), how to endorse an ED behavior, how to endorse exercise and eating, and how to charge the E4 during sleeping hours. Empatica sensor-technology is used in the health sciences to identify physiological patterns associated with illness behaviors (Bidwell, Khuwatsamrit, Askew, Ehrenberg, & Helmers, 2015).

ED behaviors

ED behaviors were assessed broadly, and as such, could reflect a variety of specific behaviors, including self-induced vomiting, laxative or diuretic use, binge eating, restricting food intake, and excessive exercise. During clinical screening, and after meeting eligibility, participants were given examples and definitions of ED behaviors and were asked if they had any questions (see online Supplementary Table S1). Participants endorsed engaging in an ED behavior by tapping twice on the Empatica E4 wristband button (88 episodes across participants across the 30 days). At the end of each day wearing the sensor band, participants were asked whether or not they had tagged all ED behaviors. On this evening log, across participants and across the 30 days, participants reported missing a total of only five ED behaviors.

Data analytic plan

Preprocessing

Raw physiological data were visually checked for stable readings and then archived into a PostgreSQL relational database through Python's SQLAlchemy library (Bayer, 2012). Overall, participants wore the band for 14 to 30 full days (M = 21.83; s.d. = 6.24), for an average of 12 hours, 29 minutes, and 7 seconds per day, and endorsed 4 to 42 ED behaviors (M = 14.67; s.d. = 13.82) (see Table 2).

We explored whether we could detect the pre-onset of ED behaviors through HR, EDA, and PST physiological indices, to allow enough time for future intervention pre-onset. Therefore, we queried each participant's physiological data from 20 min prior to a behavioral episode tag and extracted them into ‘windows’ (time periods of analysis), similarly to detecting other types of behaviors (e.g. Welch et al., 2023). For each window, to line up the signals, which were recorded at different sampling rates (EDA and PST at 4 Hz, and HR at 1 Hz), we resampled each physiological signal to 4 Hz and used Python to timestamp-match all the signals. As each raw signal is exported independently from the wristband, each discrete data point was assigned a Coordinated Universal Time (UTC) timestamp based on the initial UNIX timestamp designating the beginning of the recording. The UTC timestamps were then used to align signals to each other during resampling. Next, we derived features from each window (e.g. mean, median, minimum, maximum, standard deviation, root mean square [RMS], mean absolute deviation [MAD], mean absolute value [MAV], 25th percentile, 75th percentile) for each signal for use in our ML algorithms. Based on the three raw sensor readings (HR, EDA, and PST) and 10 derived features, we used a 30-dimensional feature vector to represent each window in ML classification analysis.
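As a rough illustration of the alignment and feature-derivation steps described above, the following dependency-free Python sketch reconstructs per-sample timestamps from a recording's start time, interpolates a 1 Hz signal onto a 4 Hz grid, selects the samples preceding a tag, and computes 10 summary features per signal. All function names, the use of linear interpolation, the population form of the standard deviation, and the percentile interpolation rule are assumptions made for illustration; the study does not report these details, and this is not the authors' code.

```python
import math
import statistics


def timestamps(start_unix, rate_hz, n):
    """Epoch timestamps (seconds) for n samples recorded from start_unix."""
    return [start_unix + i / rate_hz for i in range(n)]


def resample_linear(src_t, src_v, dst_t):
    """Linearly interpolate (src_t, src_v) onto the sorted grid dst_t.

    Values outside the source range are clamped to the nearest endpoint.
    Linear interpolation is an assumption, not the study's stated method.
    """
    out, j = [], 0
    for t in dst_t:
        if t <= src_t[0]:
            out.append(src_v[0])
        elif t >= src_t[-1]:
            out.append(src_v[-1])
        else:
            while src_t[j + 1] < t:
                j += 1
            w = (t - src_t[j]) / (src_t[j + 1] - src_t[j])
            out.append(src_v[j] * (1 - w) + src_v[j + 1] * w)
    return out


def window_before(t, v, tag_time, lead_s=20 * 60):
    """Samples in the lead_s seconds before a tag (20 min by default)."""
    return [vi for ti, vi in zip(t, v) if tag_time - lead_s <= ti < tag_time]


def percentile(sorted_x, q):
    """q-th percentile with linear interpolation between order statistics."""
    pos = (len(sorted_x) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_x[lo] + (sorted_x[hi] - sorted_x[lo]) * (pos - lo)


def window_features(x):
    """The 10 summary features named in the text, in the order listed."""
    s = sorted(x)
    mean = statistics.fmean(x)
    return [
        mean,
        statistics.median(x),
        s[0],                                          # minimum
        s[-1],                                         # maximum
        statistics.pstdev(x),                          # population s.d. (assumed)
        math.sqrt(statistics.fmean(v * v for v in x)),  # RMS
        statistics.fmean(abs(v - mean) for v in x),    # mean absolute deviation
        statistics.fmean(abs(v) for v in x),           # mean absolute value
        percentile(s, 25),
        percentile(s, 75),
    ]


def feature_vector(hr, eda, pst):
    """Concatenate per-signal features: 3 signals x 10 features = 30 values."""
    return window_features(hr) + window_features(eda) + window_features(pst)
```

For example, an HR trace sampled at 1 Hz can be placed on the same 4 Hz grid as EDA and PST with `resample_linear(timestamps(t0, 1, len(hr)), hr, timestamps(t0, 4, n))`, after which `feature_vector` yields one 30-value row per window.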

Model construction

Following synthesis of the 30 features per window for each individual, the data were separated into two classes: baseline windows (class 0) and pre-onset ED behavior windows (class 1). Once the data were separated into ~100 baseline and ~100 behavior feature vectors through randomized subsampling, we applied a logistic regression classifier (LRC) to explore which models differentiated between vectors provided for the two classes. Models were encoded and tested using Python (Van Rossum & Drake, 2009). To ensure that sufficient training data were available for each model (~90 samples), and to minimize overfitting, 10-fold cross-validation was utilized. This approach ensures that each model used to classify a testing data sample does not use the same sample during training. Average performance measures are provided across individual cross-validation performances. We built idiographic (N = 1) LRC models to identify personalized onset of episodes (~600 windows) v. baseline (~571 windows) physiology (HR, EDA, and PST). Accuracy was defined as the proportion of correctly classified cases ((True Positives (TP) + True Negatives (TN))/Total), specificity as the classification of behavioral episodes (TN/(TN + False Positives (FP))), and sensitivity as the classification of baseline physiology (TP/(TP + False Negatives (FN))). Acceptable classification performance was defined as >70% accuracy, sensitivity, and specificity (Swets & Picket, 1982).
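The evaluation procedure described above can be sketched in plain Python as follows. This is an illustrative, dependency-free stand-in rather than the study's implementation: the gradient-descent fitting routine, its learning rate and epoch count, the stratified fold assignment, and every function name are assumptions. Which class is treated as ‘positive’ is left to the caller (here, whichever class is coded 1), and real feature vectors would typically be standardized before fitting.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict(model, x):
    """Class 1 if the predicted probability is at least 0.5, else class 0."""
    w, b = model
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def confusion_metrics(y_true, y_pred):
    """Return (accuracy, specificity, sensitivity) from TP, TN, FP, FN."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    return accuracy, specificity, sensitivity


def stratified_folds(y, k, seed):
    """Deal shuffled indices of each class round-robin into k folds.

    Stratification is an assumption; the text specifies only 10 folds.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds


def cross_validate(X, y, k=10, seed=0):
    """Average (accuracy, specificity, sensitivity) over k held-out folds."""
    scores = []
    for held in stratified_folds(y, k, seed):
        held_set = set(held)
        train = [i for i in range(len(X)) if i not in held_set]
        model = fit_logistic([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in held]
        scores.append(confusion_metrics([y[i] for i in held], preds))
    return tuple(sum(s[m] for s in scores) / k for m in range(3))
```

Each held-out sample is scored by a model that never saw it during training, and the per-fold metrics are averaged, mirroring the description of reporting average performance across cross-validation runs.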

Results

Table 3 provides idiographic detective performance. Using physiological data, LRC accurately classified 91% of cases on average (Range = 0.84–0.99). Specificity (i.e. classifying behavioral episodes) averaged 92% (Range = 0.82–1.00). Sensitivity (i.e. baseline physiology classification) averaged 90% (Range = 0.84–0.99).

Table 3. Accuracy, specificity, and sensitivity of idiographic LRC models (N = 1)

Discussion

Lack of treatment response in EDs and support in everyday life contributes to EDs being among the deadliest mental health disorders, second to opioid deaths (Berends, Boonstra, & Van Elburg, 2018). This reality indicates the urgent need for new treatment support methods available for use in everyday life. Physiological recordings from wearables have offered clinical utility in forecasting other clinical phenomena, which has led to providing preventative support for problematic behaviors. For example, via wearable or mobile phone, individuals and caregivers are notified of the risk of behavioral episodes, such as the pending onset of epileptic seizures or outbursts by persons with autism prior to onset so that steps may be taken to intervene (e.g. Clifton et al., 2014; Regalia et al., 2019). The current study expands upon previous research examining physiological correlates of ED behaviors in laboratory settings (see Presseller et al., 2022 for review) by investigating these phenomena in naturalistic settings idiographically using ML.
Our pilot findings suggest that ML can detect the pre-onset of ED behaviors in a personalized manner, 20 minutes pre-onset, using passive wearable data collected in individuals' everyday life, similar to detection of pre-onset of other mental health problems, such as depressive symptoms, anxiety and avoidance symptoms, and behavioral outbursts (Alban et al., 2021; Goodwin et al., 2019; Jacobson & Bhattacharya, 2022; Welch et al., 2022; Welch et al., 2023; Zarate et al., 2022). All idiographic models performed well above the 70% acceptable performance threshold on accuracy, specificity, and sensitivity for each individual (Swets & Picket, 1982; ranging from 84–99% accuracy, 82–100% specificity, and 84–99% sensitivity), which points to the strength of this method despite the high heterogeneity of symptoms and behavior frequencies, even within the same ED diagnosis. These results demonstrate the ML models' ability to provide personalized ED behavior detection within individuals in everyday life despite the heterogeneity of EDs, a range of ED diagnostic types (i.e., BN, AN, AAN, ABN), variation in ED behavior frequency, and other individual differences. As such, wearable sensors may represent a method by which individuals can receive personalized support on their wearable or mobile phone within their environment: a warning of risk, paired with a digital therapeutic intervention, delivered 20 min before ED behavioral episodes to give enough time to prevent ED behaviors, such as purging, that lead to relapse.

We had a small sample size; however, because our goal was to develop idiographic detection, a large sample was not necessary. It has been posited that N = 5 or greater is sufficient for establishing direct replication in single-case designs (e.g. Barlow and Hersen, 1973; Hensen and Barlow, 1984; Kazdin, 2011). Although there was some diversity within our pilot regarding racial and ethnic background, gender and sexual identity, and various EDs, more diverse samples should be included in future studies for generalizability. Additionally, participants endorsing purging behaviors may have indicated other active (e.g., exercise) and passive (e.g., body checking) ED behaviors, and though the momentary endorsement of ED behaviors was cross-checked with end-of-day logs, there is no way to control for accuracy of annotations. Future research should assess the detective abilities of physiological recordings toward behaviors not explicitly examined in the present study and investigate whether some physiological indicators may better classify ED behaviors than others. With a larger sample, we may explore (1) population models and (2) how we may incorporate population-level data into informing idiographic patterns. Overall, results from the current research suggest that ML algorithms using physiological recordings can detect ED behaviors. Because these methods have the potential to identify instances of risk for the onset of maladaptive ED behaviors, physiological recordings from wearables may be integrated into timely digital interventions in the future, such as just-in-time adaptive interventions (JITAIs) to provide personalized interventions at the moments most needed (e.g., delivered via smartphones; Juarascio et al., 2018). For instance, detection of the onset of risk of an ED behavior may trigger a JITAI to disrupt the behavior and intervene in a timely manner.
Next steps may pinpoint the most essential indicators to best classify the pre-onset of ED behaviors, as well as other types of data which may complement and strengthen these efforts.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S003329172300288X.

Acknowledgements

Thank you to all our participants in the Predicting Recovery Study who made this research possible, and to our EAT Lab team and collaborators.

Funding statement

This work is funded by grant R15 MH121445 from the National Institute of Mental Health, National Institutes of Health (NIH). CRN is also funded by grant P20GM103436-20 (KY-INBRE) from the National Institute of General Medical Sciences. CEC is funded by the National Science Foundation (NSF) Graduate Research Fellowship Program under grant No. 2021320143. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH or NSF.

Competing interest

None.

Ethical standards

The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.

References

Alban, A. Q., Ayesh, M., Alhaddad, A. Y., Khalid Al-Ali, A., So, W. C., Connor, O., & Cabibihan, J. J. (2021). Detection of challenging behaviours of children with autism using wearable sensors during interactions with social robots. 2021 30th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2021. https://doi.org/10.1109/RO-MAN50785.2021.9515459
Arcelus, J., Mitchell, A. J., Wales, J., & Nielsen, S. (2011). Mortality rates in patients with anorexia nervosa and other eating disorders: A meta-analysis of 36 studies. Archives of General Psychiatry, 68(7), 724–731. https://doi.org/10.1001/archgenpsychiatry.2011.74
Barlow, D. H., & Hersen, M. (1973). Single-case experimental designs. Uses in applied clinical research. Archives of General Psychiatry, 29(3), 319–325.
Bayer, M. (2012). SQLAlchemy. In Brown, A. & Wilson, G. (Eds.), The architecture of open source applications volume II: Structure, scale, and a few more fearless hacks. aosabook.org. Retrieved from http://aosabook.org/en/sqlalchemy.html
Berends, T., Boonstra, N., & Van Elburg, A. (2018). Relapse in anorexia nervosa: A systematic review and meta-analysis. Current Opinion in Psychiatry, 31(6), 445–455. https://doi.org/10.1097/YCO.0000000000000453
Bidwell, J., Khuwatsamrit, T., Askew, B., Ehrenberg, J. A., & Helmers, S. (2015). Seizure reporting technologies for epilepsy treatment: A review of clinical information needs and supporting technologies. Seizure, 32, 109–117. https://doi.org/10.1016/j.seizure.2015.09.006
Bulik, C. M., Berkman, N. D., Brownley, K. A., Sedway, J. A., & Lohr, K. N. (2007). Anorexia nervosa treatment: A systematic review of randomized controlled trials. International Journal of Eating Disorders, 40(4), 310–320. https://doi.org/10.1002/eat.20367
Christian, C., Cash, E., Cohen, D. A., Trombley, C. M., & Levinson, C. A. (2023). Electrodermal activity and heart rate variability during exposure fear scripts predict trait-level and momentary social anxiety and eating-disorder symptoms in an analogue sample. Clinical Psychological Science, 11(1), 134–148. https://doi.org/10.1177/21677026221083284
Clifton, L., Clifton, D. A., Pimentel, M. A. F., Watkinson, P. J., & Tarassenko, L. (2014). Predictive monitoring of mobile patients by combining clinical observations with data from wearable sensors. IEEE Journal of Biomedical and Health Informatics, 18(3), 722–730. https://doi.org/10.1109/JBHI.2013.2293059
Deloitte Access Economics. (2020). The social and economic cost of eating disorders in the United States of America: A report for the strategic training initiative for the prevention of eating disorders and the academy for eating disorders. Deloitte Access Economics. Retrieved from https://www.hsph.harvard.edu/striped/reporteconomic-costs-of-eating-disorders/
First, M. B., Williams, J. B. W., Karg, R. S., & Spitzer, R. L. (2015). Structured clinical interview for DSM-5 – Research version (SCID-5 for DSM-5, Research version; SCID-5-RV). Arlington: American Psychiatric Association.
Goodwin, M. S., Mazefsky, C. A., Ioannidis, S., Erdogmus, D., & Siegel, M. (2019). Predicting aggression to others in youth with autism using a wearable biosensor. Autism Research, 12(8), 1286–1296. https://doi.org/10.1002/aur.2151
Hasselhorn, K., Ottenstein, C., & Lischetzke, T. (2022). The effects of assessment intensity on participant burden, compliance, within-person variance, and within-person relationships in ambulatory assessment. Behavior Research Methods, 54(4), 1541–1558. https://doi.org/10.3758/s13428-021-01683-6
Hensen, M., & Barlow, D. H. (1984). Single case experimental designs. Pergamon, New York: GEN.
Jacobson, N. C., & Bhattacharya, S. (2022). Digital biomarkers of anxiety disorder symptom changes: Personalized deep learning models using smartphone sensors accurately predict anxiety symptoms from ecological momentary assessments. Behaviour Research and Therapy, 149, 104013. https://doi.org/10.1016/j.brat.2021.104013
Juarascio, A. S., Parker, M. N., Lagacey, M. A., & Godfrey, K. M. (2018). Just-in-time adaptive interventions: A novel approach for enhancing skill utilization and acquisition in cognitive behavioral therapy for eating disorders. International Journal of Eating Disorders, 51(8), 826–830. https://doi.org/10.1002/eat.22924
Kazdin, A. E. (2011). Single-case research designs: Methods for clinical and applied settings (2nd ed.). New York, NY: Oxford University Press.
Keel, P. K., & Mitchell, J. E. (1997). Outcome in bulimia nervosa. American Journal of Psychiatry, 154(3), 312–321. https://doi.org/10.1176/ajp.154.3.313
Krantz, M. J., Blalock, D. V., Tanganyika, K., Farasat, M., McBride, J., & Mehler, P. S. (2020). Is QTc-interval prolongation an inherent feature of eating disorders? A cohort study. The American Journal of Medicine, 133(9), 1088–1094. https://doi.org/10.1016/j.amjmed.2020.02.015
Lavender, J. M., de Young, K. P., Wonderlich, S. A., Crosby, R. D., Engel, S. G., Mitchell, J. E., … le Grange, D. (2013). Daily patterns of anxiety in anorexia nervosa: Associations with eating disorder behaviors in the natural environment. Journal of Abnormal Psychology, 122(3), 672–683. https://doi.org/10.1037/A0031823
Levinson, C. A., Christian, C., Shankar-Ram, S., Brosof, L. C., & Williams, B. (2019). Sensor technology implementation for research, treatment, and assessment of eating disorders. International Journal of Eating Disorders, 52(10), 1176–1180. https://doi.org/10.1002/eat.23120
Levinson, C. A., Hunt, R. A., Christian, C., Williams, B. M., Keshishian, A. C., Vanzhula, I. A., … Ralph-Nearman, C. (2022). Longitudinal group and individual networks of eating disorder symptoms in individuals diagnosed with an eating disorder. Journal of Psychopathology and Clinical Science, 131(1), 58. https://doi.org/10.1037/abn0000727
McCarthy, C., Pradhan, N., Redpath, C., & Adler, A. (2016, May). Validation of the Empatica E4 wristband. In 2016 IEEE EMBS international student conference (ISC) (pp. 1–4). IEEE.
Northrup, J. B., Goodwin, M. S., Peura, C. B., Chen, Q., Taylor, B. J., Siegel, M. S., & Mazefsky, C. A. (2022). Mapping the time course of overt emotion dysregulation, self-injurious behavior, and aggression in psychiatrically hospitalized autistic youth: A naturalistic study. Autism Research, 15(10), 1855–1867. https://doi.org/10.1002/aur.2773
Ortega-Roldan, B., Rodríguez-Ruiz, S., Perakakis, P., Fernandez-Santaella, M. C., & Vila, J. (2014). The emotional and attentional impact of exposure to one's own body in bulimia nervosa: A physiological view. PloS One, 9(7), e102595. https://doi.org/10.1371/journal.pone.0102595
Papežová, H., Yamamotova, A., & Uher, R. (2005). Elevated pain threshold in eating disorders: Physiological and psychological factors. Journal of Psychiatric Research, 39(4), 431–438. https://doi.org/10.1016/j.jpsychires.2004.10.006
Park, J., Jeong, H., Park, J., & Lee, B. C. (2019). Relationships between cognitive workload and physiological response under reliable and unreliable automation. In Advances in Human Factors and Systems Interaction: Proceedings of the AHFE 2018 International Conference on Human Factors and Systems Interaction, July 21-25, 2018, Loews Sapphire Falls Resort at Universal Studios, Orlando, Florida, USA 9 (pp. 3–8). Springer International Publishing.
Presseller, E. K., Parker, M. N., Lin, M., Weimer, J., & Juarascio, A. S. (2020). The application of continuous glucose monitoring technology to eating disorders research: An idea worth researching. International Journal of Eating Disorders, 53(12), 1901–1905. https://doi.org/10.1002/eat.23404
Presseller, E. K., Patarinski, A. G. G., Fan, S. C., Lampe, E. W., & Juarascio, A. S. (2022). Sensor technology in eating disorders research: A systematic review. International Journal of Eating Disorders, 55(5), 573–624. https://doi.org/10.1002/EAT.23715
Ragot, M., Martin, N., Em, S., Pallamin, N., & Diverrez, J. M. (2018). Emotion recognition using physiological signals: Laboratory vs. wearable sensors. In Advances in Human Factors in Wearable Technologies and Game Design: Proceedings of the AHFE 2017 International Conference on Advances in Human Factors and Wearable Technologies, July 17-21, 2017, The Westin Bonaventure Hotel, Los Angeles, California, USA 8 (pp. 15–22). Springer International Publishing.
Ralph-Nearman, C., Osborn, K. D., Chang, R. S., & Barber, K. E. (2023, under review). Wearable physiological indices related to eating disorders: A systematic and methodological review.
Ranzenhofer, L. M., Engel, S. G., Crosby, R. D., Haigney, M., Anderson, M., McCaffery, J. M., & Tanofsky-Kraff, M. (2016). Real-time assessment of heart rate variability and loss of control eating in adolescent girls: A pilot study. The International Journal of Eating Disorders, 49(2), 197–201. https://doi.org/10.1002/eat.22464
Ravindran, K. K., Della Monica, C., Atzori, G., Lambert, D., Revell, V., & Dijk, D. J. (2022, July). Evaluating the Empatica E4 derived heart rate and heart rate variability measures in older men and women. In 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) (pp. 3370–3373). IEEE.
Regalia, G., Onorati, F., Lai, M., Caborni, C., & Picard, R. W. (2019). Multimodal wrist-worn devices for seizure detection and advancing research: Focus on the Empatica wristbands. Epilepsy Research, 153, 79–82. https://doi.org/10.1016/j.eplepsyres.2019.02.007
Schaefer, L. M., Engel, S. G., & Wonderlich, S. A. (2020). Ecological momentary assessment in eating disorders research: Recent findings and promising new directions. Current Opinion in Psychiatry, 33(6), 528–533. https://doi.org/10.1097/YCO.0000000000000639
Schuurmans, A. A., de Looff, P., Nijhof, K. S., Rosada, C., Scholte, R. H., Popma, A., & Otten, R. (2020). Validity of the Empatica E4 wristband to measure heart rate variability (HRV) parameters: A comparison to electrocardiography (ECG). Journal of Medical Systems, 44, 111. https://doi.org/10.1007/s10916-020-01648-w
Smith, K. E., & Juarascio, A. (2019). From ecological momentary assessment (EMA) to ecological momentary intervention (EMI): Past and future directions for ambulatory assessment and interventions in eating disorders. Current Psychiatry Reports, 21(7), 53. https://doi.org/10.1007/s11920-019-1046-8
Smith, K. E., Mason, T. B., Juarascio, A., Schaefer, L. M., Crosby, R. D., Engel, S. G., & Wonderlich, S. A. (2019). Moving beyond self-report data collection in the natural environment: A review of the past and future directions for ambulatory assessment in eating disorders. International Journal of Eating Disorders, 52(10), 1157–1175. https://doi.org/10.1002/EAT.23124
Steinhausen, H. C. (2009). Outcome of eating disorders. Child and Adolescent Psychiatric Clinics of North America, 18(1), 225–242. https://doi.org/10.1016/j.chc.2008.07.013
Steinhausen, H. C., & Weber, S. (2009). The outcome of bulimia nervosa: Findings from one-quarter century of research. American Journal of Psychiatry, 166(12), 1331–1341. https://doi.org/10.1176/appi.ajp.2009.09040582
Stice, E., Fisher, M., & Martinez, E. (2004). Eating disorder diagnostic scale: Additional evidence of reliability and validity. Psychological Assessment, 16(1), 60–71. https://doi.org/10.1037/1040-3590.16.1.60
Stice, E., Telch, C. F., & Rizvi, S. L. (2000). Development and validation of the Eating Disorder Diagnostic Scale: A brief self-report measure of anorexia, bulimia, and binge-eating disorder. Psychological Assessment, 12(2), 123–131. https://doi.org/10.1037//1040-3590.12.2.123
Stone, A. A., & Shiffman, S. (1994). Ecological momentary assessment (EMA) in behavioral medicine. Annals of Behavioral Medicine, 16(3), 199–202. https://doi.org/10.1093/abm/16.3.199
Swets, J., & Picket, R. (1982). Evaluation of diagnostic systems: Methods from signal detection theory. Academic Press: Cambridge, Massachusetts.
Thompson, D. A., Berg, K. M., & Shatford, L. A. (1987). The heterogeneity of bulimic symptomatology: Cognitive and behavioral dimensions. International Journal of Eating Disorders, 6(2), 215–234. https://doi.org/10.1002/1098-108X(198703)6:2<215::AID-EAT2260060206>3.0.CO;2-J
van den Berg, E., Houtzager, L., de Vos, J., Daemen, I., Katsaragaki, G., Karyotaki, E., … Dekker, J. (2019). Meta-analysis on the efficacy of psychological treatments for anorexia nervosa. European Eating Disorders Review, 27(4), 331–351. https://doi.org/10.1002/erv.2683
van Lier, H. G., Pieterse, M. E., Garde, A., Postel, M. G., de Haan, H. A., Vollenbroek-Hutten, M. M., … Noordzij, M. L. (2020). A standardized validity assessment protocol for physiological signals from wearable technology: Methodological underpinnings and an application to the E4 biosensor. Behavior Research Methods, 52(2), 607–629. https://doi.org/10.3758/s13428-019-01263-9
Van Rossum, G., & Drake, F. L. (2009). Python 3 reference manual. Scotts Valley, CA: CreateSpace.
Walsh, B. T., Xu, T., Wang, Y., Attia, E., & Kaplan, A. S. (2021). Time course of relapse following acute treatment for anorexia nervosa. The American Journal of Psychiatry, 178(9), 848–853. https://doi.org/10.1176/appi.ajp.2021.21010026
Wang, S. B. (2021). Machine learning to advance the prediction, prevention and treatment of eating disorders. European Eating Disorders Review: the Journal of the Eating Disorders Association, 29(5), 683–691. https://doi.org/10.1002/erv.2850
Welch, K. C., Pennington, R., Vanaparthy, S., Do, H. M., Narayanan, R., Popa, D., … Kuravackel, G. (2023). Using physiological signals and machine learning algorithms to measure attentiveness during robot-assisted social skills intervention: A case study of two children with autism spectrum disorder. IEEE Instrumentation & Measurement Magazine, 26(3), 39–45.
Welch, K. C., Warning, N., Narayanan, R., Nethala, P., Do, H., Vanaparthy, S., … Daisey, S. (2022). First impressions of a NAO robot: Evaluating emotions through E4 data and questionnaires. 2022 Conference of the International Society for Research on Emotion, ISRE 2022.
Wright, A. G. C., & Zimmermann, J. (2019). Applied ambulatory assessment: Integrating idiographic and nomothetic principles of measurement. Psychological Assessment, 31(12), 1467. https://doi.org/10.1037/pas0000685
Zarate, D., Stavropoulos, V., Ball, M., de Sena Collier, G., & Jacobson, N. C. (2022). Exploring the digital footprint of depression: A PRISMA systematic literature review of the empirical evidence. BMC Psychiatry, 22(1), 124. https://doi.org/10.1186/S12888-022-04013-Y
Table 1. Sociodemographic characteristics of participants

Table 2. Descriptives of frequency of ED behaviors, days and time band was worn, age, and medications by participant

Table 3. Accuracy, specificity, and sensitivity of idiographic LRC models (N = 1)