Published online by Cambridge University Press: 23 March 2020
Key lifestyle-environ risk factors contribute to depression, but it is unclear how these risk factors cluster. Machine-learning (ML) algorithms can learn, extract and map underlying patterns, and so can identify groupings of depressed individuals without imposing prior constraints. The aim of this research was to use a large epidemiological study to identify and characterise depression clusters through “Graphing lifestyle-environs using machine-learning methods” (GLUMM).
Two ML algorithms were implemented: an unsupervised self-organised mapping (SOM) algorithm to create GLUMM clusters and a supervised boosted regression algorithm to describe the clusters. Ninety-six “lifestyle-environ” variables were drawn from the National Health and Nutrition Examination Survey (NHANES, 2009–2010). Multivariate logistic regression validated the clusters and controlled for possible sociodemographic confounders.
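To illustrate the unsupervised step, the following is a minimal sketch of a self-organising map written with NumPy only. It is not the authors' implementation: the grid size, learning-rate and neighbourhood schedules, the function names `train_som` and `assign_nodes`, and the four-feature synthetic data are all assumptions chosen for illustration, standing in for the 96 survey variables.

```python
import numpy as np

def train_som(data, rows=3, cols=3, epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM; returns node weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every node, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(len(data)):
            x = data[idx]
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 1e-3    # neighbourhood radius shrinks
            # Best-matching unit: the node whose weights are closest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighbourhood around the BMU on the 2-D grid.
            grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

def assign_nodes(data, weights):
    """Map each observation to the index of its closest node (flattened grid)."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return np.argmin(d, axis=1)

# Synthetic stand-in data (NOT survey data): two well-separated groups.
rng = np.random.default_rng(42)
data = np.vstack([
    rng.normal(0.2, 0.03, size=(50, 4)),
    rng.normal(0.8, 0.03, size=(50, 4)),
])
weights = train_som(data)
labels = assign_nodes(data, weights)
```

Each observation ends up assigned to one map node; in a workflow like the one described, neighbouring nodes would then be grouped into clusters and passed to a supervised model for description.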
The SOM identified two GLUMM cluster solutions. Each solution contained one dominant depressed cluster (GLUMM5-1, GLUMM7-1). Equal proportions of members in each cluster rated as highly depressed (17%). Alcohol consumption and demographics validated the clusters. Boosted regression identified GLUMM5-1 as more informative than GLUMM7-1. Members were more likely to: have problems sleeping; eat unhealthily; have lived ≤ 2 years in their home; live in an old home; perceive themselves as underweight; be exposed to work fumes; have experienced sex at ≤ 14 years; and not perform moderate recreational activities. Both GLUMM5-1 (OR: 7.50, P < 0.001) and GLUMM7-1 (OR: 7.88, P < 0.001) were positively associated with depression, with significant interactions with those married/living with partner (P = 0.001).
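As a reading aid for the odds ratios reported above: in logistic regression an odds ratio is the exponential of the fitted coefficient, so each reported OR corresponds to a log-odds coefficient. The short sketch below does that conversion for the two reported values; the helper names are hypothetical and no model is fitted here.

```python
import math

def odds_ratio_from_coef(coef):
    """Odds ratio implied by a logistic-regression coefficient (log-odds)."""
    return math.exp(coef)

def coef_from_odds_ratio(odds_ratio):
    """Log-odds coefficient implied by a reported odds ratio."""
    return math.log(odds_ratio)

# Odds ratios as reported in the abstract for the two dominant clusters.
beta_5_1 = coef_from_odds_ratio(7.50)
beta_7_1 = coef_from_odds_ratio(7.88)
print(f"GLUMM5-1 log-odds: {beta_5_1:.3f}")
print(f"GLUMM7-1 log-odds: {beta_7_1:.3f}")
```

An OR of 7.50 means the odds of depression for cluster members are 7.5 times the odds for the reference group, holding the other covariates in the model fixed.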
Using ML-based GLUMM to form ordered depressive clusters from multitudinous lifestyle-environ variables enabled a deeper exploration of the heterogeneous data and a better understanding of the relationships between complex mental health factors.
These authors contributed equally to this work.
Abbreviations: DIPIT, Data integration protocol in ten-steps; GLUMM, Graphing lifestyle-environs using machine-learning methods; GLUMM5-1, GLUMM solution 5 cluster 1; GLUMM5-2, GLUMM solution 5 cluster 2; GLUMM7-1, GLUMM solution 7 cluster 1; GLUMM7-3, GLUMM solution 7 cluster 3; GLUMM7-4, GLUMM solution 7 cluster 4; ML, Machine-learning; MART, Multiple additive regression trees; NCHS, National center for health statistics; NHANES, National health and nutrition examination survey; PHQ-9, Patient health questionnaire-9; SOMs, Self-organizing maps