PD33 Development And Validation Of A Machine Learning-Based Prediction Model For COVID-19 Diagnosis Using Patients’ Metabolomic Profile Data

Alexandre Cobre; Monica Surek; Dile Stremel; Karime Domingues; Fernanda Stumpf Tonin; Roberto Pontarolo

doi:10.1017/S0266462322002926

Introduction

We aimed to develop and validate machine learning (ML)-based algorithms to predict COVID-19 diagnosis and to identify new biomarkers associated with the disease.

Methods

Initially, 96 blood samples from patients diagnosed with COVID-19 (Taizhou Hospital, China) were analyzed by liquid chromatography coupled to mass spectrometry. Samples from patients with other pneumonias or severe acute respiratory syndrome who tested negative for SARS-CoV-2 by RT-PCR were used as positive controls. Samples from healthy volunteers were used as negative controls. The final database included around 1000 metabolites. Exploratory analysis by principal component analysis (PCA) was performed before the ML-based models were developed. A plot of leverage versus studentized residuals was used to detect outliers. Three supervised ML-based models were developed: partial least squares discriminant analysis (PLS-DA), artificial neural network discriminant analysis (ANNDA) and k-nearest neighbors (KNN). Samples for the training (70%) and testing (30%) sets were selected using the Kennard-Stone algorithm. Model performance was evaluated in terms of accuracy, sensitivity and specificity. Analyses were conducted in SOLO (Eigenvector Research).
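
The sample-selection step can be illustrated with a minimal NumPy sketch of the Kennard-Stone algorithm as it is usually described: start from the two most distant samples, then repeatedly add the sample whose nearest already-selected neighbor is farthest away. This is not the SOLO implementation used in the study; the function names `kennard_stone` and `ks_split`, the use of Euclidean distance, and the random demonstration matrix are assumptions made for illustration only.

```python
import numpy as np


def kennard_stone(X, n_select):
    """Return indices of n_select rows of X chosen by Kennard-Stone.

    Assumption: Euclidean distance on the raw feature matrix.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")

    # Full pairwise distance matrix (fine for a few hundred samples).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    # Seed with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]

    # For every sample, track the distance to its nearest selected sample.
    min_dist = np.minimum(dist[i], dist[j])
    min_dist[selected] = -np.inf  # never pick a sample twice

    while len(selected) < n_select:
        k = int(np.argmax(min_dist))  # farthest from the selected set
        selected.append(k)
        min_dist = np.minimum(min_dist, dist[k])
        min_dist[k] = -np.inf
    return selected


def ks_split(X, train_fraction=0.7):
    """Split row indices into training and testing sets (hypothetical helper)."""
    n = np.asarray(X).shape[0]
    train = kennard_stone(X, int(round(train_fraction * n)))
    test = [idx for idx in range(n) if idx not in set(train)]
    return train, test


if __name__ == "__main__":
    # Synthetic stand-in data, NOT the study's metabolomic profiles.
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(20, 5))
    train_idx, test_idx = ks_split(X_demo, train_fraction=0.7)
    print(len(train_idx), len(test_idx))
```

Because the procedure depends only on the distances between samples, it is deterministic: the same data always yield the same training set, which is intended to cover the measured feature space as evenly as possible.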

Results

The PCA model distinguished the three classes of patient samples (positive for COVID-19, negative controls, positive controls) with a cumulative explained variance of 94.27 percent. The PLS-DA model presented the best performance (accuracy, sensitivity, and specificity of 93%, 98% and 88%, respectively). Increased levels of the biomarkers uridine (linked to glucose homeostasis and to lipid and amino acid metabolism), 4-hydroxyphenylacetoylcarnitine (a metabolite of tyrosine metabolism; probably associated with anorexia) and ribothymidine (resulting from alterations in the oral and fecal microbiota) were significantly associated with COVID-19.
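
For readers less familiar with the three reported metrics, the sketch below shows how accuracy, sensitivity and specificity are conventionally computed from true and predicted labels. It assumes a binary framing (COVID-19 versus all other samples); the abstract does not state how the three classes were handled, so that framing, the function name `binary_metrics`, and the toy labels are illustrative assumptions and not the study's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # diseased sample correctly flagged
        elif truth == positive:
            fn += 1  # diseased sample missed
        elif pred == positive:
            fp += 1  # non-diseased sample wrongly flagged
        else:
            tn += 1  # non-diseased sample correctly cleared

    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    # Toy labels for illustration only (1 = COVID-19, 0 = control).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
    acc, sens, spec = binary_metrics(y_true, y_pred)
    print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sensitivity is the fraction of truly positive samples that the model detects, and specificity is the fraction of truly negative samples that it correctly rules out.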

Conclusions

Three different and updated ML-based algorithms were developed to predict COVID-19 diagnosis; PLS-DA led to the most accurate results. High levels of some metabolites were identified as potential predictors of the disease. These biomarkers should be further evaluated as potential therapeutic targets in well-designed clinical trials. These ML-based models may support the early diagnosis of COVID-19 and guide the development of tailored interventions.
