Óscar Escudero-Arnanz, Antonio G. Marques, Inmaculada Mora-Jiménez, Joaquín Álvarez-Rodríguez, Cristina Soguero-Ruiz
Early detection of Multidrug Resistance using Multivariate Time Series analysis and interpretable patient-similarity representations
Computer Methods and Programs in Biomedicine, vol. 270, Article 108920. Published 2025-07-12. Journal Article.
DOI: 10.1016/j.cmpb.2025.108920
https://www.sciencedirect.com/science/article/pii/S0169260725003372
Journal impact factor: 4.9. JCR: Q1 (Computer Science, Interdisciplinary Applications).
Abstract
Background and Objectives:
Multidrug Resistance (MDR) has been identified by the World Health Organization as a major global health threat. It leads to severe social and economic consequences, including extended hospital stays, increased healthcare costs, and higher mortality rates. In response to this challenge, this study proposes a novel interpretable Machine Learning (ML) approach for predicting MDR, developed with two primary objectives: accurate inference and enhanced explainability.
Methods:
For inference, the proposed method uses patient-to-patient similarity representations to predict MDR outcomes. Each patient is modeled as a Multivariate Time Series (MTS), capturing both clinical progression and interactions with similar patients. To quantify these relationships, we employ MTS-based similarity metrics, including feature engineering using descriptive statistics, Dynamic Time Warping, and the Time Cluster Kernel. The resulting representations are used as inputs for MDR classification through Logistic Regression, Random Forest, and Support Vector Machines, with dimensionality reduction and kernel transformations applied to enhance model performance.
For explainability, we employ graph-based methods to extract meaningful patterns from the data. Patient similarity networks are generated using the MTS-based similarity metrics mentioned above, while spectral clustering and t-SNE are applied to identify MDR-related subgroups, uncover clinically relevant patterns, and visualize high-risk clusters. These insights improve interpretability and support more informed decision-making in critical care settings.
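As a rough illustration of one of the similarity metrics named in the Methods, the sketch below computes a Dynamic Time Warping distance between two multivariate series and turns the pairwise distances into a patient-similarity matrix. It is not taken from the authors' repository: the function names, the Euclidean local cost, the exp(-gamma * d) transformation, the value of gamma, and the toy "patients" are all assumptions made for illustration. Note also that a matrix built this way is not guaranteed to be positive semidefinite, so it may need adjustment before being used as a kernel.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two multivariate series.

    Each series is a list of time steps; each time step is a list of
    feature values of the same width. Local cost: Euclidean distance
    (an assumption, not necessarily the paper's choice).
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def similarity_matrix(series, gamma=1.0):
    """Map pairwise DTW distances to similarities via exp(-gamma * d)."""
    k = len(series)
    sim = [[1.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            s = math.exp(-gamma * dtw_distance(series[i], series[j]))
            sim[i][j] = sim[j][i] = s
    return sim


# Hypothetical toy patients: rows are time steps, columns are two
# made-up clinical variables.
patients = [
    [[0.0, 1.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 1.0], [1.0, 1.0]],
    [[5.0, 0.0], [5.0, 0.0], [6.0, 0.0]],
]
S = similarity_matrix(patients, gamma=0.5)
```

The first two toy patients follow the same trajectory shifted by one time step, so DTW aligns them at zero cost and their similarity is 1.0, while the third patient ends up far from both. A matrix like S could then serve either as a precomputed input to a classifier or as the weighted adjacency of a patient similarity network.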
Results:
We validate our architecture on real-world Electronic Health Records from the Intensive Care Unit (ICU) dataset at the University Hospital of Fuenlabrada, achieving a Receiver Operating Characteristic Area Under the Curve of 81%. Our framework surpasses ML and deep learning models on the same dataset by leveraging graph-based patient similarity. In addition, it offers a simple yet effective interpretability mechanism that facilitates the identification of key risk factors—such as prolonged antibiotic exposure, invasive procedures, co-infections, and extended ICU stays—and the discovery of clinically meaningful patient clusters. For transparency, all results and code are available at https://github.com/oscarescuderoarnanz/DM4MTS.
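For readers less familiar with the reported metric, the following minimal sketch shows how a Receiver Operating Characteristic Area Under the Curve can be computed from labels and scores, using its interpretation as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The function name and the toy labels and scores are hypothetical and have no connection to the study's data or to its 81% figure.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = MDR, 0 = non-MDR, scores are predicted risks.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]
auc = roc_auc(labels, scores)
```

In this toy case five of the six positive-negative pairs are ordered correctly, so the AUC is 5/6. The pairwise loop is quadratic in the number of patients; a rank-based implementation would be preferable for larger cohorts.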
Conclusions:
This study demonstrates the effectiveness of patient similarity representations and graph-based methods for MDR prediction and interpretability. The approach enhances prediction, identifies key risk factors, and improves patient stratification, which enables early detection and targeted interventions and highlights the potential of interpretable ML in critical care.
About the journal:
To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.
Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.