Early detection of Multidrug Resistance using Multivariate Time Series analysis and interpretable patient-similarity representations

Impact Factor 4.9 | Zone 2, Medicine | JCR Q1: Computer Science, Interdisciplinary Applications

Computer methods and programs in biomedicine | Pub Date: 2025-07-12 | DOI: 10.1016/j.cmpb.2025.108920

Óscar Escudero-Arnanz, Antonio G. Marques, Inmaculada Mora-Jiménez, Joaquín Álvarez-Rodríguez, Cristina Soguero-Ruiz

Computer methods and programs in biomedicine, Volume 270, Article 108920. Journal Article. https://www.sciencedirect.com/science/article/pii/S0169260725003372

Citations: 0

Abstract

Background and Objectives:

Multidrug Resistance (MDR) has been identified by the World Health Organization as a major global health threat. It leads to severe social and economic consequences, including extended hospital stays, increased healthcare costs, and higher mortality rates. In response to this challenge, this study proposes a novel interpretable Machine Learning (ML) approach for predicting MDR, developed with two primary objectives: accurate inference and enhanced explainability.

Methods:

For inference, the proposed method predicts MDR outcomes from patient-to-patient similarity representations. Each patient is modeled as a Multivariate Time Series (MTS), capturing both clinical progression and interactions with similar patients. To quantify these relationships, we employ MTS-based similarity metrics, including feature engineering using descriptive statistics, Dynamic Time Warping (DTW), and the Time Cluster Kernel (TCK). The resulting representations are used as inputs for MDR classification through Logistic Regression, Random Forest, and Support Vector Machines, with dimensionality reduction and kernel transformations applied to enhance model performance.

For explainability, we employ graph-based methods to extract meaningful patterns from the data. Patient similarity networks are generated using the MTS-based similarity metrics mentioned above, while spectral clustering and t-SNE are applied to identify MDR-related subgroups, uncover clinically relevant patterns, and visualize high-risk clusters. These insights improve interpretability and support more informed decision-making in critical care settings.
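To make the similarity step concrete, the following is a minimal sketch, not the authors' implementation (their code is in the linked repository). It assumes each patient is a list of time steps, each time step a list of feature values, and it uses Dynamic Time Warping followed by an exponential transform to obtain a similarity in (0, 1]. The function names, the `sigma` parameter, the 1-nearest-neighbour graph rule, and the toy data are all illustrative assumptions. Note that a similarity built this way is not guaranteed to be a valid positive semi-definite kernel.

```python
import math

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two multivariate series.

    a, b: lists of time steps; each time step is a list of feature values.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between time steps
            cost[i][j] = step + min(cost[i - 1][j],      # repeat a step of b
                                    cost[i][j - 1],      # repeat a step of a
                                    cost[i - 1][j - 1])  # advance both series
    return cost[n][m]

def similarity_matrix(patients, sigma=1.0):
    """Symmetric patient-to-patient similarity exp(-DTW / sigma) (assumed transform)."""
    n = len(patients)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = math.exp(-dtw_distance(patients[i], patients[j]) / sigma)
            sim[i][j] = sim[j][i] = s
    return sim

def knn_graph(sim, k=1):
    """Undirected edges linking each patient to its k most similar other patients."""
    edges = set()
    for i, row in enumerate(sim):
        others = [j for j in range(len(row)) if j != i]
        nearest = sorted(others, key=lambda j: row[j], reverse=True)[:k]
        for j in nearest:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)

# Toy data (hypothetical, not from the study): patients 0 and 1 follow the same
# trajectory at different speeds, as do patients 2 and 3.
patients = [
    [[0, 0], [1, 0], [2, 0]],
    [[0, 0], [0, 0], [1, 0], [2, 0]],
    [[5, 5], [6, 5], [7, 5]],
    [[5, 5], [6, 5], [6, 5], [7, 5]],
]
sim = similarity_matrix(patients)
edges = knn_graph(sim, k=1)
print(edges)
```

In a fuller pipeline, a matrix like `sim` could serve as a precomputed affinity for spectral clustering or as kernel-style input features for a classifier; which of these the study actually does at each stage should be checked against the paper and repository.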

Results:

We validate our architecture on real-world Electronic Health Records from the Intensive Care Unit (ICU) dataset at the University Hospital of Fuenlabrada, achieving an Area Under the Receiver Operating Characteristic Curve (ROC AUC) of 81%. Our framework surpasses ML and deep learning models on the same dataset by leveraging graph-based patient similarity. In addition, it offers a simple yet effective interpretability mechanism that facilitates the identification of key risk factors (such as prolonged antibiotic exposure, invasive procedures, co-infections, and extended ICU stays) and the discovery of clinically meaningful patient clusters. For transparency, all results and code are available at https://github.com/oscarescuderoarnanz/DM4MTS.
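For readers unfamiliar with the reported metric, the sketch below shows one standard way to compute the area under the ROC curve: as the fraction of (positive, negative) patient pairs in which the positive patient receives the higher score, with ties counted as one half. The function name, labels, and scores are hypothetical toy values and are unrelated to the study's data or its 81% result.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5).

    labels: 1 for the positive class (e.g. MDR), 0 for the negative class.
    scores: predicted risk scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical values): 2 positive and 3 negative patients.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

Here five of the six positive-negative pairs are ranked correctly, so the value is 5/6. The pairwise form is quadratic in the number of patients, which is fine for illustration; library implementations typically use a sort-based computation instead.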

Conclusions:

This study demonstrates the effectiveness of patient similarity representations and graph-based methods for MDR prediction and interpretability. The approach enhances prediction, identifies key risk factors, and improves patient stratification, which enables early detection and targeted interventions. These results highlight the potential of interpretable ML in critical care.

Source journal

Computer methods and programs in biomedicine (Engineering and Technology: Biomedical Engineering)

CiteScore: 12.30
Self-citation rate: 6.60%
Articles published: 601
Review time: 135 days

Journal description: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; to promote the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.