Murad Ali Khan, Jong-Hyun Jang, Naeem Iqbal, Harun Jamil, Syed Shehryar Ali Naqvi, Salabat Khan, Jae-Chul Kim, Do-Hyeun Kim
CAAI Transactions on Intelligence Technology, vol. 10, no. 4, pp. 983-1006, published 2025-03-05. DOI: 10.1049/cit2.70000. https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/cit2.70000
Enhancing patient rehabilitation predictions with a hybrid anomaly detection model: Density-based clustering and interquartile range methods
In recent years, there has been a concerted effort to improve anomaly detection techniques, particularly in the context of high-dimensional, distributed clinical data. Analysing patient data within clinical settings reveals a pronounced focus on refining diagnostic accuracy, personalising treatment plans, and optimising resource allocation to enhance clinical outcomes. Nonetheless, this domain faces unique challenges, such as irregular data collection, inconsistent data quality, and patient-specific structural variations. To address these challenges, this paper proposes a novel hybrid approach that integrates heuristic and stochastic methods for anomaly detection in patient clinical data. The strategy first applies HPO-based optimal Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to cluster patient exercise data, facilitating efficient anomaly identification. Subsequently, a stochastic method based on the Interquartile Range (IQR) filters unreliable data points, ensuring that medical tools and professionals receive only the most pertinent and accurate information. The primary objective of this study is to equip healthcare professionals and researchers with a robust tool for managing extensive, high-dimensional clinical datasets, enabling effective isolation and removal of aberrant data points. Furthermore, a regression model has been developed using Automated Machine Learning (AutoML) to assess the impact of the ensemble abnormal pattern detection approach. Various statistical error estimation techniques validate the efficacy of the hybrid approach alongside AutoML. Experimental results show that applying the hybrid model to patient rehabilitation data improves AutoML performance by an average of 0.041 in the $R^{2}$ score, surpassing the effectiveness of traditional regression models.
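The two-stage idea in the abstract (density-based clustering to flag noise, then an interquartile-range filter on what remains) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `dbscan`, `iqr_bounds` and `hybrid_filter`, the fixed `eps` and `min_pts` values, the 1.5 fence multiplier, and the choice to apply the IQR filter per feature across all non-noise points are assumptions made for illustration. The abstract states that the DBSCAN parameters are chosen by HPO, which is not reproduced here.

```python
import math
import statistics


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN: return one label per point (0, 1, ... or -1 for noise)."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (includes i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def hybrid_filter(points, eps, min_pts, k=1.5):
    """Return indices of points that survive both stages (assumed ordering)."""
    # Stage 1: drop points that DBSCAN labels as noise.
    labels = dbscan(points, eps, min_pts)
    kept = [i for i, label in enumerate(labels) if label != -1]
    if len(kept) < 2:
        return kept  # too few points to estimate quartiles
    # Stage 2: drop points outside the IQR fences of any feature.
    dims = range(len(points[0]))
    bounds = [iqr_bounds([points[i][d] for i in kept], k) for d in dims]
    return [
        i for i in kept
        if all(bounds[d][0] <= points[i][d] <= bounds[d][1] for d in dims)
    ]


if __name__ == "__main__":
    # Hypothetical one-feature data: a dense run, two stragglers, one far point.
    data = [(i / 10,) for i in range(11)] + [(1.4,), (1.9,), (5.0,)]
    print(hybrid_filter(data, eps=0.55, min_pts=3))
```

On this toy data, the point at 5.0 is isolated and is removed as DBSCAN noise, while the point at 1.9 is attached to the cluster as a border point and is only removed by the IQR fence. That is the kind of case a second, distribution-based stage is meant to catch. In practice a library implementation such as scikit-learn's `DBSCAN` would normally replace the hand-written clustering function.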
About the journal:
CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. We are a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI) providing research which is openly accessible to read and share worldwide.