Distance-Based Unsupervised Local Outlier Detection: Based Values Analysis to Improve Outlier Detection Using Machine Learning

IF 1.6; Zone 4 (Computer Science); JCR Q3 (Engineering, Electrical & Electronic)
Atul Kumar Gupta, Rahul Kumar, Jhankar Moolchandani, Vikas Thada, Mohd Asif Shah, Anoop Kumar Tiwari
IET Communications, vol. 19, issue 1. Published 2025-07-04. DOI: 10.1049/cmu2.70060
Article page: https://onlinelibrary.wiley.com/doi/10.1049/cmu2.70060
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cmu2.70060
Citations: 0

Abstract



Detecting outliers remains a challenge for machine learning, especially in high-dimensional datasets. Good data quality is crucial for reliable results, and many algorithms identify outliers by analysing the outlying aspects of individual data objects relative to the other objects in the dataset. The proposed Advanced Distance-Based Unsupervised Local Outlier Detection (DU-LOD) method improves this process by continuously evaluating and identifying outliers using unsupervised learning and distance-based calculations. DU-LOD identifies outliers by comparing each data object with its neighbours, making it the first method to combine unsupervised local outlier detection with nearest cluster point identification. In experiments, DU-LOD achieves an accuracy of 96.12%, a detection rate of 41.89%, a precision of 56.12%, and a recall of 1.79%, which we report as the best performance across these parameters among the existing algorithms compared. Measures such as area under the ROC curve (AUC), precision and recall are therefore more appropriate in such a scenario.
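
The abstract does not give the DU-LOD algorithm itself, so the following is only a minimal sketch of the general idea it describes: scoring each object by comparing its distance to its nearest neighbours with the same quantity for those neighbours. The function names, the ratio-based score, the choice of `k`, and the synthetic data are all assumptions made for illustration, not the authors' method. The second half uses hypothetical labels (not the paper's data) to show how accuracy can stay high while recall is very low when outliers are rare, which is why precision, recall, and AUC are often preferred.

```python
import numpy as np


def knn_distances(X, k):
    """Return each point's mean distance to its k nearest neighbours, plus their indices."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]
    knn = np.take_along_axis(dist, idx, axis=1)
    return knn.mean(axis=1), idx


def local_distance_scores(X, k=5):
    """Illustrative local score (an assumption, not DU-LOD): a point's mean
    k-NN distance divided by the average of the same quantity over its neighbours.
    Scores near 1 look like inliers; large scores look locally isolated.
    Assumes no exact duplicate points, which would give zero distances."""
    mean_d, idx = knn_distances(X, k)
    neighbour_mean = mean_d[idx].mean(axis=1)
    return mean_d / neighbour_mean


def accuracy_precision_recall(y_true, y_pred):
    """Compute accuracy, precision, and recall with outliers as the positive class."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Synthetic data: one tight cluster of 50 points plus a single far-away point (index 50).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)), [[8.0, 8.0]]])
scores = local_distance_scores(X, k=5)
most_outlying = int(np.argmax(scores))  # index of the isolated point

# Hypothetical labels: 1000 objects, 40 true outliers, a detector that flags
# one true outlier and one inlier. Accuracy is high even though recall is tiny.
y_true = np.zeros(1000, dtype=bool)
y_true[:40] = True
y_pred = np.zeros(1000, dtype=bool)
y_pred[0] = True    # one true positive
y_pred[40] = True   # one false positive
acc, prec, rec = accuracy_precision_recall(y_true, y_pred)
# acc = 0.96, prec = 0.5, rec = 0.025
```

The pairwise distance matrix keeps the sketch short but needs memory quadratic in the number of points; a tree-based or approximate neighbour search would be the usual substitute for larger datasets.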

Source journal
IET Communications (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 4.30
Self-citation rate: 6.20%
Articles published: 220
Review time: 5.9 months
Journal description: IET Communications covers fundamental and generic research aimed at a better understanding of communication technologies, harnessing signals to build better-performing communication systems over various wired and/or wireless media. The journal is particularly interested in research papers reporting novel solutions to the dominant problems of noise, interference, timing and errors, in order to reduce system deficiencies such as the waste of scarce resources like spectrum, energy and bandwidth. Topics include, but are not limited to: Coding and Communication Theory; Modulation and Signal Design; Wired, Wireless and Optical Communication; Communication System Special Issues. Current Calls for Papers: Cognitive and AI-enabled Wireless and Mobile - https://digital-library.theiet.org/files/IET_COM_CFP_CAWM.pdf; UAV-Enabled Mobile Edge Computing - https://digital-library.theiet.org/files/IET_COM_CFP_UAV.pdf