A Parametric and Non-Parametric Approach for High-Accurate Outlier Detection

IF 0.5, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS
Mohamed Jaward Bah, Honghi Wang
Journal of Information Science and Engineering, pp. 441-465 (volume field as listed: 55 1)
Published: 2020-03-01
Publication type: Journal Article
DOI: 10.6688/JISE.202003_36(2).0018
URL: https://doi.org/10.6688/JISE.202003_36(2).0018
Open access: no
Citations: 0

Abstract

Outlier detection is an essential problem that has been studied across a wide range of applications in diverse fields. One common approach is to use statistical models, but these methods have inherent challenges and drawbacks, for instance in achieving a high detection rate while minimizing computational cost. The statistical techniques proposed so far are classified mainly into parametric and non-parametric methods. To the best of our knowledge, evaluating these two families against each other remains an open research direction, and most earlier statistical methods have not shown high outlier detection accuracy. In this paper, under the general umbrella of the statistical approach, we propose two algorithms to detect outliers more effectively and improve detection accuracy: Gaussian Mixture Model for Outlier Detection (GMMOD) as the parametric approach and Kernel Density Estimation for Outlier Detection (KDEOD) as the non-parametric approach. The proposed methods are applied to real-world datasets. Our experimental results show that, although both techniques perform well, KDEOD is favored over GMMOD by a small margin in most cases, and both improve on comparable existing algorithms.
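
The contrast between the two families named in the abstract can be illustrated with a minimal sketch. This is not the authors' GMMOD or KDEOD algorithm, whose details the abstract does not give. It only shows the general idea of density-based outlier scoring: fit a parametric model (a Gaussian mixture fitted by EM) or a non-parametric one (a Gaussian kernel density estimate), then flag the points with the lowest estimated density. The 1-D synthetic data, the two planted outliers, the bandwidth of 0.5, and all function names are assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_gmm_1d(data, k=2, iters=100):
    """Parametric: fit a k-component 1-D Gaussian mixture with plain EM."""
    srt, n = sorted(data), len(data)
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]  # quantile init
    centre = sum(data) / n
    std0 = math.sqrt(sum((x - centre) ** 2 for x in data) / n) or 1.0
    stds, weights = [std0] * k, [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, s)
                 for w, m, s in zip(weights, means, stds)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean, and std of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            stds[j] = max(math.sqrt(var), 1e-3)  # floor prevents collapse
    return weights, means, stds


def gmm_density(x, params):
    """Mixture density at x for parameters returned by fit_gmm_1d."""
    weights, means, stds = params
    return sum(w * gaussian_pdf(x, m, s)
               for w, m, s in zip(weights, means, stds))


def kde_density(x, data, bandwidth):
    """Non-parametric: Gaussian kernel density estimate at x."""
    return sum(gaussian_pdf(x, xi, bandwidth) for xi in data) / len(data)


def lowest_density(data, densities, n_outliers):
    """Return the n_outliers values with the smallest estimated density."""
    ranked = sorted(zip(densities, data))
    return sorted(x for _, x in ranked[:n_outliers])


def demo():
    """Score synthetic data with both models and flag two points each."""
    rng = random.Random(0)
    data = [rng.gauss(0.0, 1.0) for _ in range(100)]
    data += [rng.gauss(10.0, 1.0) for _ in range(100)]
    data += [-15.0, 25.0]  # two planted outliers (illustrative assumption)
    params = fit_gmm_1d(data, k=2)
    gmm_flags = lowest_density(
        data, [gmm_density(x, params) for x in data], 2)
    kde_flags = lowest_density(
        data, [kde_density(x, data, 0.5) for x in data], 2)
    return gmm_flags, kde_flags


if __name__ == "__main__":
    print(demo())
```

In this sketch the parametric model commits to a fixed number of Gaussian components, while the kernel estimate makes no assumption about the shape of the distribution but depends on the chosen bandwidth. Which trade-off wins on real data is the kind of question the paper's experiments address.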
Source journal
Journal of Information Science and Engineering (Engineering and Technology, Computer Science: Information Systems)
CiteScore: 2.00
Self-citation rate: 0.00%
Articles published: 4
Review time: 8 months
About the journal: The Journal of Information Science and Engineering is dedicated to the dissemination of information on computer science, computer engineering, and computer systems. The journal encourages articles on original research in the areas of computer hardware, software, man-machine interface, theory, and applications; tutorial papers in the above-mentioned areas; and state-of-the-art papers on various aspects of computer systems and applications.