DIO: Efficient interactive outlier analysis over dynamic datasets
Chihiro Sakazume, H. Kitagawa, T. Amagasa
2017 Twelfth International Conference on Digital Information Management (ICDIM), 2017-09-01
DOI: 10.1109/ICDIM.2017.8244652 (https://doi.org/10.1109/ICDIM.2017.8244652)
Citations: 1
Abstract
Outlier detection is an important data mining topic, and distance-based outlier detection is one of its representative methods. However, choosing parameter values that detect the outliers a user actually intends is known to be difficult. To address this problem, an interactive outlier analysis framework named ONION was proposed. ONION analyzes a dataset in advance and constructs index structures that support several types of interactive outlier analysis and help users choose appropriate parameter values. However, ONION assumes static datasets and does not consider updates to them. In this work, we propose a novel scheme named DIO (Dynamic and Interactive Outlier analysis) that makes ONION-like interactive outlier analysis applicable to dynamic datasets. DIO maintains a grid structure over the data objects, together with neighboring-object counters, to avoid expensive distance recomputations and to enable efficient updates of the index structures. Extensive experiments show that DIO achieves substantial performance improvements.
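The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of a grid with neighbor counters, not the DIO scheme or ONION's index structures. It assumes a single fixed radius `r`, a grid whose cell side equals `r`, and the common distance-based definition in which an object is an outlier if it has fewer than `k` neighbors within `r`. The class and method names (`GridNeighborCounter`, `insert`, `delete`, `outliers`) are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor


class GridNeighborCounter:
    """Hypothetical sketch: a grid index plus per-object neighbor counters."""

    def __init__(self, r):
        self.r = r                     # neighborhood radius (assumed fixed)
        self.cells = defaultdict(set)  # cell coordinates -> ids of objects in the cell
        self.points = {}               # object id -> coordinates
        self.count = {}                # object id -> number of neighbors within r

    def _cell(self, p):
        # Cell side equals r, so every neighbor of p lies in p's cell
        # or in one of the directly adjacent cells.
        return tuple(floor(x / self.r) for x in p)

    def _candidates(self, p):
        # Yield the ids stored in the cell of p and in all adjacent cells.
        c = self._cell(p)
        for offset in product((-1, 0, 1), repeat=len(c)):
            key = tuple(a + b for a, b in zip(c, offset))
            yield from self.cells.get(key, ())

    def insert(self, oid, p):
        # Only nearby objects are examined; each true neighbor's counter is bumped.
        n = 0
        for other in self._candidates(p):
            if dist(p, self.points[other]) <= self.r:
                self.count[other] += 1
                n += 1
        self.points[oid] = p
        self.count[oid] = n
        self.cells[self._cell(p)].add(oid)

    def delete(self, oid):
        # Remove the object first, then decrement the counters of its neighbors.
        p = self.points.pop(oid)
        self.cells[self._cell(p)].discard(oid)
        del self.count[oid]
        for other in self._candidates(p):
            if dist(p, self.points[other]) <= self.r:
                self.count[other] -= 1

    def outliers(self, k):
        # Answered from the counters alone, with no distance computations.
        return sorted(o for o, n in self.count.items() if n < k)


idx = GridNeighborCounter(r=1.0)
idx.insert("a", (0.0, 0.0))
idx.insert("b", (0.5, 0.0))
idx.insert("c", (0.0, 0.5))
idx.insert("d", (5.0, 5.0))
print(idx.outliers(k=2))  # ['d']
idx.delete("c")
print(idx.outliers(k=2))  # ['a', 'b', 'd']
```

In this sketch an update touches only the cells adjacent to the changed object, and a query for a given `k` reads the stored counters. Supporting interactive exploration over many values of `r`, as ONION-like analysis does, would need additional structure that this sketch does not attempt.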