Anomaly Detection in XML databases by means of Association Rules
G. Bruno, P. Garza, E. Quintarelli, R. Rossato
18th International Workshop on Database and Expert Systems Applications (DEXA 2007), 2007-09-03
DOI: 10.1109/DEXA.2007.68 (https://doi.org/10.1109/DEXA.2007.68)
Citations: 21
Abstract
Anomaly detection serves two purposes: discovering interesting exceptions and identifying incorrect data in large datasets. Since anomalies are rare events that violate the frequent relationships among data, we propose a method that first detects frequent relationships and then extracts anomalies. The RADAR (Research of Anomalous Data through Association Rules) method uses data mining techniques to extract frequent "rules" from datasets, in the form of quasi-functional dependencies. These dependencies are extracted by means of association rules. Given a quasi-functional dependency, the associated anomalies can be discovered by querying either the original database or the previously mined association rules. Analyzing these anomalies can either reveal erroneous data or highlight novel information that represents significant outliers with respect to the frequent rules. The method requires no prior knowledge and infers rules directly from the data. Experiments on real XML databases are reported to show the applicability and effectiveness of the proposed approach.
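The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the RADAR algorithm as published. The dependency degree used here (the share of records that agree with the dominant right-hand value of their left-hand group), the thresholds `min_degree` and `max_conf`, the function names, and the `city`/`country` records are all assumptions made for illustration. The sketch treats each (left value, right value) pair as an association rule with a confidence, accepts the attribute pair as quasi-functional when the degree is high enough, and reports records matching low-confidence rules as anomalies.

```python
from collections import Counter, defaultdict


def rule_counts(records, lhs, rhs):
    """Count how often each rhs value co-occurs with each lhs value."""
    groups = defaultdict(Counter)
    for rec in records:
        groups[rec[lhs]][rec[rhs]] += 1
    return groups


def dependency_degree(groups):
    """Share of records agreeing with the dominant rhs value of their lhs group.

    Assumed definition for this sketch; 1.0 means an exact functional dependency.
    """
    total = sum(sum(counts.values()) for counts in groups.values())
    agreeing = sum(counts.most_common(1)[0][1] for counts in groups.values())
    return agreeing / total


def find_anomalies(records, lhs, rhs, min_degree=0.9, max_conf=0.2):
    """Return (record, confidence) pairs that violate a quasi-functional lhs -> rhs."""
    if not records:
        return []
    groups = rule_counts(records, lhs, rhs)
    # Only look for anomalies if lhs -> rhs holds "almost always".
    if dependency_degree(groups) < min_degree:
        return []
    anomalies = []
    for rec in records:
        counts = groups[rec[lhs]]
        # Confidence of the rule (lhs = value) -> (rhs = value) for this record.
        conf = counts[rec[rhs]] / sum(counts.values())
        dominant = counts.most_common(1)[0][0]
        if rec[rhs] != dominant and conf <= max_conf:
            anomalies.append((rec, conf))
    return anomalies


# Hypothetical data: city -> country holds for 14 of 15 records.
records = (
    [{"city": "Turin", "country": "Italy"}] * 9
    + [{"city": "Turin", "country": "France"}]
    + [{"city": "Paris", "country": "France"}] * 5
)
anomalies = find_anomalies(records, "city", "country")
for rec, conf in anomalies:
    print(rec, round(conf, 2))
```

With this data the only record reported is the one pairing Turin with France. Whether such a record is an error or a genuinely interesting exception still has to be decided by inspection, as the abstract notes.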