Study on Optimization of Data-Driven Anomaly Detection
Yiqing Zhou, Rui Liao, Yong-hong Chen
2022 International Conference on Data Science and Its Applications (ICoDSA), published 2022-07-06
DOI: 10.1109/ICoDSA55874.2022.9862914
Starting from the raw sensor readings recorded at different moments, this paper uses the box plot method to separate normal values from outliers. Two types of outliers are then distinguished according to their persistence along the time axis and their linkage across sensors, and a clustering algorithm is used to reclassify the data. Within each class, persistence and linkage are computed, and their sum divided by the maximum possible number of anomalies is taken as the risk coefficient; a threshold on this coefficient separates risk anomalies from non-risk anomalies. A comprehensive evaluation model of the degree of anomaly is then built from quantitative scoring, principal component analysis, and 0-1 programming. Finally, the quantitative evaluation method itself is assessed objectively.
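
The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define persistence or linkage precisely, so the definitions below are assumptions made for illustration. Persistence is taken as the number of outlier flags that continue from the previous time step in the same sensor, linkage as the number of outlier flags that coincide with an outlier in at least one other sensor at the same time step, and the maximum possible number of anomalies as the total number of readings. The function names, the fence multiplier `k = 1.5`, and the sample readings are likewise hypothetical.

```python
import numpy as np


def boxplot_outliers(x, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (the usual box plot rule)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


def risk_coefficient(mask):
    """Compute (persistence + linkage) / max possible anomalies.

    mask: boolean array of shape (time steps, sensors), True where a
    reading was flagged as an outlier. All three quantities below use
    assumed definitions; the source abstract does not spell them out.
    """
    mask = np.asarray(mask, dtype=bool)
    # Assumed persistence: flags that repeat the previous step's flag
    # in the same sensor (longitudinal, along the time axis).
    persistence = int((mask[1:] & mask[:-1]).sum())
    # Assumed linkage: flags that occur while at least one other sensor
    # is also flagged at the same time step (lateral, across sensors).
    linked_rows = mask.sum(axis=1, keepdims=True) >= 2
    linkage = int((mask & linked_rows).sum())
    # Assumed denominator: every reading could in principle be anomalous.
    return (persistence + linkage) / mask.size


# Hypothetical readings: 8 time steps (rows) from 2 sensors (columns).
readings = np.array([
    [10, 5], [10, 5], [11, 6], [10, 5],
    [50, 5], [52, 40], [10, 5], [11, 6],
], dtype=float)

# Apply the box plot rule to each sensor separately.
mask = np.column_stack(
    [boxplot_outliers(readings[:, j]) for j in range(readings.shape[1])]
)
risk = risk_coefficient(mask)
print(mask.sum(axis=0), risk)
```

A threshold on `risk` would then separate risk anomalies from non-risk anomalies, as the abstract describes; the abstract gives no threshold value, so none is assumed here.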