Selecting Indispensable Edge Patterns With Adaptive Sampling and Double Local Analysis for Data Description
Authors: Huina Li, Yuan Ping
Journal: Journal of Cases on Information Technology (impact factor 0.7; JCR Q4, Computer Science, Information Systems)
Published: 2024-01-12
DOI: 10.4018/jcit.335945 (https://doi.org/10.4018/jcit.335945)
Citations: 0
Abstract
Support vector data description (SVDD) has inspired work in data analysis, adversarial training, and machine unlearning. However, collecting support vectors is computationally expensive, and the alternative, boundary selection at O(N^2) cost, remains a challenge. The authors propose an indispensable edge pattern selection method (IEPS) for data description with direct SVDD model building. IEPS uses a double local analysis to select the global edge patterns. Edge patterns belong to a subset of the target problem of SVDD and its variants, and neighbor analysis becomes pivotal. While an excessive number of participating data points results in redundant computation, an insufficient number may impede data separability or compromise the model's quality. Consequently, a data-adaptive sampling strategy has been devised to determine an optimal ratio of retained data for edge pattern selection. Extensive experiments indicate that IEPS keeps the indispensable edge patterns for data description while reducing interference in norm vector generation, to guarantee effectiveness in clustering analysis.
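To make the idea of neighbor-based edge pattern selection concrete, the following is a minimal sketch of a generic technique, not the authors' IEPS. It assumes a simple rule: for each point, sum the unit vectors toward its k nearest neighbors to get an approximate normal vector, then treat the point as an edge pattern when most neighbors lie on the positive side of that vector. The function name `select_edge_patterns` and the parameters `k` and `threshold` are hypothetical, and the adaptive sampling and double local analysis described in the abstract are not reproduced here.

```python
import numpy as np


def select_edge_patterns(X, k=8, threshold=0.9):
    """Return indices of points whose k nearest neighbors lie mostly on one side.

    Generic illustration only; this is an assumed rule, not the IEPS algorithm.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")

    # Pairwise squared distances: this is the O(N^2) step.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    neighbors = np.argsort(d2, axis=1)[:, :k]

    edge = []
    for i in range(n):
        v = X[neighbors[i]] - X[i]  # vectors from point i to its neighbors
        lengths = np.linalg.norm(v, axis=1, keepdims=True)
        u = v / np.maximum(lengths, 1e-12)  # unit vectors, guarded against zero length
        normal = u.sum(axis=0)  # approximate normal, pointing toward the data
        frac = np.mean(u @ normal > 0)  # share of neighbors on the positive side
        if frac >= threshold:
            edge.append(i)
    return np.array(edge, dtype=int)
```

Called on an `(N, d)` array, the function returns the indices of candidate edge patterns, which could then be passed to an SVDD solver in place of the full data set. The pairwise distance matrix is what makes the cost quadratic in N, which is the kind of cost a sampling step applied before selection would aim to reduce.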
About the journal:
JCIT documents comprehensive, real-life cases based on individual, organizational, and societal experiences related to the utilization and management of information technology. Cases published in JCIT deal with a wide variety of organizations, such as businesses, government organizations, educational institutions, libraries, and non-profit organizations. Additionally, cases published in JCIT report not only successful utilization of IT applications but also failures and mismanagement of IT resources and applications.