Haitao Gan, Zhi Yang, Ming Shi, Zhiwei Ye, Ran Zhou
Improved safe semi-supervised clustering based on capped $\ell_{2,1}$ norm
Fuzzy Sets and Systems, volume 505, Article 109276. Published 2025-01-10. DOI: 10.1016/j.fss.2025.109276
https://www.sciencedirect.com/science/article/pii/S0165011425000156
Abstract
In recent years, safe semi-supervised clustering (S3C) has received increasing attention within the semi-supervised learning community. Existing S3C methods generally first analyze the risk of labeled instances and then try to mitigate the corresponding negative impacts through various risk-based regularization approaches. However, the adverse effects of high-probability mislabeled instances (HPMIs) are not eliminated, and the useful discriminative information they carry is not discovered effectively. To address these issues, we propose an improved S3C method based on the capped $\ell_{2,1}$ norm, called CapS3FCM. The motivation is that the capped $\ell_{2,1}$ norm can effectively filter or find mislabeled instances. Consequently, CapS3FCM introduces two capped $\ell_{2,1}$ norms. The first norm aims to make use of label information while simultaneously alleviating the negative influence of mislabeled instances, especially HPMIs. The second norm further aims to discover useful discriminative information in those HPMIs. Finally, a loss function based on the capped $\ell_{2,1}$ norms is built, and the optimization problem is solved using an efficient iterative optimization strategy. To verify the effectiveness of CapS3FCM, a series of experiments is carried out on several datasets; the results demonstrate that CapS3FCM can outperform the other semi-supervised and S3C methods. These findings validate that the capped $\ell_{2,1}$ norm is both practical and effective.
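To illustrate why a capped $\ell_{2,1}$ norm can filter or find suspicious instances, the sketch below uses the commonly cited general definition: the sum over rows of min(row norm, cap). It is not the CapS3FCM loss, which the abstract does not spell out. The function names, the `residuals` values, and the threshold `cap` are all hypothetical choices made for this example.

```python
import math


def row_norms(rows):
    """Euclidean norm of each row (one row per instance)."""
    return [math.sqrt(sum(x * x for x in row)) for row in rows]


def l21_norm(rows):
    """Plain l2,1 norm: sum of the row norms."""
    return sum(row_norms(rows))


def capped_l21_norm(rows, cap):
    """Capped l2,1 norm: each row contributes at most `cap`."""
    return sum(min(n, cap) for n in row_norms(rows))


def capped_rows(rows, cap):
    """Indices of rows whose norm exceeds `cap` (candidate outliers)."""
    return [i for i, n in enumerate(row_norms(rows)) if n > cap]


# Hypothetical residuals: rows 0 and 1 fit well, row 2 is far off,
# as a mislabeled instance might be.
residuals = [[0.3, 0.4], [0.0, 0.5], [6.0, 8.0]]

plain = l21_norm(residuals)                    # row 2 dominates the sum
capped = capped_l21_norm(residuals, cap=1.0)   # row 2 is clipped to 1.0
flagged = capped_rows(residuals, cap=1.0)      # rows hitting the cap
```

Because a capped row contributes only the constant `cap`, it no longer pulls on the quantities being optimized. The rows that hit the cap are at the same time identified as candidates for closer inspection.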
About the journal:
Since its launch in 1978, the journal Fuzzy Sets and Systems has been devoted to the international advancement of the theory and application of fuzzy sets and systems. The theory of fuzzy sets now encompasses a well-organized corpus of basic notions including (but not restricted to) aggregation operations, a generalized theory of relations, specific measures of information content, and a calculus of fuzzy numbers. Fuzzy sets are also the cornerstone of a non-additive uncertainty theory, namely possibility theory, and of a versatile tool for both linguistic and numerical modeling: fuzzy rule-based systems. Numerous works now combine fuzzy concepts with other scientific disciplines as well as modern technologies.
In mathematics, fuzzy sets have triggered new research topics in connection with category theory, topology, algebra, and analysis. Fuzzy sets are also part of a recent trend in the study of generalized measures and integrals, and are combined with statistical methods. Furthermore, fuzzy sets have strong logical underpinnings in the tradition of many-valued logics.