MinoritySalMix and adaptive semantic weight compensation for long-tailed classification
Authors: Wu Zeng, Zheng-ying Xiao
Journal: Image and Vision Computing, volume 152, Article 105307 (published 2024-10-25)
Impact factor: 4.2; JCR: Q2 (Computer Science, Artificial Intelligence)
DOI: 10.1016/j.imavis.2024.105307
URL: https://www.sciencedirect.com/science/article/pii/S0262885624004128
In real-world datasets, long-tailed class distributions are common and often bias models towards majority-class samples at the expense of minority-class samples. We propose a strategy called MASW (MinoritySalMix and adaptive semantic weight compensation) to address this problem. First, we propose a data augmentation method called MinoritySalMix (minority-saliency-mixing), which uses saliency detection to select salient regions from minority-class samples, crops them, and pastes them into the same regions of majority-class samples to generate new samples, thereby increasing the number of images that contain important regions of minority-class samples. Second, to make the labels of the newly generated samples more consistent with their image content, we propose an adaptive semantic compensation factor. This factor gives minority samples additional label weight based on the cropped area, bringing the new label values closer to the content of the generated samples; the more accurate labels in turn improve model performance. Finally, because some current re-sampling strategies lack flexibility in allocating class sampling weights and frequently require manual adjustment, we design an adaptive weight function and incorporate it into the re-sampling strategy to achieve better sampling. Experimental results on three long-tailed datasets show that our method effectively improves model performance and outperforms most advanced long-tailed methods. Furthermore, we applied MinoritySalMix to three balanced datasets, and the results indicate that our method surpasses several advanced data augmentation techniques.
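The three components described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no formulas, so the exhaustive window search, the `gamma` exponent used for label compensation, and the `power` parameter of the sampling weights are all hypothetical stand-ins chosen only to show the general shape of the idea. The saliency map is assumed to come from some external saliency detector.

```python
def most_salient_box(saliency, box_h, box_w):
    """Return (top, left) of the box_h x box_w window with the largest saliency sum."""
    h, w = len(saliency), len(saliency[0])
    best_sum, best_pos = float("-inf"), (0, 0)
    for top in range(h - box_h + 1):
        for left in range(w - box_w + 1):
            total = sum(
                saliency[r][c]
                for r in range(top, top + box_h)
                for c in range(left, left + box_w)
            )
            if total > best_sum:
                best_sum, best_pos = total, (top, left)
    return best_pos


def minority_sal_mix(minority_img, majority_img, saliency, box_h, box_w, gamma=0.5):
    """Paste the most salient minority patch into the same region of the majority image.

    Returns (mixed_image, minority_label_weight, majority_label_weight).
    """
    top, left = most_salient_box(saliency, box_h, box_w)
    mixed = [row[:] for row in majority_img]  # copy so the input is not modified
    for r in range(top, top + box_h):
        mixed[r][left:left + box_w] = minority_img[r][left:left + box_w]

    h, w = len(majority_img), len(majority_img[0])
    area_ratio = (box_h * box_w) / (h * w)  # plain area-proportional label weight
    # Hypothetical compensation (an assumption, not the paper's factor):
    # for 0 < gamma <= 1, area_ratio ** gamma >= area_ratio, so the minority
    # class receives more label weight than its pasted area alone would give.
    minority_weight = area_ratio ** gamma
    return mixed, minority_weight, 1.0 - minority_weight


def class_sampling_weights(class_counts, power=0.5):
    """Hypothetical smoothed inverse-frequency sampling weights, normalized to sum to 1."""
    raw = [count ** -power for count in class_counts]
    total = sum(raw)
    return [value / total for value in raw]


# Toy 4x4 single-channel example: minority image of ones, majority image of zeros.
minority = [[1] * 4 for _ in range(4)]
majority = [[0] * 4 for _ in range(4)]
saliency_map = [
    [0, 0, 0, 0],
    [0, 0, 5, 5],
    [0, 0, 5, 5],
    [0, 0, 0, 0],
]
mixed, w_minority, w_majority = minority_sal_mix(minority, majority, saliency_map, 2, 2)
weights = class_sampling_weights([100, 25, 4])
```

In this sketch the mixed label would be `w_minority` for the minority class and `w_majority` for the majority class, and the per-class weights could be passed to a weighted sampler such as `random.choices(classes, weights=weights, k=n)`. How the paper actually adapts these quantities is not stated in the abstract.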
About the journal:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.