Alexey Kochkarev, A. Khvostikov, Dmitry Korshunov, A. Krylov, M. Boguslavskiy
Proceedings of the 30th International Conference on Computer Graphics and Machine Vision (GraphiCon 2020). Part 2. Published 2020-12-17. DOI: https://doi.org/10.51130/graphicon-2020-2-4-19
Data Balancing Method for Training Segmentation Neural Networks
Data imbalance is a common problem in machine learning and image processing. A shortage of training data for the rarest classes can impair learning and degrade segmentation quality. In this paper, we focus on data balancing for image segmentation. We review the major approaches to handling imbalanced data and propose a new data balancing method based on the distance transform. The method is designed for segmentation convolutional neural networks (CNNs), but it is general and can be used with any patch-based machine learning model for segmentation. We evaluate the proposed method on two datasets. The first is the medical dataset LiTS, which contains CT images of the liver with tumors. The second is a geological dataset consisting of photographs of polished sections of different ores. The proposed algorithm improves the balance between classes and the overall performance of the CNN model.
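The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how a distance transform could drive balanced patch sampling; it is not the authors' method. It assumes a 2D label map given as nested lists, uses a simple two-pass city-block distance transform in place of whatever transform the paper uses, and draws the same number of patch centers per class, favoring pixels deep inside each class region. The names `cityblock_distance_transform` and `sample_balanced_centers` are hypothetical.

```python
import random


def cityblock_distance_transform(mask):
    """Two-pass city-block distance from each True pixel to the nearest False pixel."""
    h, w = len(mask), len(mask[0])
    inf = h + w  # exceeds any city-block distance inside the image
    dist = [[inf if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbors.
    for y in range(h):
        for x in range(w):
            if dist[y][x]:
                if y > 0:
                    dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
                if x > 0:
                    dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbors.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if dist[y][x]:
                if y < h - 1:
                    dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
                if x < w - 1:
                    dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def sample_balanced_centers(labels, classes, n_per_class, seed=0):
    """Draw n_per_class patch centers for every class present in the label map.

    Assumption (not from the paper): a pixel's sampling weight is its distance
    to the class boundary, so patches tend to be centered inside the region.
    """
    rng = random.Random(seed)
    centers = []
    for c in classes:
        mask = [[v == c for v in row] for row in labels]
        dist = cityblock_distance_transform(mask)
        coords = [(y, x) for y, row in enumerate(dist)
                  for x, d in enumerate(row) if d > 0]
        if not coords:
            continue  # class is absent from this image
        weights = [dist[y][x] for y, x in coords]
        picks = rng.choices(coords, weights=weights, k=n_per_class)
        centers.extend((c, y, x) for y, x in picks)
    return centers


if __name__ == "__main__":
    # Toy label map: class 1 is a rare 2x2 block inside a 6x6 background.
    labels = [[0] * 6 for _ in range(6)]
    for y, x in [(2, 2), (2, 3), (3, 2), (3, 3)]:
        labels[y][x] = 1
    for c, y, x in sample_balanced_centers(labels, [0, 1], n_per_class=3):
        print(f"class {c}: patch center at ({y}, {x})")
```

Drawing a fixed number of centers per class means the rare class contributes as many training patches as the dominant one, regardless of how few pixels it covers. A real pipeline would likely use a Euclidean distance transform from an image-processing library and would need to handle patches that extend past the image border.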