Authors: N. Sokolov, E. Vasiliev, A. Getmanskaya
Venue: Proceedings of the 32nd International Conference on Computer Graphics and Vision
DOI: https://doi.org/10.20948/graphicon-2022-706-714
Publication date: 2023-10-01
Generation and Study of the Synthetic Brain Electron Microscopy Dataset for Segmentation Purpose
Advanced microscopy technologies such as electron microscopy (EM) have opened up a new field of vision for biomedical researchers. Applying artificial intelligence methods to EM data remains difficult, largely because little annotated data is available for training. We therefore either add synthetic images to an annotated real EM dataset or train on a fully synthetic dataset. In this work, we present an algorithm for synthesizing 6 types of organelles. Based on the EPFL dataset, we generated three training sets: 860 real 256x256 fragments (ORG), 6000 synthetic fragments (SYN), and their combination (MIX). Experiments training models for segmentation into 5 and 6 classes showed that, despite the imperfection of the synthetic data, for axons, which are poorly represented in the training set, using synthetic data improves the Dice metric from 0.3 on the original dataset to 0.8 on the mixed and synthetic datasets. The synthetic data strategy provides annotations for free but shifts the effort to producing sufficiently realistic images.
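The results above are reported as per-class Dice scores. As a minimal sketch of how such a score can be computed, assuming integer label masks and the standard definition Dice = 2|A∩B| / (|A| + |B|), the following Python snippet evaluates each class separately. The function name `dice_per_class`, the toy masks, and the choice to return NaN for an absent class are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np


def dice_per_class(pred, target, num_classes):
    """Return a list holding the Dice coefficient of each class label.

    pred and target are integer label masks of identical shape.
    A class absent from both masks gets NaN (an assumption of this sketch).
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    scores = []
    for c in range(num_classes):
        p = pred == c      # pixels predicted as class c
        t = target == c    # pixels annotated as class c
        denom = p.sum() + t.sum()
        if denom == 0:
            scores.append(float("nan"))
        else:
            overlap = np.logical_and(p, t).sum()
            scores.append(2.0 * overlap / denom)
    return scores


# Toy 2x3 label masks (hypothetical data, not taken from the paper).
target = np.array([[0, 0, 1],
                   [1, 2, 2]])
pred = np.array([[0, 1, 1],
                 [1, 2, 0]])
scores = dice_per_class(pred, target, num_classes=3)
print(scores)
```

Computing the score per class rather than over the whole mask is what makes a weakly represented class visible: a rare class such as the axon barely affects a pixel-averaged score, while its own Dice value exposes how well it is segmented.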