Author: Pengfei Zheng
Venue: 2022 International Conference on Sensing, Measurement & Data Analytics in the era of Artificial Intelligence (ICSMD)
Published: 2022-11-30
DOI: https://doi.org/10.1109/ICSMD57530.2022.10058445
MT-ONet: Mixed Transformer O-Net for Medical Image Segmentation
In recent years, deep learning has been widely adopted in medical image analysis. Built on convolutional neural networks (CNNs), the U-Net framework has become the de facto standard for medical image segmentation. Nonetheless, this framework cannot fully capture global, long-range semantic information. Transformer architectures have been shown to capture more global information than U-Net but less local information than CNNs. To improve segmentation and classification performance on medical images while exploiting both global and local information, we integrate O-Net with the Mixed Transformer [1], fusing the advantages of CNNs and Transformers. In the encoder of the proposed O-Net architecture, we combine CNN blocks, the Mixed Transformer, and Local-Global Gaussian-Weighted Self-Attention (LGG-SA) to obtain more global and local contextual information. The decoder combines Mixed Transformer and CNN blocks to produce the final output. We evaluate the segmentation capability of the proposed network on the Synapse multi-organ CT dataset. Experimental results show that the proposed MT-ONet delivers better segmentation performance than state-of-the-art methods, which in turn improves classification precision.
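The abstract names Gaussian-weighted self-attention but does not spell out the mechanism. The sketch below illustrates the general idea only and is not the paper's implementation: it assumes a 1-D token sequence, a fixed (not learned) `sigma`, and a distance penalty subtracted from the attention logits. The function name `gaussian_weighted_attention` and all of its parameters are hypothetical.

```python
import math


def gaussian_weighted_attention(queries, keys, values, sigma=1.0):
    """Self-attention over a 1-D token sequence with a Gaussian distance prior.

    Assumption for illustration: the logit for tokens (i, j) is the scaled
    dot product minus (i - j)^2 / (2 * sigma^2), so distant tokens receive
    less weight. Returns (outputs, attention_weights).
    """
    dim = len(queries[0])
    value_dim = len(values[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        logits = []
        for j, k in enumerate(keys):
            score = sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            logits.append(score - (i - j) ** 2 / (2.0 * sigma ** 2))
        # Numerically stable softmax over the penalised logits.
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        out = [sum(w * v[d] for w, v in zip(weights, values))
               for d in range(value_dim)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Usage: with identical tokens, the weights depend only on distance.
tokens = [[1.0, 0.0] for _ in range(4)]
out, w = gaussian_weighted_attention(tokens, tokens, tokens, sigma=1.0)
```

With identical tokens the dot-product term is constant, so each row of `w` decays purely with distance from the query position. A smaller `sigma` makes the attention more local, and a larger one makes it closer to plain global attention.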