MCSTransWnet: A new deep learning process for postoperative corneal topography prediction based on raw multimodal data from the Pentacam HR system

Nan Cheng, Zhe Zhang, Jing Pan, Xiao-Na Li, Wei-Yi Chen, Guang-Hua Zhang, Wei-Hua Yang
{"title":"MCSTransWnet:基于 Pentacam HR 系统原始多模态数据的术后角膜地形图预测深度学习新流程","authors":"Nan Cheng , Zhe Zhang , Jing Pan , Xiao-Na Li , Wei-Yi Chen , Guang-Hua Zhang , Wei-Hua Yang","doi":"10.1016/j.medntd.2023.100267","DOIUrl":null,"url":null,"abstract":"<div><p>This work provides a new multimodal fusion generative adversarial net (GAN) model, Multiple Conditions Transform W-net (MCSTransWnet), which primarily uses femtosecond laser arcuate keratotomy surgical parameters and preoperative corneal topography to predict postoperative corneal topography in astigmatism-corrected patients. The MCSTransWnet model comprises a generator and a discriminator, and the generator is composed of two sub-generators. The first sub-generator extracts features using the U-net model, vision transform (ViT) and a multi-parameter conditional module branch. The second sub-generator uses a U-net network for further image denoising. The discriminator uses the pixel discriminator in Pix2Pix. Currently, most GAN models are convolutional neural networks; however, due to their feature extraction locality, it is difficult to comprehend the relationships among global features. Thus, we added a vision Transform network as the model branch to extract the global features. It is normally difficult to train the transformer, and image noise and geometric information loss are likely. Hence, we adopted the standard U-net fusion scheme and transform network as the generator, so that global features, local features, and rich image details could be obtained simultaneously. Our experimental results clearly demonstrate that MCSTransWnet successfully predicts postoperative corneal topographies (structural similarity = 0.765, peak signal-to-noise ratio = 16.012, and Fréchet inception distance = 9.264). Using this technique to obtain the rough shape of the postoperative corneal topography in advance gives clinicians more references and guides changes to surgical planning and improves the success rate of surgery.</p></div>","PeriodicalId":33783,"journal":{"name":"Medicine in Novel Technology and Devices","volume":"21 ","pages":"Article 100267"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590093523000620/pdfft?md5=412058a35f3f2c21895df898f0a2d018&pid=1-s2.0-S2590093523000620-main.pdf","citationCount":"0","resultStr":"{\"title\":\"MCSTransWnet: A new deep learning process for postoperative corneal topography prediction based on raw multimodal data from the Pentacam HR system\",\"authors\":\"Nan Cheng , Zhe Zhang , Jing Pan , Xiao-Na Li , Wei-Yi Chen , Guang-Hua Zhang , Wei-Hua Yang\",\"doi\":\"10.1016/j.medntd.2023.100267\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>This work provides a new multimodal fusion generative adversarial net (GAN) model, Multiple Conditions Transform W-net (MCSTransWnet), which primarily uses femtosecond laser arcuate keratotomy surgical parameters and preoperative corneal topography to predict postoperative corneal topography in astigmatism-corrected patients. The MCSTransWnet model comprises a generator and a discriminator, and the generator is composed of two sub-generators. The first sub-generator extracts features using the U-net model, vision transform (ViT) and a multi-parameter conditional module branch. The second sub-generator uses a U-net network for further image denoising. The discriminator uses the pixel discriminator in Pix2Pix. 
Currently, most GAN models are convolutional neural networks; however, due to their feature extraction locality, it is difficult to comprehend the relationships among global features. Thus, we added a vision Transform network as the model branch to extract the global features. It is normally difficult to train the transformer, and image noise and geometric information loss are likely. Hence, we adopted the standard U-net fusion scheme and transform network as the generator, so that global features, local features, and rich image details could be obtained simultaneously. Our experimental results clearly demonstrate that MCSTransWnet successfully predicts postoperative corneal topographies (structural similarity = 0.765, peak signal-to-noise ratio = 16.012, and Fréchet inception distance = 9.264). Using this technique to obtain the rough shape of the postoperative corneal topography in advance gives clinicians more references and guides changes to surgical planning and improves the success rate of surgery.</p></div>\",\"PeriodicalId\":33783,\"journal\":{\"name\":\"Medicine in Novel Technology and Devices\",\"volume\":\"21 \",\"pages\":\"Article 100267\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-10-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2590093523000620/pdfft?md5=412058a35f3f2c21895df898f0a2d018&pid=1-s2.0-S2590093523000620-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medicine in Novel Technology and Devices\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2590093523000620\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Medicine\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medicine in Novel Technology and Devices","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2590093523000620","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Medicine","Score":null,"Total":0}
Abstract
This work presents a new multimodal-fusion generative adversarial network (GAN), Multiple Conditions Transform W-net (MCSTransWnet), which uses femtosecond laser arcuate keratotomy surgical parameters together with preoperative corneal topography to predict postoperative corneal topography in astigmatism-corrected patients. The MCSTransWnet model comprises a generator and a discriminator, and the generator is composed of two sub-generators. The first sub-generator extracts features using a U-net model, a vision transformer (ViT), and a multi-parameter conditional module branch. The second sub-generator uses a further U-net for image denoising. The discriminator is the pixel discriminator from Pix2Pix. Most current GAN models are convolutional neural networks; because their feature extraction is local, they struggle to capture relationships among global features. We therefore added a vision transformer branch to the model to extract global features. Transformers are, however, typically difficult to train and prone to introducing image noise and losing geometric information. Hence, the generator fuses a standard U-net scheme with the transformer network, so that global features, local features, and rich image detail are captured simultaneously. Our experimental results demonstrate that MCSTransWnet successfully predicts postoperative corneal topographies (structural similarity = 0.765, peak signal-to-noise ratio = 16.012, and Fréchet inception distance = 9.264). Obtaining the approximate shape of the postoperative corneal topography in advance gives clinicians an additional reference, can guide adjustments to surgical planning, and may improve the success rate of surgery.
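The abstract describes the architecture only at a high level, and the authors' code is not reproduced here. The following PyTorch sketch is a minimal structural illustration of the described pipeline, not the published implementation: a first sub-generator fusing a U-net, a small ViT branch, and a surgical-parameter conditioning branch; a second U-net sub-generator for refinement; and a Pix2Pix-style pixel discriminator. All class names, layer sizes, the 128×128 map resolution, and the 8-element parameter vector are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Minimal U-net: one downsample, one upsample, one skip connection."""
    def __init__(self, in_ch, out_ch, base=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, base, 4, 2, 1), nn.LeakyReLU(0.2))
        self.dec = nn.Sequential(nn.ConvTranspose2d(base, base, 4, 2, 1), nn.ReLU())
        self.out = nn.Conv2d(base + in_ch, out_ch, 3, 1, 1)

    def forward(self, x):
        h = self.dec(self.enc(x))
        return self.out(torch.cat([h, x], dim=1))  # skip connection keeps detail

class ViTBranch(nn.Module):
    """Patch-embed the map and run a small Transformer encoder for global context."""
    def __init__(self, in_ch=1, dim=64, patch=16, img=128):
        super().__init__()
        self.embed = nn.Conv2d(in_ch, dim, patch, patch)  # patchify
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.n = img // patch

    def forward(self, x):
        t = self.embed(x).flatten(2).transpose(1, 2)      # (B, N, dim) token sequence
        t = self.encoder(t)
        g = t.transpose(1, 2).reshape(x.size(0), -1, self.n, self.n)
        return nn.functional.interpolate(g, size=x.shape[-2:])  # back to map size

class MCSTransWnetSketch(nn.Module):
    """W-net-style generator: fusion sub-generator followed by a refining U-net."""
    def __init__(self, img=128, n_params=8, dim=64):
        super().__init__()
        self.vit = ViTBranch(1, dim, 16, img)
        # Conditioning branch: project surgical parameters onto a spatial map.
        self.cond = nn.Sequential(nn.Linear(n_params, img * img), nn.ReLU())
        self.g1 = TinyUNet(1 + dim + 1, 1)  # sub-generator 1: fuse all three branches
        self.g2 = TinyUNet(1, 1)            # sub-generator 2: refine / denoise

    def forward(self, topo, params):
        c = self.cond(params).view(-1, 1, topo.size(2), topo.size(3))
        fused = torch.cat([topo, self.vit(topo), c], dim=1)
        return self.g2(self.g1(fused))

class PixelDiscriminator(nn.Module):
    """Pix2Pix 'pixel' discriminator: 1x1 convs give a per-pixel real/fake score."""
    def __init__(self, in_ch=2, base=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(base, 1, 1))

    def forward(self, pre, post):
        # Conditioned on the preoperative map, as in conditional Pix2Pix.
        return self.net(torch.cat([pre, post], dim=1))

# Hypothetical shapes: a batch of 2 preoperative maps plus 8 surgical parameters each.
topo = torch.randn(2, 1, 128, 128)
params = torch.randn(2, 8)
pred = MCSTransWnetSketch()(topo, params)   # -> (2, 1, 128, 128) predicted map
scores = PixelDiscriminator()(topo, pred)   # per-pixel real/fake scores
```

The two-stage (W-net) split mirrors the abstract's rationale: the first stage injects global ViT features and the parameter condition, and the second U-net cleans up the noise and geometric artifacts that transformer branches tend to introduce.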