GO-MAE: Self-supervised pre-training via masked autoencoder for OCT image classification of gynecology
Authors: Haoran Wang, Xinyu Guo, Kaiwen Song, Mingyang Sun, Yanbin Shao, Songfeng Xue, Hongwei Zhang, Tianyu Zhang
Journal: Neural Networks, volume 181, Article 106817 (impact factor 6.0; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2024-10-18
Publication type: Journal Article
DOI: 10.1016/j.neunet.2024.106817
URL: https://www.sciencedirect.com/science/article/pii/S089360802400741X
Citation count: 0
Abstract
Genitourinary syndrome of menopause (GSM) is a physiological disorder caused by reduced oestrogen levels in menopausal women. Its symptoms gradually worsen with age and prolonged menopausal status, gravely affecting patients’ quality of life and their physical and mental health. Optical coherence tomography (OCT) reduces the patient’s burden in clinical diagnosis through its noncontact, noninvasive tomographic imaging process, and supervised computer vision models applied to OCT images have yielded excellent results for disease diagnosis. However, manually labeling a large number of medical images is expensive and time-consuming. To this end, this paper proposes GO-MAE, a self-supervised pretraining framework for GSM OCT images based on the Masked Autoencoder (MAE). To the best of our knowledge, this is the first study to apply self-supervised learning methods to GSM disease screening. To address the semantic complexity and feature sparsity of GSM OCT images, the study makes two contributions. First, a dynamic masking strategy is introduced, tailored to the OCT characteristics relevant to downstream tasks; it reduces the interference of invalid features on the model and shortens training time. Second, in the encoder design of MAE, we propose a parallel convolutional neural network and transformer architecture (C&T), which aims to fuse the local and global representations of the relevant lesions interactively so that the model can learn richer differences in feature information without labels. Experimental results on the acquired GSM-OCT dataset show that GO-MAE yields significant improvements over existing state-of-the-art techniques. Furthermore, comparative experiments and visualizations verify the model’s advantages in robustness and interpretability, demonstrating its potential for screening GSM symptoms.
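The abstract does not say how the dynamic masking strategy selects patches, so the following is only a minimal sketch of the general idea, not the authors' method. It contrasts standard MAE random masking with a hypothetical variance-guided variant that tends to keep informative patches visible, on the assumption that flat background patches in an OCT image carry few useful features. The function names (`patchify`, `random_mask`, `variance_guided_mask`) and the variance criterion are illustrative assumptions.

```python
import numpy as np


def patchify(image, patch):
    """Split an (H, W) image into non-overlapping patch x patch tiles, each flattened."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    tiles = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return tiles.reshape(-1, patch * patch)  # shape: (num_patches, patch * patch)


def random_mask(num_patches, mask_ratio, rng):
    """Standard MAE masking: hide a uniformly random subset of patches."""
    num_keep = int(round(num_patches * (1 - mask_ratio)))
    order = rng.permutation(num_patches)
    mask = np.ones(num_patches, dtype=bool)  # True means the patch is masked
    mask[order[:num_keep]] = False
    return mask


def variance_guided_mask(patches, mask_ratio, rng, eps=1e-8):
    """Hypothetical 'dynamic' masking (an assumption, not taken from the paper):
    sample the visible patches with probability proportional to their intensity
    variance, so near-constant background patches are masked more often."""
    num_patches = patches.shape[0]
    num_keep = int(round(num_patches * (1 - mask_ratio)))
    score = patches.var(axis=1) + eps  # eps keeps every probability non-zero
    prob = score / score.sum()
    keep = rng.choice(num_patches, size=num_keep, replace=False, p=prob)
    mask = np.ones(num_patches, dtype=bool)
    mask[keep] = False
    return mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for an OCT B-scan: a textured band on a flat background.
    image = np.zeros((64, 64))
    image[16:32, :] = rng.normal(size=(16, 64))
    patches = patchify(image, patch=16)
    print("random mask:  ", random_mask(len(patches), 0.75, rng).astype(int))
    print("variance mask:", variance_guided_mask(patches, 0.75, rng).astype(int))
```

In both functions the encoder would see only the patches where the mask is `False`. Encoding a small visible subset is what makes MAE pretraining cheap, and biasing that subset toward informative regions is one plausible reading of how a masking strategy could reduce the influence of invalid features.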
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.