{"title":"基于卷积神经网络的视网膜疾病多模态分类。","authors":"Hongyi Pan, Jingpeng Miao, Jie Yu, Jingmin Li, Xiaobing Wang, Jihong Feng","doi":"10.1088/2057-1976/adeb92","DOIUrl":null,"url":null,"abstract":"<p><p>Retinal diseases such as age-related macular degeneration and diabetic retinopathy will lead to irreversible blindness without timely diagnosis and treatment. Optical coherence tomography (OCT) and optical coherence tomography angiography (OCTA) images provide complementary views of the retina, and the integration of the two imaging modalities can improve the accuracy of retinal disease classification. We propose a multi-modal classification model consisting of two branches to automatically diagnose retinal diseases, in which OCT and OCTA images are efficiently integrated to improve both the accuracy and efficiency of disease diagnosis. A bright line cropping is used to remove the useless black edge region while preserving the lesion features and reducing the calculation load. To solve the insufficient data issue, data enhancement and loose matching methods are adopted to increase the data amount. A two-step training method is used to train our proposed model, alleviating the limited training images. Our model is tested on an external test set instead of a training set, making the classification results more rigorous. The intermediate fusion and two-step training methods are adopted in our multiple classification model, achieving 0.9667, 0.9418, 0.8569, 0.9422, and 0.8921 in average accuracy, precision, recall, specificity, and F1-Score, respectively.
Our multi-modal model outperforms the single-modal model, the early, and late fusion multi-modal model in accuracy. Our model offers doctors less human error, lower cost, more uniform, and effective mass screening, thus providing a solution to improve deep learning performance in terms of a relatively fewer number of training data and even more imbalanced classes.
.</p>","PeriodicalId":8896,"journal":{"name":"Biomedical Physics & Engineering Express","volume":" ","pages":""},"PeriodicalIF":1.3000,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-modal Classification of Retinal Disease Based On Convolutional Neural Network.\",\"authors\":\"Hongyi Pan, Jingpeng Miao, Jie Yu, Jingmin Li, Xiaobing Wang, Jihong Feng\",\"doi\":\"10.1088/2057-1976/adeb92\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Retinal diseases such as age-related macular degeneration and diabetic retinopathy will lead to irreversible blindness without timely diagnosis and treatment. Optical coherence tomography (OCT) and optical coherence tomography angiography (OCTA) images provide complementary views of the retina, and the integration of the two imaging modalities can improve the accuracy of retinal disease classification. We propose a multi-modal classification model consisting of two branches to automatically diagnose retinal diseases, in which OCT and OCTA images are efficiently integrated to improve both the accuracy and efficiency of disease diagnosis. A bright line cropping is used to remove the useless black edge region while preserving the lesion features and reducing the calculation load. To solve the insufficient data issue, data enhancement and loose matching methods are adopted to increase the data amount. A two-step training method is used to train our proposed model, alleviating the limited training images. Our model is tested on an external test set instead of a training set, making the classification results more rigorous. The intermediate fusion and two-step training methods are adopted in our multiple classification model, achieving 0.9667, 0.9418, 0.8569, 0.9422, and 0.8921 in average accuracy, precision, recall, specificity, and F1-Score, respectively.
Our multi-modal model outperforms the single-modal model, the early, and late fusion multi-modal model in accuracy. Our model offers doctors less human error, lower cost, more uniform, and effective mass screening, thus providing a solution to improve deep learning performance in terms of a relatively fewer number of training data and even more imbalanced classes.
.</p>\",\"PeriodicalId\":8896,\"journal\":{\"name\":\"Biomedical Physics & Engineering Express\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":1.3000,\"publicationDate\":\"2025-07-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Biomedical Physics & Engineering Express\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1088/2057-1976/adeb92\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biomedical Physics & Engineering Express","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1088/2057-1976/adeb92","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
Multi-modal Classification of Retinal Disease Based On Convolutional Neural Network.
Retinal diseases such as age-related macular degeneration and diabetic retinopathy can lead to irreversible blindness without timely diagnosis and treatment. Optical coherence tomography (OCT) and optical coherence tomography angiography (OCTA) images provide complementary views of the retina, and integrating the two imaging modalities can improve the accuracy of retinal disease classification. We propose a two-branch multi-modal classification model that automatically diagnoses retinal diseases by efficiently integrating OCT and OCTA images, improving both the accuracy and the efficiency of diagnosis. A bright-line cropping step removes the uninformative black edge regions while preserving lesion features and reducing the computational load. To address the shortage of training data, data augmentation and loose matching are adopted to increase the amount of data. A two-step training strategy is used to train the proposed model, mitigating the effect of the limited number of training images. The model is evaluated on an external test set rather than on the training set, making the classification results more rigorous. Using intermediate fusion and two-step training, our multi-class model achieves an average accuracy, precision, recall, specificity, and F1-score of 0.9667, 0.9418, 0.8569, 0.9422, and 0.8921, respectively.
Our multi-modal model outperforms the single-modal model and the early- and late-fusion multi-modal models in accuracy. It offers clinicians mass screening with less human error, lower cost, and more uniform and effective results, and thus provides a way to improve deep-learning performance when the training data are relatively scarce and the classes are imbalanced.
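To make the fusion strategy concrete, the sketch below illustrates the kind of two-branch, intermediate-fusion design the abstract describes: each modality (OCT, OCTA) is encoded by its own convolutional branch, the two feature vectors are concatenated at an intermediate level, and a shared head produces the class prediction. The layer sizes, input resolution, and the `num_classes` value are assumptions for illustration only; the abstract does not specify the authors' actual architecture.

```python
# Minimal sketch of a two-branch CNN with intermediate (feature-level) fusion
# of OCT and OCTA images. Layer sizes and class count are assumptions, not
# the architecture reported in the paper.
import torch
import torch.nn as nn


def make_branch() -> nn.Sequential:
    """A small convolutional encoder applied to one imaging modality."""
    return nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.AdaptiveAvgPool2d(1),   # -> (N, 32, 1, 1)
        nn.Flatten(),              # -> (N, 32)
    )


class IntermediateFusionNet(nn.Module):
    """Two modality-specific branches whose features are concatenated
    (intermediate fusion) and passed to a shared classification head."""

    def __init__(self, num_classes: int = 4):  # num_classes is hypothetical
        super().__init__()
        self.oct_branch = make_branch()
        self.octa_branch = make_branch()
        self.classifier = nn.Sequential(
            nn.Linear(32 + 32, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, oct_img: torch.Tensor, octa_img: torch.Tensor) -> torch.Tensor:
        f_oct = self.oct_branch(oct_img)
        f_octa = self.octa_branch(octa_img)
        fused = torch.cat([f_oct, f_octa], dim=1)  # feature-level fusion
        return self.classifier(fused)


if __name__ == "__main__":
    model = IntermediateFusionNet()
    oct_img = torch.randn(2, 1, 224, 224)    # dummy OCT batch
    octa_img = torch.randn(2, 1, 224, 224)   # dummy OCTA batch
    logits = model(oct_img, octa_img)
    print(logits.shape)  # torch.Size([2, 4])
```

One common way to realize the two-step training mentioned in the abstract (assumed here, not detailed in the source) is to first train each branch on its own modality and then fine-tune the fused network end to end.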
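The averaged accuracy, precision, recall, specificity, and F1-score quoted above are standard multi-class metrics. The sketch below shows one way to macro-average them from a confusion matrix (one-vs-rest per class); the exact averaging scheme used by the authors is not stated in the abstract, so this is an illustrative assumption rather than their evaluation code.

```python
# Macro-averaged accuracy, precision, recall, specificity and F1 from a
# multi-class confusion matrix (one-vs-rest per class). The averaging scheme
# is an assumption; the abstract does not state how the averages were formed.
import numpy as np


def macro_metrics(cm: np.ndarray) -> dict:
    """cm[i, j] = number of samples with true class i predicted as class j."""
    cm = cm.astype(float)
    total = cm.sum()
    acc, prec, rec, spec, f1 = [], [], [], [], []
    for k in range(cm.shape[0]):
        tp = cm[k, k]
        fn = cm[k, :].sum() - tp
        fp = cm[:, k].sum() - tp
        tn = total - tp - fn - fp
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        acc.append((tp + tn) / total)
        prec.append(p)
        rec.append(r)
        spec.append(tn / (tn + fp) if tn + fp else 0.0)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
    return {"accuracy": np.mean(acc), "precision": np.mean(prec),
            "recall": np.mean(rec), "specificity": np.mean(spec),
            "f1": np.mean(f1)}


if __name__ == "__main__":
    # Hypothetical 3-class confusion matrix, not data from the paper.
    cm = np.array([[50, 3, 2],
                   [4, 45, 6],
                   [1, 5, 40]])
    print(macro_metrics(cm))
```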
Journal introduction:
BPEX is an inclusive, international, multidisciplinary journal devoted to publishing new research on any application of physics and/or engineering in medicine and/or biology. The journal is characterized by broad geographical coverage and a fast-track peer-review process; relevant topics include all aspects of biophysics, medical physics and biomedical engineering. Papers that are almost entirely clinical or biological in their focus are not suitable. The journal has an emphasis on publishing interdisciplinary work and bringing research fields together, encompassing experimental, theoretical and computational work.