Elizabeth E. Hwang PhD, Dake Chen PhD, Ying Han MD, PhD, Lin Jia PhD, Jing Shan MD, PhD

Ophthalmology Science, Volume 5, Issue 3, Article 100703. Published January 3, 2025. DOI: 10.1016/j.xops.2025.100703
Utilization of Image-Based Deep Learning in Multimodal Glaucoma Detection Neural Network from a Primary Patient Cohort
Purpose
To develop a clinically motivated multimodal neural network glaucoma detection model trained on minimally processed imaging data of time-matched multimodal testing including fundus photographs, OCT scans, and Humphrey visual field (HVF) analysis.
Design
Evaluation of a diagnostic technology.
Subjects
A total of 716 encounters with time-matched fundus photographs, OCT optic nerve imaging, and HVF testing from 706 eyes (557 nonglaucomatous, 149 glaucomatous) from 571 individual patients seen at a tertiary medical center and 4 external single-modality (fundus photograph and OCT) datasets.
Methods
A multimodal neural network model was developed consisting of 2 main components: 3 convolutional neural networks that extract semantic features and generate an embedding for each respective modality, followed by a multilayer perceptron that integrates the individual embeddings and produces a predicted label, glaucomatous or nonglaucomatous.
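The paper does not publish its implementation; a minimal sketch of such a two-stage architecture (per-modality CNN encoders feeding an MLP fusion head) might look like the following PyTorch module. All layer sizes, the embedding dimension, and the assumption that all 3 modalities enter as 3-channel images are hypothetical illustrations, not the authors' design:

```python
import torch
import torch.nn as nn

class ImageEncoder(nn.Module):
    """Small CNN mapping one imaging modality to a fixed-length embedding."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> (B, 32, 1, 1)
        )
        self.proj = nn.Linear(32, emb_dim)

    def forward(self, x):
        return self.proj(self.features(x).flatten(1))  # (B, emb_dim)

class MultimodalGlaucomaNet(nn.Module):
    """Three per-modality encoders followed by an MLP that fuses the
    concatenated embeddings into a single glaucoma logit."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.fundus_enc = ImageEncoder(emb_dim)
        self.oct_enc = ImageEncoder(emb_dim)
        self.hvf_enc = ImageEncoder(emb_dim)
        self.head = nn.Sequential(
            nn.Linear(3 * emb_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),  # one logit: glaucomatous vs. nonglaucomatous
        )

    def forward(self, fundus, oct_img, hvf):
        z = torch.cat([self.fundus_enc(fundus),
                       self.oct_enc(oct_img),
                       self.hvf_enc(hvf)], dim=1)
        return self.head(z)
```

Late fusion of this kind lets each encoder be pretrained or evaluated as a single-modality model, then combined by training only the fusion head.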
Main Outcome Measures
Single and multimodal performances were evaluated on the internal test set using the area under the receiver operating characteristic curve (AUC), accuracy, recall, and specificity. Fundus photograph and OCT single-modality neural networks were additionally evaluated on external datasets using these metrics.
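The four reported metrics can be computed from scratch for a binary classifier. A minimal illustration in plain Python, using toy labels and scores (not study data):

```python
def roc_auc(y_true, scores):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic):
    the fraction of positive/negative pairs the positive example outscores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def classification_metrics(y_true, y_pred):
    """Accuracy, recall (sensitivity), and specificity from binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn),       # glaucomatous eyes correctly flagged
        "specificity": tn / (tn + fp),  # nonglaucomatous eyes correctly cleared
    }

# Toy example: 2 nonglaucomatous (0) and 2 glaucomatous (1) eyes
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(roc_auc(y_true, scores))              # 0.75
print(classification_metrics(y_true, y_pred))
```

Note that AUC is threshold-free (it depends only on score ranking), whereas accuracy, recall, and specificity depend on the chosen decision threshold (0.5 here).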
Results
Our results show that single-modality models that perform well on curated training datasets perform poorly on our primary clinical dataset. Multimodal integration, however, notably improved performance (AUC: 0.86 for the multimodal model vs. 0.57 to 0.74 for the single-modality models; specificity: 0.85 vs. 0.77 to 0.82), suggesting that a holistic approach considering both structural and functional data may enhance the accuracy of artificial intelligence (AI) models.
Conclusions
Clinical implementation of deep learning models for glaucoma detection benefits from multimodal integration, and we demonstrate this approach on a real-world clinical cohort to obtain a production-level AI solution for glaucoma diagnosis.
Financial Disclosure(s)
Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.