Diagnostic report generation for macular diseases by natural language processing algorithms
Xufeng Zhao, Chunshi Li, Jingyuan Yang, Xingwang Gu, Bing Li, Yuelin Wang, Bi-Lei Zhang, Xirong Li, Jianchun Zhao, Jie Wang, Weihong Yu
British Journal of Ophthalmology (impact factor 3.7; JCR Q1, Ophthalmology). Published 2025-05-10.
DOI: 10.1136/bjo-2024-326064 (https://doi.org/10.1136/bjo-2024-326064)
Citations: 0
Abstract
AIMS
To investigate rule-based and deep learning (DL)-based methods for automatically generating natural language diagnostic reports for macular diseases.
METHODS
This diagnostic study collected ophthalmic images of 2261 eyes from 1303 patients. Colour fundus photographs and optical coherence tomography images were obtained. Eyes without retinal diseases as well as eyes diagnosed with one of four macular diseases were included. For each eye, a diagnostic report was written in a format consisting of lesion descriptions, diagnoses and recommendations. Subsequently, a rule-based natural language processing (NLP) system and a DL-based NLP system were developed to automatically generate diagnostic reports. To assess the effectiveness of these models, two junior ophthalmologists independently wrote diagnostic reports for the collected images. A questionnaire was designed, and two retina specialists used it to grade each report's readability, correctness of diagnosis, lesion description and recommendations.
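The three-part report format described above (lesion descriptions, diagnoses, recommendations) lends itself to a simple rule-based generator. The following is only a minimal sketch of that general idea, not the authors' system: the abstract does not name the four diseases, the lesion vocabulary or the actual rules, so `Rule`, `generate_report`, `RULES` and every lesion, disease and recommendation string below are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: if every required lesion is present, emit this diagnosis."""
    required_lesions: frozenset
    diagnosis: str
    recommendation: str


def generate_report(lesions, rules):
    """Build a three-part report: lesion description, diagnosis, recommendation."""
    found = set(lesions)
    # A rule fires when all of its required lesions were detected.
    matched = [r for r in rules if r.required_lesions <= found]

    if found:
        description = "Lesion description: " + ", ".join(sorted(found)) + "."
    else:
        description = "Lesion description: no lesions detected."

    if matched:
        diagnosis = "Diagnosis: " + "; ".join(r.diagnosis for r in matched) + "."
        recommendation = "Recommendation: " + " ".join(r.recommendation for r in matched)
    elif found:
        # Lesions present but no rule fired: defer rather than guess.
        diagnosis = "Diagnosis: undetermined."
        recommendation = "Recommendation: manual review."
    else:
        diagnosis = "Diagnosis: no retinal disease."
        recommendation = "Recommendation: none."
    return "\n".join([description, diagnosis, recommendation])


# Placeholder rules; the real lesion vocabulary and rule set are not given in the abstract.
RULES = [
    Rule(frozenset({"lesion_a"}), "disease_1", "Placeholder advice for disease_1."),
    Rule(frozenset({"lesion_b", "lesion_c"}), "disease_2", "Placeholder advice for disease_2."),
]

print(generate_report(["lesion_a"], RULES))
```

In this sketch the lesion list would come from an upstream image-analysis step, and the generator itself is a deterministic lookup, so a wrong diagnosis in the report can be traced to a specific rule or to the upstream lesion detection.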
RESULTS
The rule-based NLP reports achieved higher grades than those of junior ophthalmologists in correctness of diagnosis (9.13±1.52 vs 9.03±1.42 points) and recommendations (8.55±2.74 vs 8.50±2.53 points). The DL-based NLP reports received slightly lower grades than those of junior ophthalmologists in lesion description (8.82±1.84 vs 9.12±1.20 points, p<0.05), correctness of diagnosis (8.72±2.36 vs 9.08±1.55 points, p<0.05) and recommendations (8.81±2.52 vs 9.15±1.65 points, p<0.05). For readability, the DL-based reports scored marginally higher than those of junior ophthalmologists (9.98±0.17 vs 9.94±0.25 points), although the difference was not statistically significant (p=0.094).
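To make the size of these gaps easier to compare, the reported means can be tabulated as differences (system minus junior ophthalmologists). This is plain arithmetic on the figures quoted above; it does not recompute the standard deviations or p-values, which would need the per-report scores that the abstract does not provide. The names `REPORTED_MEANS`, `mean_differences` and `DIFFS` are illustrative.

```python
# (system, criterion) -> (system mean, junior ophthalmologist mean), as reported above
REPORTED_MEANS = {
    ("rule-based", "correctness of diagnosis"): (9.13, 9.03),
    ("rule-based", "recommendations"): (8.55, 8.50),
    ("DL-based", "lesion description"): (8.82, 9.12),
    ("DL-based", "correctness of diagnosis"): (8.72, 9.08),
    ("DL-based", "recommendations"): (8.81, 9.15),
    ("DL-based", "readability"): (9.98, 9.94),
}


def mean_differences(means):
    """Return system-minus-junior differences, rounded to two decimals."""
    return {key: round(system - junior, 2) for key, (system, junior) in means.items()}


DIFFS = mean_differences(REPORTED_MEANS)
for (system, criterion), diff in DIFFS.items():
    print(f"{system:10s} {criterion:26s} {diff:+.2f}")
```

All six differences are smaller than 0.4 points in absolute value; the largest is the DL-based system's 0.36-point deficit in correctness of diagnosis.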
CONCLUSIONS
The multimodal AI system, coupled with the NLP algorithm, demonstrated competence in generating reports for four macular diseases when compared with junior ophthalmologists.
About the journal:
The British Journal of Ophthalmology (BJO) is an international peer-reviewed journal for ophthalmologists and visual science specialists. BJO publishes clinical investigations, clinical observations, and clinically relevant laboratory investigations related to ophthalmology. It also provides major reviews and publishes manuscripts covering regional issues in a global context.