Inter-rater reliability in labeling quality and pathological features of retinal OCT scans: A customized annotation software approach

Katherine Du, Stavan Shah, Sandeep Chandra Bollepalli, Mohammed Nasar Ibrahim, Adarsh Gadari, Shan Sutharahan, José-Alain Sahel, Jay Chhablani, Kiran Kumar Vupparaboina

PLoS ONE 19(12): e0314707. Published 2024-12-18. DOI: 10.1371/journal.pone.0314707
Objectives: Various imaging features on optical coherence tomography (OCT) are crucial for identifying and defining disease progression. Establishing a consensus on these imaging features is essential, particularly for training deep learning models for disease classification. This study aims to analyze the inter-rater reliability in labeling the quality and common imaging signatures of retinal OCT scans.
Methods: Five hundred OCT scans obtained from CIRRUS HD-OCT 5000 devices were displayed at 512 × 1024 × 128 resolution in customizable, in-house annotation software. Each patient's eye was represented by 16 random scans. Two masked reviewers independently labeled the quality and specific pathological features of each scan. Evaluated features included overall image quality, presence of the fovea, and disease signatures including subretinal fluid (SRF), intraretinal fluid (IRF), drusen, pigment epithelial detachment (PED), and hyperreflective material. Raw percentage agreement and Cohen's kappa (κ) coefficient were used to evaluate concordance between the two sets of labels.
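For context, a minimal sketch of the two agreement statistics named above, computed over hypothetical binary labels from two raters (the label vectors below are illustrative only, not the study's data):

```python
# Minimal sketch: raw percent agreement and Cohen's kappa for two raters.
# The label vectors below are hypothetical examples, not the study's data.

def percent_agreement(r1, r2):
    """Fraction of scans on which both raters assigned the same label."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohen_kappa(r1, r2):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e is the agreement expected by chance from each
    rater's marginal label frequencies."""
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    labels = set(r1) | set(r2)
    p_e = sum((r1.count(k) / n) * (r2.count(k) / n) for k in labels)
    return (p_o - p_e) / (1 - p_e)  # undefined when p_e == 1

# Example: presence (1) / absence (0) of a feature as labeled by two raters.
rater1 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
rater2 = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]
print(f"percent agreement = {percent_agreement(rater1, rater2):.2f}")  # 0.80
print(f"Cohen's kappa     = {cohen_kappa(rater1, rater2):.2f}")        # 0.58
```

Kappa discounts the agreement the raters would reach by guessing alone, which is why it runs lower than raw percent agreement on the same labels.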
Results: Our analysis revealed κ = 0.60 for the inter-rater reliability of overall scan quality, indicating substantial agreement. In contrast, there was only slight agreement in determining the cause of poor image quality (κ = 0.18). The binary determination of the presence or absence of any retinal disease signature showed almost complete agreement between reviewers (κ = 0.85). Specific features, including the foveal location of the scan (κ = 0.78) and the pathologies IRF (κ = 0.63), drusen (κ = 0.73), and PED (κ = 0.87), exhibited substantial concordance. However, less agreement was found in identifying SRF (κ = 0.52), hyperreflective dots (κ = 0.41), and hyperreflective foci (κ = 0.33).
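The verbal descriptors used above ("slight", "substantial", and so on) follow a common convention for interpreting κ. The sketch below uses the Landis and Koch (1977) bands, one widely used mapping; the exact cut-offs vary across sources and are not taken from the paper:

```python
def interpret_kappa(kappa):
    """Map a kappa value to the Landis & Koch (1977) descriptive bands.
    One common convention; cut-offs vary slightly across sources."""
    bands = [(0.00, "poor"), (0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"  # guard against values just above 1.0 from rounding

# A few of the reported values from the Results, for illustration.
for k in (0.18, 0.41, 0.78, 0.85):
    print(f"kappa = {k:.2f}: {interpret_kappa(k)}")
```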
Conclusions: Our study demonstrates substantial inter-rater reliability in labeling the quality and retinal pathologies of OCT scans. While some features show stronger agreement than others, these standardized labels can be used to build automated machine-learning tools that diagnose retinal diseases and capture valuable pathological features in each scan. This standardization will aid the consistency of medical diagnoses and enhance the accessibility of OCT diagnostic tools.
Journal introduction:
PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open access: freely accessible online; authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage