{"title":"Probing clarity: AI-generated simplified breast imaging reports for enhanced patient comprehension powered by ChatGPT-4o.","authors":"Roberto Maroncelli, Veronica Rizzo, Marcella Pasculli, Federica Cicciarelli, Massimo Macera, Francesca Galati, Carlo Catalano, Federica Pediconi","doi":"10.1186/s41747-024-00526-1","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>To assess the reliability and comprehensibility of breast radiology reports simplified by artificial intelligence using the large language model (LLM) ChatGPT-4o.</p><p><strong>Methods: </strong>A radiologist with 20 years' experience selected 21 anonymized breast radiology reports, 7 mammography, 7 breast ultrasound, and 7 breast magnetic resonance imaging (MRI), categorized according to breast imaging reporting and data system (BI-RADS). These reports underwent simplification by prompting ChatGPT-4o with \"Explain this medical report to a patient using simple language\". Five breast radiologists assessed the quality of these simplified reports for factual accuracy, completeness, and potential harm with a 5-point Likert scale from 1 (strongly agree) to 5 (strongly disagree). Another breast radiologist evaluated the text comprehension of five non-healthcare personnel readers using a 5-point Likert scale from 1 (excellent) to 5 (poor). Descriptive statistics, Cronbach's α, and the Kruskal-Wallis test were used.</p><p><strong>Results: </strong>Mammography, ultrasound, and MRI showed high factual accuracy (median 2) and completeness (median 2) across radiologists, with low potential harm scores (median 5); no significant group differences (p ≥ 0.780), and high internal consistency (α > 0.80) were observed. Non-healthcare readers showed high comprehension (median 2 for mammography and MRI and 1 for ultrasound); no significant group differences across modalities (p = 0.368), and high internal consistency (α > 0.85) were observed. BI-RADS 0, 1, and 2 reports were accurately explained, while BI-RADS 3-6 reports were challenging.</p><p><strong>Conclusion: </strong>The model demonstrated reliability and clarity, offering promise for patients with diverse backgrounds. LLMs like ChatGPT-4o could simplify breast radiology reports, aid in communication, and enhance patient care.</p><p><strong>Relevance statement: </strong>Simplified breast radiology reports generated by ChatGPT-4o show potential in enhancing communication with patients, improving comprehension across varying educational backgrounds, and contributing to patient-centered care in radiology practice.</p><p><strong>Key points: </strong>AI simplifies complex breast imaging reports, enhancing patient understanding. Simplified reports from AI maintain accuracy, improving patient comprehension significantly. 
Implementing AI reports enhances patient engagement and communication in breast imaging.</p>","PeriodicalId":36926,"journal":{"name":"European Radiology Experimental","volume":null,"pages":null},"PeriodicalIF":3.7000,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11525358/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"European Radiology Experimental","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1186/s41747-024-00526-1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
Abstract
Background: To assess the reliability and comprehensibility of breast radiology reports simplified by artificial intelligence using the large language model (LLM) ChatGPT-4o.
Methods: A radiologist with 20 years of experience selected 21 anonymized breast radiology reports (7 mammography, 7 breast ultrasound, and 7 breast magnetic resonance imaging (MRI)), categorized according to the Breast Imaging Reporting and Data System (BI-RADS). These reports were simplified by prompting ChatGPT-4o with "Explain this medical report to a patient using simple language". Five breast radiologists rated the quality of the simplified reports for factual accuracy, completeness, and potential harm on a 5-point Likert scale from 1 (strongly agree) to 5 (strongly disagree). Another breast radiologist evaluated text comprehension among five non-healthcare readers on a 5-point Likert scale from 1 (excellent) to 5 (poor). Descriptive statistics, Cronbach's α, and the Kruskal-Wallis test were used.
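As an illustration of the simplification step, the sketch below shows how a report could be submitted to GPT-4o with the prompt quoted above. This is a minimal example assuming the OpenAI Python SDK and an API key in the OPENAI_API_KEY environment variable; the paper does not state how the model was accessed, so the client setup and the `simplify_report` function name are assumptions, not the authors' implementation.

```python
# Minimal sketch of the report-simplification step (assumed setup, not the study's code).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def simplify_report(report_text: str) -> str:
    """Ask GPT-4o to rewrite a radiology report in plain language,
    using the prompt quoted in the paper."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": (
                    "Explain this medical report to a patient using simple language.\n\n"
                    + report_text
                ),
            }
        ],
    )
    return response.choices[0].message.content

# Example usage with a placeholder report (not from the study):
# print(simplify_report("Mammography: scattered fibroglandular densities. BI-RADS 1."))
```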
Results: Mammography, ultrasound, and MRI reports showed high factual accuracy (median 2) and completeness (median 2) across radiologists, with low potential-harm scores (median 5); no significant differences between groups were observed (p ≥ 0.780), and internal consistency was high (α > 0.80). Non-healthcare readers showed high comprehension (median 2 for mammography and MRI, 1 for ultrasound); no significant differences across modalities were observed (p = 0.368), and internal consistency was high (α > 0.85). BI-RADS 0, 1, and 2 reports were accurately explained, whereas BI-RADS 3-6 reports remained challenging.
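The statistics reported above (Cronbach's α for internal consistency and the Kruskal-Wallis test for group differences) can in principle be computed with standard tools. The sketch below is illustrative only: the rating matrices are hypothetical random placeholders, not the study data, and the analysis layout (reports in rows, raters in columns) is an assumption.

```python
# Illustrative sketch of the reported statistics, not the authors' analysis code:
# Cronbach's alpha for internal consistency and the Kruskal-Wallis test for group
# differences across modalities. The rating matrices are hypothetical placeholders.
import numpy as np
from scipy.stats import kruskal

def cronbach_alpha(ratings: np.ndarray) -> float:
    """Cronbach's alpha; rows = reports, columns = raters."""
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]                               # number of raters
    item_var = ratings.var(axis=0, ddof=1).sum()       # sum of per-rater variances
    total_var = ratings.sum(axis=1).var(ddof=1)        # variance of per-report totals
    return (k / (k - 1)) * (1 - item_var / total_var)

rng = np.random.default_rng(0)
# Hypothetical 1-5 Likert accuracy scores: 7 reports x 5 radiologists per modality
mammo = rng.integers(1, 6, size=(7, 5))
us = rng.integers(1, 6, size=(7, 5))
mri = rng.integers(1, 6, size=(7, 5))

print("Cronbach's alpha (mammography):", round(cronbach_alpha(mammo), 2))

# Kruskal-Wallis test on per-report median scores across the three modalities
stat, p = kruskal(np.median(mammo, axis=1), np.median(us, axis=1), np.median(mri, axis=1))
print(f"Kruskal-Wallis H = {stat:.2f}, p = {p:.3f}")
```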
Conclusion: The model demonstrated reliability and clarity, offering promise for patients with diverse backgrounds. LLMs like ChatGPT-4o could simplify breast radiology reports, aid in communication, and enhance patient care.
Relevance statement: Simplified breast radiology reports generated by ChatGPT-4o show potential in enhancing communication with patients, improving comprehension across varying educational backgrounds, and contributing to patient-centered care in radiology practice.
Key points: AI simplifies complex breast imaging reports, enhancing patient understanding. Simplified reports from AI maintain accuracy, improving patient comprehension significantly. Implementing AI reports enhances patient engagement and communication in breast imaging.