Assessing ChatGPT responses to patient questions on epidural steroid injections: A comparative study of general vs specific queries

Timothy Olivier, Zilin Ma, Ankit Patel, Weibin Shi, Mohammed Murtuza, Nicole E. Hatchard, Xiaoyu Norman Pan, Thiru M. Annaswamy

Interventional Pain Medicine, Volume 4, Issue 2, Article 100592. Published 2025-05-26. DOI: 10.1016/j.inpm.2025.100592
Abstract
Background
Artificial intelligence (AI) is becoming more integrated into healthcare, with large language models (LLMs) such as ChatGPT being widely used by patients to answer medical questions. Given the increasing reliance on AI for health-related information, it is important to evaluate how well these models address common patient concerns, especially in procedural medicine. To date, no studies have specifically examined AI's role in answering patient questions about epidural steroid injections (ESIs), making this an important area for investigation.
Objective
This study examines ChatGPT's ability to answer patient questions about epidural steroid injections (ESIs), focusing on response accuracy, readability, and overall usefulness. Our aim was to evaluate and compare the content, accuracy, and user-friendliness of AI-generated information on common peri-procedural questions and complications associated with ESIs, thereby extending the application of AI as a triage tool into pain management and interventional spine procedures.
Methods
We compiled 29 common patient questions about ESIs and tested ChatGPT's responses in both general and specific formats. Two interventional pain specialists reviewed the AI-generated answers, rating them for accuracy, clarity, empathy, and directness on a Likert scale. Readability was calculated using the Flesch-Kincaid Grade Level and Flesch Reading Ease scales. Statistical analyses were performed to compare general versus specific responses.
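Both readability metrics named above have published closed-form definitions based on words per sentence and syllables per word. As an illustration only (not the study's actual scoring pipeline, which is not described in the abstract), a minimal Python sketch using a heuristic vowel-group syllable counter might look like:

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count contiguous vowel groups; dedicated
    # readability tools use pronunciation dictionaries instead.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1  # treat a trailing 'e' as silent
    return max(n, 1)

def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)          # words per sentence
    spw = syllables / len(words)               # syllables per word
    ease = 206.835 - 1.015 * wps - 84.6 * spw  # Flesch Reading Ease
    grade = 0.39 * wps + 11.8 * spw - 15.59    # Flesch-Kincaid Grade Level
    return ease, grade
```

Higher Reading Ease scores indicate easier text, while the Grade Level maps the same two ratios onto US school grades, which is why the two scales tend to move in opposite directions on the same response.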
Results
General queries led to longer, more detailed responses, but readability was similar between general and specific formats. Subjective analysis showed that general responses were rated higher for accuracy, clarity, and responsiveness. However, neither format demonstrated strong empathy, and some general queries resulted in off-topic responses, underscoring the importance of precise wording when interacting with AI.
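Comparing paired Likert ratings of the same 29 questions under two prompt formats calls for a paired nonparametric test. The abstract does not name the tests the authors used; as a hedged stand-in, a two-sided sign test on the per-question rating pairs can be written from the standard library alone:

```python
from math import comb

def sign_test(general: list[int], specific: list[int]) -> float:
    """Two-sided sign test on paired Likert ratings (ties dropped).

    Under H0 (no format effect), each non-tied pair is equally likely
    to favor either format, so the count of pairs favoring 'general'
    follows Binomial(n, 0.5).
    """
    diffs = [g - s for g, s in zip(general, specific) if g != s]
    n = len(diffs)
    k = sum(d > 0 for d in diffs)  # pairs where 'general' rated higher
    tail = min(k, n - k)
    # Double the smaller binomial tail for a two-sided p-value.
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(p, 1.0)
```

A rank-based alternative such as the Wilcoxon signed-rank test would use the magnitude of the rating differences as well, at the cost of stronger assumptions about the ordinal scale; the sign test shown here uses only the direction of each difference.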
Conclusion
ChatGPT can provide clear and largely accurate answers to patient questions about ESIs, with general prompts often producing more complete responses. However, AI-generated content still has limitations, particularly in conveying empathy and avoiding tangential information. These findings highlight the need for thoughtful prompt design and further research into how AI can be integrated into clinical workflows while ensuring accuracy and patient safety.