Prem Patel MD, Allison Bigeh DO, Benjamin Romer MD, Shantanu Dev BS, Samar Binkheder PhD, Lang Li PhD, Weidan Cao PhD, M. Wesley Milks MD
{"title":"chatgpt在回答患者预防性心脏病学问题中的质量和可读性:一项初步研究","authors":"Prem Patel MD , Allison Bigeh DO , Benjamin Romer MD , Shantanu Dev BS , Samar Binkheder PhD , Lang Li PhD , Weidan Cao PhD , M. Wesley Milks MD","doi":"10.1016/j.ajpc.2025.101170","DOIUrl":null,"url":null,"abstract":"<div><h3>Therapeutic Area</h3><div>Other</div></div><div><h3>Background</h3><div>As artificial intelligence (AI) becomes increasingly integrated into healthcare, ChatGPT has emerged as a promising tool for patient education. However, research on its suitability for preventive cardiology remains limited. With patients increasingly relying on online health information, it is essential that content is both scientifically accurate and accessible to individuals of all health literacy levels. This study evaluates the quality and readability of ChatGPT’s responses to common questions on lifestyle modification, women’s cardiovascular health, and cholesterol management.</div></div><div><h3>Methods</h3><div>Twenty-six questions (8 on lifestyle modifications, 8 on women’s cardiovascular health, and 10 on cholesterol management) were queried using the GPT-4 model. Responses were independently evaluated by three board-certified preventive cardiologists, referencing the latest national cardiovascular guidelines. Quality was assessed using a 5-point Likert scale for correctness, comprehensiveness, conciseness, and comprehensibility, previously employed in medical AI research. Readability was analyzed using the Flesch-Kincaid Grade Level and other standardized readability metrics.</div></div><div><h3>Results</h3><div>ChatGPT provided adequate responses to 88.4% (23/26) of questions, with mean (SE) scores of 3.71 ± 0.20 for correctness, 4.06 ± 0.14 for conciseness, 4.06 ± 0.13 for comprehensiveness, and 4.40 ± 0.10 for comprehensibility. The highest-scoring topic was lifestyle modification (84.4%), followed by cholesterol management (81.2%) and women’s cardiovascular health (77.8%). Among inadequate responses, key limitations included overstating the risks of low LDL cholesterol and exaggerating the benefits of estrogen replacement therapy (ERT) for postmenopausal CVD risk reduction. ChatGPT also provided unsupported recommendations regarding dietary supplements for CVD prevention. Readability analysis revealed responses at a 13th-grade level, exceeding the recommended 6th-grade level for patient education.</div></div><div><h3>Conclusions</h3><div>ChatGPT’s responses were generally suitable for topics such as heart-healthy diets, exercise, weight management, epidemiology and clinical presentation of CVD in women, postmenopausal CVD risk, cholesterol-lowering therapy, statin-associated side effects, and Lp(a) risk stratification. However, inaccuracies persisted in dietary supplementation, ERT, and very low LDL levels. Enhancements in AI training are needed to improve accuracy in these areas. 
Additionally, the high readability level limits accessibility for the general public, underscoring the need for optimization to ensure clear and reliable patient education.</div></div>","PeriodicalId":72173,"journal":{"name":"American journal of preventive cardiology","volume":"23 ","pages":"Article 101170"},"PeriodicalIF":5.9000,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"QUALITY AND READABILITY OF CHATGPT IN ANSWERING PATIENTS’ PREVENTIVE CARDIOLOGY QUESTIONS: A PILOT STUDY\",\"authors\":\"Prem Patel MD , Allison Bigeh DO , Benjamin Romer MD , Shantanu Dev BS , Samar Binkheder PhD , Lang Li PhD , Weidan Cao PhD , M. Wesley Milks MD\",\"doi\":\"10.1016/j.ajpc.2025.101170\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Therapeutic Area</h3><div>Other</div></div><div><h3>Background</h3><div>As artificial intelligence (AI) becomes increasingly integrated into healthcare, ChatGPT has emerged as a promising tool for patient education. However, research on its suitability for preventive cardiology remains limited. With patients increasingly relying on online health information, it is essential that content is both scientifically accurate and accessible to individuals of all health literacy levels. This study evaluates the quality and readability of ChatGPT’s responses to common questions on lifestyle modification, women’s cardiovascular health, and cholesterol management.</div></div><div><h3>Methods</h3><div>Twenty-six questions (8 on lifestyle modifications, 8 on women’s cardiovascular health, and 10 on cholesterol management) were queried using the GPT-4 model. Responses were independently evaluated by three board-certified preventive cardiologists, referencing the latest national cardiovascular guidelines. Quality was assessed using a 5-point Likert scale for correctness, comprehensiveness, conciseness, and comprehensibility, previously employed in medical AI research. Readability was analyzed using the Flesch-Kincaid Grade Level and other standardized readability metrics.</div></div><div><h3>Results</h3><div>ChatGPT provided adequate responses to 88.4% (23/26) of questions, with mean (SE) scores of 3.71 ± 0.20 for correctness, 4.06 ± 0.14 for conciseness, 4.06 ± 0.13 for comprehensiveness, and 4.40 ± 0.10 for comprehensibility. The highest-scoring topic was lifestyle modification (84.4%), followed by cholesterol management (81.2%) and women’s cardiovascular health (77.8%). Among inadequate responses, key limitations included overstating the risks of low LDL cholesterol and exaggerating the benefits of estrogen replacement therapy (ERT) for postmenopausal CVD risk reduction. ChatGPT also provided unsupported recommendations regarding dietary supplements for CVD prevention. Readability analysis revealed responses at a 13th-grade level, exceeding the recommended 6th-grade level for patient education.</div></div><div><h3>Conclusions</h3><div>ChatGPT’s responses were generally suitable for topics such as heart-healthy diets, exercise, weight management, epidemiology and clinical presentation of CVD in women, postmenopausal CVD risk, cholesterol-lowering therapy, statin-associated side effects, and Lp(a) risk stratification. However, inaccuracies persisted in dietary supplementation, ERT, and very low LDL levels. Enhancements in AI training are needed to improve accuracy in these areas. 
Additionally, the high readability level limits accessibility for the general public, underscoring the need for optimization to ensure clear and reliable patient education.</div></div>\",\"PeriodicalId\":72173,\"journal\":{\"name\":\"American journal of preventive cardiology\",\"volume\":\"23 \",\"pages\":\"Article 101170\"},\"PeriodicalIF\":5.9000,\"publicationDate\":\"2025-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"American journal of preventive cardiology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2666667725002454\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"CARDIAC & CARDIOVASCULAR SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"American journal of preventive cardiology","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666667725002454","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CARDIAC & CARDIOVASCULAR SYSTEMS","Score":null,"Total":0}
QUALITY AND READABILITY OF CHATGPT IN ANSWERING PATIENTS’ PREVENTIVE CARDIOLOGY QUESTIONS: A PILOT STUDY
Therapeutic Area
Other
Background
As artificial intelligence (AI) becomes increasingly integrated into healthcare, ChatGPT has emerged as a promising tool for patient education. However, research on its suitability for preventive cardiology remains limited. With patients increasingly relying on online health information, it is essential that such content be both scientifically accurate and accessible to individuals of all health literacy levels. This study evaluates the quality and readability of ChatGPT’s responses to common questions on lifestyle modification, women’s cardiovascular health, and cholesterol management.
Methods
Twenty-six questions (8 on lifestyle modification, 8 on women’s cardiovascular health, and 10 on cholesterol management) were posed to the GPT-4 model. Responses were independently evaluated by three board-certified preventive cardiologists against the latest national cardiovascular guidelines. Quality was rated on a 5-point Likert scale previously employed in medical AI research, covering four dimensions: correctness, comprehensiveness, conciseness, and comprehensibility. Readability was analyzed using the Flesch-Kincaid Grade Level and other standardized readability metrics.
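The abstract does not specify which software computed the readability metrics; as a minimal sketch, the Flesch-Kincaid Grade Level can be estimated from sentence, word, and syllable counts as below. The vowel-group syllable counter is a rough heuristic used for illustration only, not the authors' actual pipeline.

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count contiguous vowel groups; not a true syllabifier.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text: str) -> float:
    # Simple sentence and word tokenization via regular expressions.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid Grade Level formula.
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

if __name__ == "__main__":
    # Hypothetical model answer used only to demonstrate the calculation.
    sample = ("Statins lower LDL cholesterol and reduce the risk of heart attack "
              "and stroke in most adults with elevated cardiovascular risk.")
    print(round(flesch_kincaid_grade(sample), 1))
```

Applied to a full ChatGPT answer, a score near 13 would correspond to the 13th-grade reading level reported in the Results.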
Results
ChatGPT provided adequate responses to 88.4% (23/26) of questions, with mean ± SE scores of 3.71 ± 0.20 for correctness, 4.06 ± 0.14 for conciseness, 4.06 ± 0.13 for comprehensiveness, and 4.40 ± 0.10 for comprehensibility. The highest-scoring topic was lifestyle modification (84.4%), followed by cholesterol management (81.2%) and women’s cardiovascular health (77.8%). Among inadequate responses, key limitations included overstating the risks of low LDL cholesterol and exaggerating the benefits of estrogen replacement therapy (ERT) for postmenopausal CVD risk reduction. ChatGPT also provided unsupported recommendations regarding dietary supplements for CVD prevention. Readability analysis revealed responses at a 13th-grade level, exceeding the recommended 6th-grade level for patient education.
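The per-dimension quality scores are summarized as mean ± standard error of the mean; a minimal sketch of that aggregation is shown below, using hypothetical 1-5 Likert ratings rather than the study's data.

```python
from statistics import mean, stdev
from math import sqrt

def mean_and_se(scores: list[float]) -> tuple[float, float]:
    # Standard error of the mean: sample standard deviation divided by sqrt(n).
    m = mean(scores)
    se = stdev(scores) / sqrt(len(scores))
    return m, se

# Hypothetical correctness ratings (1-5 Likert), one value per question
# (e.g. already averaged across the three reviewers); not the study's data.
correctness = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4]
m, se = mean_and_se(correctness)
print(f"correctness: {m:.2f} ± {se:.2f}")
```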
Conclusions
ChatGPT’s responses were generally suitable for topics such as heart-healthy diets, exercise, weight management, the epidemiology and clinical presentation of CVD in women, postmenopausal CVD risk, cholesterol-lowering therapy, statin-associated side effects, and Lp(a) risk stratification. However, inaccuracies persisted in responses on dietary supplementation, ERT, and very low LDL levels. Enhancements in AI training are needed to improve accuracy in these areas. Additionally, the high reading grade level limits accessibility for the general public, underscoring the need for optimization to ensure clear and reliable patient education.