Joseph E. Nassar, Michael J. Farias, Lama A. Ammar, Rhea Rasquinha, Andrew Y. Xu, Manjot Singh, Daniel Alsoof, Bassel G. Diebo, Alan H. Daniels
The Journal of Bone & Joint Surgery. Published 2025-06-19. DOI: 10.2106/jbjs.24.01484
Bridging Health Literacy Gaps in Spine Care: Using ChatGPT-4o to Improve Patient-Education Materials.
BACKGROUND
Patient-education materials (PEMs) are essential to improve health literacy, engagement, and treatment adherence, yet many exceed the recommended readability levels. Therefore, individuals with limited health literacy are at a disadvantage. This study evaluated the readability of spine-related PEMs from the American Academy of Orthopaedic Surgeons (AAOS), the North American Spine Society (NASS), and the American Association of Neurological Surgeons (AANS), and examined the potential of artificial intelligence (AI) in optimizing PEMs for improved patient comprehension.
METHODS
A total of 146 spine-related PEMs from the AAOS, NASS, and AANS websites were analyzed. Readability was assessed using the Flesch-Kincaid Grade Level (FKGL) and Simple Measure of Gobbledygook (SMOG) Index scores, as well as other metrics, including language complexity and use of the passive voice. ChatGPT-4o was used to revise the PEMs to a sixth-grade reading level, and post-revision readability was assessed. Test-retest reliability was evaluated, and paired t tests were used to compare the readability scores of the original and AI-modified PEMs.
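The abstract does not include the scoring implementation, but both headline metrics are defined by published formulas. As a rough illustration only (not the authors' code), the minimal Python sketch below computes FKGL and the SMOG Index from raw text, using a simple vowel-group heuristic to approximate syllable counts; production readability tools use dictionaries and more careful tokenization.

```python
import re

def count_syllables(word: str) -> int:
    # Rough vowel-group heuristic (an approximation for illustration;
    # dictionary-based counters are more accurate).
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1  # drop a typical silent final "e"
    return max(n, 1)

def readability_scores(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    polysyllables = sum(1 for s in syllables if s >= 3)

    # Flesch-Kincaid Grade Level:
    # 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    fkgl = (0.39 * len(words) / len(sentences)
            + 11.8 * sum(syllables) / len(words) - 15.59)

    # SMOG Index (formally intended for samples of >= 30 sentences):
    # 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291
    smog = 1.0430 * (polysyllables * 30 / len(sentences)) ** 0.5 + 3.1291

    return {"fkgl": round(fkgl, 1), "smog": round(smog, 1)}

print(readability_scores(
    "Spinal stenosis is a narrowing of the spinal canal. "
    "It can cause pain, numbness, or weakness in the legs."
))
```

A score of 6.x on either scale corresponds roughly to a U.S. sixth-grade reading level, which is the target the study asked ChatGPT-4o to meet.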
RESULTS
The original PEMs had a mean FKGL of 10.2 ± 2.6, which significantly exceeded both the recommended sixth-grade reading level and the average U.S. eighth-grade reading level (p < 0.05). The ChatGPT-4o revisions had a significantly lower mean FKGL of 6.6 ± 1.3 (p < 0.05). ChatGPT-4o also improved other readability metrics, including the SMOG Index score, language complexity, and use of the passive voice, while maintaining accuracy and adequate detail. Excellent test-retest reliability was observed across all of the metrics (intraclass correlation coefficient [ICC] range, 0.91 to 0.98).
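Because each revised PEM is matched to its own original, the paired t test described in the METHODS is the appropriate comparison. A hedged SciPy sketch follows; the arrays are placeholder values for illustration, not the study's data.

```python
from scipy import stats

# Placeholder FKGL scores (NOT the study's data): one pair per PEM,
# original document vs. its ChatGPT-4o revision.
original = [11.2, 9.8, 10.5, 12.0, 8.9]
revised  = [6.9, 6.1, 6.5, 7.2, 5.8]

# Paired t test: tests whether the mean within-pair difference is zero.
t_stat, p_value = stats.ttest_rel(original, revised)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Test-retest reliability (the reported ICC of 0.91 to 0.98) would instead be computed across repeated ChatGPT-4o revisions of the same PEM; an ICC near 1 indicates that repeated revisions yield nearly identical readability scores.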
CONCLUSIONS
Spine-related PEMs from the AAOS, the NASS, and the AANS remain excessively complex, despite minor improvements to readability over the years. ChatGPT-4o demonstrated the potential to enhance PEM readability while maintaining content quality. Future efforts should integrate AI tools with visual aids and user-friendly platforms to create inclusive and comprehensible PEMs to address diverse patient needs and improve health-care delivery.