Comparing the effectiveness of generative AI technology in commonly asked scoliosis questions
Adarsh Suresh, Jacob Siahaan, Rex Aw Marco, Eric Klineberg, Timothy Borden, Rohini Vanodia, Lindsay Crawford, Shah-Nawaz Dodwad, Shiraz Younas, Surya Mundluru
Journal of Children’s Orthopaedics, pp. 416-421. Journal Article. Published: 2025-07-26; EPub date: 2025-10-01 (eCollection).
DOI: https://doi.org/10.1177/18632521251359098
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12301223/pdf/
Impact factor: 1.6; JCR: Q3 (Orthopedics)
Citations: 0
Abstract
Purpose: In recent years, generative artificial intelligence (AI) systems have transformed patients' access to medical information and education. As growing shortages of general and subspecialty physicians lead to longer wait times for patients to see a physician, we aim to understand how effectively different AI platforms can respond to questions asked by parents about both operative and nonoperative scoliosis.
Methods: A survey of 31 of the most commonly asked questions about scoliosis, with responses from ChatGPT, Google Gemini, and Microsoft Copilot, was administered to four reviewers, all board-certified orthopedic surgeons with fellowship training in either pediatric or spine surgery. They rated each output on a Likert scale of 1 to 5, with 5 indicating an excellent response and 1 a poor response. Pairwise comparisons were used for analysis.
Results: All three generative AI technologies performed well, with an overall mean rating of 3.4, which falls between good and very good on the Likert scale provided. ChatGPT performed best of the three, with a mean rating of 4.0; Google Gemini ranked second and Microsoft Copilot third, each with a reported mean rating of 3.1. Comparisons of ChatGPT with Gemini and with Copilot showed statistically significant differences (p < 0.001), whereas Gemini and Copilot did not differ significantly.
Conclusion: In response to common scoliosis questions asked by parents, ChatGPT, Microsoft Copilot, and Google Gemini were rated highly by our spine team, which has important implications for their future use.
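The figures reported in the Methods and Results can be checked for internal consistency with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis: it assumes that every reviewer rated every response (so each platform received the same number of ratings), and the variable names are illustrative. The abstract does not say which statistical test was used for the pairwise comparisons, so the sketch only enumerates the pairs and the differences in reported means.

```python
from itertools import combinations

# Figures taken from the abstract.
N_QUESTIONS = 31
N_REVIEWERS = 4
PLATFORM_MEANS = {
    "ChatGPT": 4.0,
    "Google Gemini": 3.1,
    "Microsoft Copilot": 3.1,
}

# Assumption (not stated outright in the abstract): every reviewer rated
# every response, so each platform received the same number of ratings.
ratings_per_platform = N_QUESTIONS * N_REVIEWERS
total_ratings = ratings_per_platform * len(PLATFORM_MEANS)

# With equal rating counts per platform, the overall mean is simply the
# unweighted average of the three platform means.
overall_mean = sum(PLATFORM_MEANS.values()) / len(PLATFORM_MEANS)

# One pairwise comparison per unordered pair of platforms.
pairs = list(combinations(PLATFORM_MEANS, 2))

print(f"ratings per platform: {ratings_per_platform}")
print(f"total ratings: {total_ratings}")
print(f"overall mean: {overall_mean:.1f}")
for a, b in pairs:
    diff = PLATFORM_MEANS[a] - PLATFORM_MEANS[b]
    print(f"{a} vs {b}: difference in reported means = {diff:+.1f}")
```

Under that assumption, each platform receives 124 ratings (372 in total), the average of the three reported means is 3.4, matching the overall mean in the Results, and three platforms yield three pairwise comparisons.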
About the journal:
Aims & Scope
The Journal of Children’s Orthopaedics is the official journal of the European Paediatric Orthopaedic Society (EPOS) and is published by The British Editorial Society of Bone & Joint Surgery.
It provides a forum for the advancement of knowledge and education in paediatric orthopaedics and traumatology across geographical borders. It advocates increased worldwide involvement in preventing and treating musculoskeletal diseases in children and adolescents.
The journal publishes high quality, peer-reviewed articles that focus on clinical practice, diagnosis and treatment of disorders unique to paediatric orthopaedics, as well as on basic and applied research. It aims to help physicians stay abreast of the latest and ever-changing developments in the field of paediatric orthopaedics and traumatology.
The journal welcomes original contributions submitted exclusively for review to the journal. This continuously published online journal is fully open access and will publish one print issue each year to coincide with the EPOS Annual Congress, featuring the meeting’s abstracts.