Is ChatGPT a more academic source than Google searches for patient questions about hip arthroscopy? An analysis of the most frequently asked questions
Necati Bahadir Eravsar MD, Mahmud Aydin MD, Atahan Eryilmaz MD, Cihangir Turemis MD, Serkan Surucu MD, Andrew E. Jimenez MD
Journal of ISAKOS Joint Disorders & Orthopaedic Sports Medicine, Volume 12, Article 100892 (2025). DOI: 10.1016/j.jisako.2025.100892
Abstract
Objectives
The purpose of this study was to compare the reliability and accuracy of responses to patient questions about hip arthroscopy (HA) provided by Chat Generative Pre-Trained Transformer (ChatGPT), an online artificial intelligence (AI) chatbot built on a large language model (LLM), with those obtained through a contemporary Google search of frequently asked questions (FAQs) regarding HA.
Methods
“Hip arthroscopy” was entered into Google Search and ChatGPT, and the 15 most common FAQs and their answers were determined. In Google Search, the FAQs were obtained from the “People also ask” section; ChatGPT was prompted to provide the 15 most common FAQs and their answers. Questions were grouped under 10 subheadings using the Rothwell classification system, and the responses from ChatGPT and Google Search were compared.
Results
Timeline of recovery (23.3%) and technical details (20%) were the most common question categories. ChatGPT generated significantly more questions in the technical details category than Google Search (33.3% vs. 6.6%; p-value = 0.0455). For both Google web search (46.6%) and ChatGPT (93.3%), most FAQs were answered with academic sources, and ChatGPT cited academic references significantly more often than Google web search (93.3% vs. 46.6%). Conversely, Google web search cited medical practice references (20% vs. 0%), single-surgeon websites (26% vs. 0%), and government websites (6% vs. 0%) more frequently than ChatGPT.
Conclusion
ChatGPT performed similarly to Google Search in answering questions about HA. Compared with Google, ChatGPT provided significantly more academic sources in its answers to patient questions.
Level of evidence
Level IV.