Using artificial intelligence to semi-automate trustworthiness assessment of randomized controlled trials: a case study

Impact Factor 7.3 · JCR Q1, Health Care Sciences & Services · CAS Tier 2 (Medicine)
Ling Shan Au , Lizhen Qu , Jeremy Nielsen , Zongyuan Ge , Lyle C. Gurrin , Ben W. Mol , Rui Wang
Journal of Clinical Epidemiology, Volume 180, Article 111672. DOI: 10.1016/j.jclinepi.2025.111672. Published online 2025-01-17.
Citations: 0

Using artificial intelligence to semi-automate trustworthiness assessment of randomized controlled trials: a case study

Background and Objective

Randomized controlled trials (RCTs) are the cornerstone of evidence-based medicine. Unfortunately, not all RCTs are based on real data. This serious breach of research integrity compromises the reliability of systematic reviews and meta-analyses, leading to misinformed clinical guidelines and posing a risk to both individual and public health. While methods to detect problematic RCTs have been proposed, they are time-consuming and labor-intensive. The use of artificial intelligence large language models (LLMs) has the potential to accelerate the data collection needed to assess the trustworthiness of published RCTs.

Methods

We present a case study using ChatGPT powered by OpenAI's GPT-4o to assess an RCT paper. The case study focuses on applying the trustworthiness in randomised controlled trials (TRACT) checklist and automating data table extraction to accelerate statistical analysis targeting the trustworthiness of the data. We provide a detailed step-by-step outline of the process, along with considerations for potential improvements.
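The abstract does not reproduce the study's prompts. The sketch below illustrates one plausible way to pose TRACT-style yes/no items to a chat model; the item wording, function names, and prompt text are illustrative assumptions, not the official TRACT checklist or the authors' actual prompts.

```python
# Hypothetical sketch: posing TRACT-style yes/no items to a chat model.
# The checklist wording here is illustrative, not the official TRACT text.

TRACT_ITEMS = [
    "Is the trial prospectively registered?",
    "Are baseline characteristics plausibly balanced between arms?",
    "Is the recruitment rate feasible for the stated setting and period?",
]

def build_prompt(paper_text: str, item: str) -> str:
    """Combine the paper text with one checklist item, requesting a strict yes/no."""
    return (
        "You are assessing the trustworthiness of a randomized controlled trial.\n"
        f"Paper text:\n{paper_text}\n\n"
        f"Question: {item}\n"
        "Answer with exactly 'yes' or 'no', then one sentence of justification."
    )

def parse_answer(response_text: str) -> str:
    """Extract the leading yes/no verdict from a model response."""
    first_word = response_text.strip().split()[0].lower().rstrip(".,:;")
    return first_word if first_word in ("yes", "no") else "unclear"

if __name__ == "__main__":
    # With the openai package installed, each prompt could be sent like:
    #   client = openai.OpenAI()
    #   r = client.chat.completions.create(
    #       model="gpt-4o",
    #       messages=[{"role": "user", "content": build_prompt(text, item)}])
    #   verdict = parse_answer(r.choices[0].message.content)
    print(parse_answer("Yes, the trial is registered at ClinicalTrials.gov."))
```

Forcing a leading yes/no makes the responses machine-checkable, which is what enables the item-by-item agreement comparison reported in the Results.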

Results

ChatGPT completed all tasks by processing the PDF of the selected publication and responding to specific prompts. It addressed items in the TRACT checklist effectively, providing precise “yes” or “no” answers while quickly synthesizing information from both the paper and relevant online resources. A comparison of the results generated by ChatGPT and a human assessor showed 84% agreement (16/19 TRACT items), substantially accelerating the qualitative assessment process. Additionally, ChatGPT efficiently extracted the data tables as Microsoft Excel worksheets and reorganized the data, with three of the four extracted tables achieving an accuracy score of 100%, facilitating subsequent analysis and data verification.
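The 84% figure is simple percent agreement over the 19 checklist items. A minimal sketch of that arithmetic (the per-item verdicts below are made up, since the abstract reports only the totals):

```python
# Percent agreement between the model's and the human assessor's
# yes/no verdicts on checklist items. The verdict lists are illustrative.

def percent_agreement(a: list[str], b: list[str]) -> float:
    """Fraction of items on which two assessors give the same verdict."""
    if len(a) != len(b):
        raise ValueError("assessors must rate the same number of items")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)

# 19 items with 16 matches reproduces the reported 16/19 ≈ 84% agreement.
model = ["yes"] * 16 + ["yes", "no", "yes"]
human = ["yes"] * 16 + ["no", "yes", "no"]
print(f"{percent_agreement(model, human):.0%}")  # → 84%
```

Percent agreement makes no chance correction; a statistic such as Cohen's kappa would be a natural refinement when this comparison is scaled to collections of papers.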

Conclusion

ChatGPT demonstrates potential for semiautomating the trustworthiness assessment of RCTs, though in our experience this required repeated prompting from the user. Further testing and refinement will involve applying ChatGPT to collections of RCT papers to improve the accuracy of data capture and reduce the user's role. The ultimate aim, which seems plausible given our initial experience, is a fully automated process for large volumes of papers.
Source journal: Journal of Clinical Epidemiology (Medicine – Public, Environmental & Occupational Health)
CiteScore: 12.00
Self-citation rate: 6.90%
Annual articles: 320
Review time: 44 days
Journal description: The Journal of Clinical Epidemiology strives to enhance the quality of clinical and patient-oriented healthcare research by advancing and applying innovative methods in conducting, presenting, synthesizing, disseminating, and translating research results into optimal clinical practice. Special emphasis is placed on training new generations of scientists and clinical practice leaders.