Human versus artificial intelligence-generated arthroplasty literature: A single-blinded analysis of perceived communication, quality, and authorship source

IF 2.3 · CAS Tier 3 (Medicine) · JCR Q2 (Surgery)
Kyle W. Lawrence, Akram A. Habibi, Spencer A. Ward, Claudette M. Lajam, Ran Schwarzkopf, Joshua C. Rozell
{"title":"人工与人工智能生成的关节成形术文献:对感知交流、质量和作者来源的单盲分析。","authors":"Kyle W. Lawrence,&nbsp;Akram A. Habibi,&nbsp;Spencer A. Ward,&nbsp;Claudette M. Lajam,&nbsp;Ran Schwarzkopf,&nbsp;Joshua C. Rozell","doi":"10.1002/rcs.2621","DOIUrl":null,"url":null,"abstract":"<div>\n \n \n <section>\n \n <h3> Background</h3>\n \n <p>Large language models (LLM) have unknown implications for medical research. This study assessed whether LLM-generated abstracts are distinguishable from human-written abstracts and to compare their perceived quality.</p>\n </section>\n \n <section>\n \n <h3> Methods</h3>\n \n <p>The LLM ChatGPT was used to generate 20 arthroplasty abstracts (AI-generated) based on full-text manuscripts, which were compared to originally published abstracts (human-written). Six blinded orthopaedic surgeons rated abstracts on overall quality, communication, and confidence in the authorship source. Authorship-confidence scores were compared to a test value representing complete inability to discern authorship.</p>\n </section>\n \n <section>\n \n <h3> Results</h3>\n \n <p>Modestly increased confidence in human authorship was observed for human-written abstracts compared with AI-generated abstracts (<i>p</i> = 0.028), though AI-generated abstract authorship-confidence scores were statistically consistent with inability to discern authorship (<i>p</i> = 0.999). Overall abstract quality was higher for human-written abstracts (<i>p</i> = 0.019).</p>\n </section>\n \n <section>\n \n <h3> Conclusions</h3>\n \n <p>AI-generated abstracts' absolute authorship-confidence ratings demonstrated difficulty in discerning authorship but did not achieve the perceived quality of human-written abstracts. Caution is warranted in implementing LLMs into scientific writing.</p>\n </section>\n </div>","PeriodicalId":50311,"journal":{"name":"International Journal of Medical Robotics and Computer Assisted Surgery","volume":"20 1","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2024-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Human versus artificial intelligence-generated arthroplasty literature: A single-blinded analysis of perceived communication, quality, and authorship source\",\"authors\":\"Kyle W. Lawrence,&nbsp;Akram A. Habibi,&nbsp;Spencer A. Ward,&nbsp;Claudette M. Lajam,&nbsp;Ran Schwarzkopf,&nbsp;Joshua C. Rozell\",\"doi\":\"10.1002/rcs.2621\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>\\n \\n \\n <section>\\n \\n <h3> Background</h3>\\n \\n <p>Large language models (LLM) have unknown implications for medical research. This study assessed whether LLM-generated abstracts are distinguishable from human-written abstracts and to compare their perceived quality.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Methods</h3>\\n \\n <p>The LLM ChatGPT was used to generate 20 arthroplasty abstracts (AI-generated) based on full-text manuscripts, which were compared to originally published abstracts (human-written). Six blinded orthopaedic surgeons rated abstracts on overall quality, communication, and confidence in the authorship source. 
Authorship-confidence scores were compared to a test value representing complete inability to discern authorship.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Results</h3>\\n \\n <p>Modestly increased confidence in human authorship was observed for human-written abstracts compared with AI-generated abstracts (<i>p</i> = 0.028), though AI-generated abstract authorship-confidence scores were statistically consistent with inability to discern authorship (<i>p</i> = 0.999). Overall abstract quality was higher for human-written abstracts (<i>p</i> = 0.019).</p>\\n </section>\\n \\n <section>\\n \\n <h3> Conclusions</h3>\\n \\n <p>AI-generated abstracts' absolute authorship-confidence ratings demonstrated difficulty in discerning authorship but did not achieve the perceived quality of human-written abstracts. Caution is warranted in implementing LLMs into scientific writing.</p>\\n </section>\\n </div>\",\"PeriodicalId\":50311,\"journal\":{\"name\":\"International Journal of Medical Robotics and Computer Assisted Surgery\",\"volume\":\"20 1\",\"pages\":\"\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2024-02-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Medical Robotics and Computer Assisted Surgery\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/rcs.2621\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"SURGERY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Medical Robotics and Computer Assisted Surgery","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/rcs.2621","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"SURGERY","Score":null,"Total":0}
Citations: 0

Abstract


Background

Large language models (LLM) have unknown implications for medical research. This study assessed whether LLM-generated abstracts are distinguishable from human-written abstracts and to compare their perceived quality.

Methods

The LLM ChatGPT was used to generate 20 arthroplasty abstracts (AI-generated) based on full-text manuscripts, which were compared to originally published abstracts (human-written). Six blinded orthopaedic surgeons rated abstracts on overall quality, communication, and confidence in the authorship source. Authorship-confidence scores were compared to a test value representing complete inability to discern authorship.
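The abstract does not specify which statistical test was used for this comparison; the following is a minimal sketch of one way the described analysis could be carried out, assuming a one-sample t-test against a neutral midpoint on a 1-5 confidence scale. The neutral value and all ratings below are invented for illustration and are not data from the study.

```python
# Minimal sketch (not the authors' code): compare authorship-confidence ratings
# for AI-generated abstracts against a fixed test value representing complete
# inability to discern authorship. The 1-5 scale, the neutral value of 3.0, and
# the scores themselves are assumptions for illustration only.
from scipy import stats

# Hypothetical per-abstract mean ratings from 6 blinded reviewers across
# 20 AI-generated abstracts (values invented for illustration).
ai_confidence_scores = [2.8, 3.1, 3.0, 2.9, 3.2, 3.0, 2.7, 3.1, 3.0, 2.9,
                        3.0, 3.2, 2.8, 3.0, 3.1, 2.9, 3.0, 2.8, 3.1, 3.0]

neutral_value = 3.0  # assumed midpoint: reviewer cannot discern authorship

# One-sample t-test against the neutral value; a non-significant result is
# consistent with reviewers being unable to discern the authorship source.
t_stat, p_value = stats.ttest_1samp(ai_confidence_scores, popmean=neutral_value)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```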

Results

Modestly increased confidence in human authorship was observed for human-written abstracts compared with AI-generated abstracts (p = 0.028), though AI-generated abstract authorship-confidence scores were statistically consistent with inability to discern authorship (p = 0.999). Overall abstract quality was higher for human-written abstracts (p = 0.019).

Conclusions

AI-generated abstracts' absolute authorship-confidence ratings demonstrated difficulty in discerning authorship but did not achieve the perceived quality of human-written abstracts. Caution is warranted in implementing LLMs into scientific writing.

Source journal: International Journal of Medical Robotics and Computer Assisted Surgery
CiteScore: 4.50
Self-citation rate: 12.00%
Articles per year: 131
Review time: 6-12 weeks
About the journal

The International Journal of Medical Robotics and Computer Assisted Surgery provides a cross-disciplinary platform for presenting the latest developments in robotics and computer assisted technologies for medical applications. The journal publishes cutting-edge papers and expert reviews, complemented by commentaries, correspondence and conference highlights that stimulate discussion and exchange of ideas. Areas of interest include robotic surgery aids and systems, operative planning tools, medical imaging and visualisation, simulation and navigation, virtual reality, intuitive command and control systems, haptics and sensor technologies. In addition to research and surgical planning studies, the journal welcomes papers detailing clinical trials and applications of computer-assisted workflows and robotic systems in neurosurgery, urology, paediatric, orthopaedic, craniofacial, cardiovascular, thoraco-abdominal, musculoskeletal and visceral surgery. Articles providing critical analysis of clinical trials, assessment of the benefits and risks of the application of these technologies, commenting on ease of use, or addressing surgical education and training issues are also encouraged. The journal aims to foster a community that encompasses medical practitioners, researchers, and engineers and computer scientists developing robotic systems and computational tools in academic and commercial environments, with the intention of promoting and developing these exciting areas of medical technology.