Can Artificial Intelligence Deceive Residency Committees? A Randomized Multicenter Analysis of Letters of Recommendation.

IF 2.6 2区 医学 Q1 ORTHOPEDICS
Samuel K Simister, Eric G Huish, Eugene Y Tsai, Hai V Le, Andrea Halim, Dominick Tuason, John P Meehan, Holly B Leshikar, Augustine M Saiz, Zachary C Lum
{"title":"Can Artificial Intelligence Deceive Residency Committees? A Randomized Multicenter Analysis of Letters of Recommendation.","authors":"Samuel K Simister, Eric G Huish, Eugene Y Tsai, Hai V Le, Andrea Halim, Dominick Tuason, John P Meehan, Holly B Leshikar, Augustine M Saiz, Zachary C Lum","doi":"10.5435/JAAOS-D-24-00438","DOIUrl":null,"url":null,"abstract":"<p><strong>Introduction: </strong>The introduction of generative artificial intelligence (AI) may have a profound effect on residency applications. In this study, we explore the abilities of AI-generated letters of recommendation (LORs) by evaluating the accuracy of orthopaedic surgery residency selection committee members to identify LORs written by human or AI authors.</p><p><strong>Methods: </strong>In a multicenter, single-blind trial, a total of 45 LORs (15 human, 15 ChatGPT, and 15 Google BARD) were curated. In a random fashion, seven faculty reviewers from four residency programs were asked to grade each of the 45 LORs based on the 11 characteristics outlined in the American Orthopaedic Associations standardized LOR, as well as a 1 to 10 scale on how they would rank the applicant, their desire of having the applicant in the program, and if they thought the letter was generated by a human or AI author. Analysis included descriptives, ordinal regression, and a receiver operator characteristic curve to compare accuracy based on the number of letters reviewed.</p><p><strong>Results: </strong>Faculty reviewers correctly identified 40% (42/105) of human-generated and 63% (132/210) of AI-generated letters (P < 0.001), which did not increase over time (AUC 0.451, P = 0.102). When analyzed by perceived author, letters marked as human generated had significantly higher means for all variables (P = 0.01). BARD did markedly better than human authors in accuracy (3.25 [1.79 to 5.92], P < 0.001), adaptability (1.29 [1.02 to 1.65], P = 0.034), and perceived commitment (1.56 [0.99 to 2.47], P < 0.055). Additional analysis controlling for reviewer background showed no differences in outcomes based on experience or familiarity with the AI programs.</p><p><strong>Conclusion: </strong>Faculty members were unsuccessful in determining the difference between human-generated and AI-generated LORs 50% of the time, which suggests that AI can generate LORs similarly to human authors. This highlights the importance for selection committees to reconsider the role and influence of LORs on residency applications.</p>","PeriodicalId":51098,"journal":{"name":"Journal of the American Academy of Orthopaedic Surgeons","volume":" ","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American Academy of Orthopaedic Surgeons","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.5435/JAAOS-D-24-00438","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ORTHOPEDICS","Score":null,"Total":0}
引用次数: 0

Abstract

Introduction: The introduction of generative artificial intelligence (AI) may have a profound effect on residency applications. In this study, we explore the abilities of AI-generated letters of recommendation (LORs) by evaluating the accuracy of orthopaedic surgery residency selection committee members to identify LORs written by human or AI authors.

Methods: In a multicenter, single-blind trial, a total of 45 LORs (15 human, 15 ChatGPT, and 15 Google BARD) were curated. In a random fashion, seven faculty reviewers from four residency programs were asked to grade each of the 45 LORs based on the 11 characteristics outlined in the American Orthopaedic Associations standardized LOR, as well as a 1 to 10 scale on how they would rank the applicant, their desire of having the applicant in the program, and if they thought the letter was generated by a human or AI author. Analysis included descriptives, ordinal regression, and a receiver operator characteristic curve to compare accuracy based on the number of letters reviewed.

Results: Faculty reviewers correctly identified 40% (42/105) of human-generated and 63% (132/210) of AI-generated letters (P < 0.001), which did not increase over time (AUC 0.451, P = 0.102). When analyzed by perceived author, letters marked as human generated had significantly higher means for all variables (P = 0.01). BARD did markedly better than human authors in accuracy (3.25 [1.79 to 5.92], P < 0.001), adaptability (1.29 [1.02 to 1.65], P = 0.034), and perceived commitment (1.56 [0.99 to 2.47], P < 0.055). Additional analysis controlling for reviewer background showed no differences in outcomes based on experience or familiarity with the AI programs.

Conclusion: Faculty members were unsuccessful in determining the difference between human-generated and AI-generated LORs 50% of the time, which suggests that AI can generate LORs similarly to human authors. This highlights the importance for selection committees to reconsider the role and influence of LORs on residency applications.

求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
6.10
自引率
6.20%
发文量
529
审稿时长
4-8 weeks
期刊介绍: The Journal of the American Academy of Orthopaedic Surgeons was established in the fall of 1993 by the Academy in response to its membership’s demand for a clinical review journal. Two issues were published the first year, followed by six issues yearly from 1994 through 2004. In September 2005, JAAOS began publishing monthly issues. Each issue includes richly illustrated peer-reviewed articles focused on clinical diagnosis and management. Special features in each issue provide commentary on developments in pharmacotherapeutics, materials and techniques, and computer applications.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信