Emile B. Gordon MD, Charles M. Maxfield MD, Robert French MD, MBA, Laura J. Fish MD, Jacob Romm MD, Emily Barre MPH, BSE, Erica Kinne MD, Ryan Peterson MD, Lars J. Grimm MD, MHS

Journal of the American College of Radiology, Volume 22, Issue 1, Pages 33-40 (January 1, 2025). DOI: 10.1016/j.jacr.2024.08.027
Large Language Model Use in Radiology Residency Applications: Unwelcomed but Inevitable
Objective
This study explores radiology program directors’ perspectives on the impact of large language model (LLM) use among residency applicants to craft personal statements.
Methods
Eight program directors from the Radiology Residency Education Research Alliance participated in a mixed-methods study, which included a survey of impressions of artificial intelligence (AI)-generated personal statements and focus group discussions (July 2023). Each director reviewed four personal statement variations for five applicants, anonymized as to author type: the original and three Chat Generative Pre-trained Transformer-4.0 (GPT) versions generated with varying prompts, aggregated for analysis. Reviewers rated writing quality (voice, clarity, engagement, and organization) and the perceived origin of each statement on a 5-point Likert scale. An experienced qualitative researcher facilitated the focus group discussions. Data analysis was performed using a rapid analytic approach with a coding template capturing key areas related to residency applications.
Results
GPT-generated statements were rated average or worse in quality more often (56%, 268 of 475 ratings) than human-authored statements (29%, 45 of 160 ratings). Although reviewers were not confident in their ability to distinguish the origin of personal statements, they did so reliably and consistently, identifying 95% (38 of 40) of human-authored personal statements as probably or definitely original. Focus group discussions highlighted the inevitable use of AI in crafting personal statements and concerns about its impact on the authenticity and value of the personal statement in residency selection. Program directors were divided on the appropriate use and regulation of AI.
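The headline percentages above follow directly from the reported counts. As a minimal sketch (using only the counts stated in the abstract, not raw study data), the two key proportions can be recomputed like this:

```python
# Recompute reported proportions from the counts given in the abstract.
# These counts come from the abstract text; this is an illustrative
# arithmetic check, not an analysis of the underlying study data.

gpt_avg_or_worse, gpt_total = 268, 475        # GPT statement ratings average or worse
human_correctly_identified, human_total = 38, 40  # human-authored statements flagged as original

gpt_pct = round(100 * gpt_avg_or_worse / gpt_total)
identified_pct = round(100 * human_correctly_identified / human_total)

print(gpt_pct)         # 56
print(identified_pct)  # 95
```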
Discussion
Radiology residency program directors rated LLM-generated personal statements as lower in quality and expressed concern about the loss of the applicant’s voice but acknowledged the inevitability of increased AI use in the generation of application statements.
Journal Introduction:
The official journal of the American College of Radiology, JACR informs its readers of timely, pertinent, and important topics affecting the practice of diagnostic radiologists, interventional radiologists, medical physicists, and radiation oncologists. In so doing, JACR improves their practices and helps optimize their role in the health care system. By providing a forum for informative, well-written articles on health policy, clinical practice, practice management, data science, and education, JACR engages readers in a dialogue that ultimately benefits patient care.