Matthew X Luo, Adam Lyle, Phillip Bennett, Daniel Albertson, Deepika Sirohi, Benjamin L Maughan, Valarie McMurtry, Jonathon Mahlow
{"title":"人工智能聊天机器人与病理学教师和住院医师:来自泌尿生殖系统治疗计划会议的真实世界临床问题。","authors":"Matthew X Luo, Adam Lyle, Phillip Bennett, Daniel Albertson, Deepika Sirohi, Benjamin L Maughan, Valarie McMurtry, Jonathon Mahlow","doi":"10.1093/ajcp/aqae078","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>Artificial intelligence (AI)-based chatbots have demonstrated accuracy in a variety of fields, including medicine, but research has yet to substantiate their accuracy and clinical relevance. We evaluated an AI chatbot's answers to questions posed during a treatment planning conference.</p><p><strong>Methods: </strong>Pathology residents, pathology faculty, and an AI chatbot (OpenAI ChatGPT [January 30, 2023, release]) answered a questionnaire curated from a genitourinary subspecialty treatment planning conference. Results were evaluated by 2 blinded adjudicators: a clinician expert and a pathology expert. Scores were based on accuracy and clinical relevance.</p><p><strong>Results: </strong>Overall, faculty scored highest (4.75), followed by the AI chatbot (4.10), research-prepared residents (3.50), and unprepared residents (2.87). The AI chatbot scored statistically significantly better than unprepared residents (P = .03) but not statistically significantly different from research-prepared residents (P = .33) or faculty (P = .30). Residents did not statistically significantly improve after research (P = .39), and faculty performed statistically significantly better than both resident categories (unprepared, P < .01; research prepared, P = .01).</p><p><strong>Conclusions: </strong>The AI chatbot gave answers to medical questions that were comparable in accuracy and clinical relevance to pathology faculty, suggesting promise for further development. Serious concerns remain, however, that without the ability to provide support with references, AI will face legitimate scrutiny as to how it can be integrated into medical decision-making.</p>","PeriodicalId":7506,"journal":{"name":"American journal of clinical pathology","volume":null,"pages":null},"PeriodicalIF":2.3000,"publicationDate":"2024-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Artificial intelligence chatbot vs pathology faculty and residents: Real-world clinical questions from a genitourinary treatment planning conference.\",\"authors\":\"Matthew X Luo, Adam Lyle, Phillip Bennett, Daniel Albertson, Deepika Sirohi, Benjamin L Maughan, Valarie McMurtry, Jonathon Mahlow\",\"doi\":\"10.1093/ajcp/aqae078\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objectives: </strong>Artificial intelligence (AI)-based chatbots have demonstrated accuracy in a variety of fields, including medicine, but research has yet to substantiate their accuracy and clinical relevance. We evaluated an AI chatbot's answers to questions posed during a treatment planning conference.</p><p><strong>Methods: </strong>Pathology residents, pathology faculty, and an AI chatbot (OpenAI ChatGPT [January 30, 2023, release]) answered a questionnaire curated from a genitourinary subspecialty treatment planning conference. Results were evaluated by 2 blinded adjudicators: a clinician expert and a pathology expert. Scores were based on accuracy and clinical relevance.</p><p><strong>Results: </strong>Overall, faculty scored highest (4.75), followed by the AI chatbot (4.10), research-prepared residents (3.50), and unprepared residents (2.87). The AI chatbot scored statistically significantly better than unprepared residents (P = .03) but not statistically significantly different from research-prepared residents (P = .33) or faculty (P = .30). Residents did not statistically significantly improve after research (P = .39), and faculty performed statistically significantly better than both resident categories (unprepared, P < .01; research prepared, P = .01).</p><p><strong>Conclusions: </strong>The AI chatbot gave answers to medical questions that were comparable in accuracy and clinical relevance to pathology faculty, suggesting promise for further development. Serious concerns remain, however, that without the ability to provide support with references, AI will face legitimate scrutiny as to how it can be integrated into medical decision-making.</p>\",\"PeriodicalId\":7506,\"journal\":{\"name\":\"American journal of clinical pathology\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2024-06-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"American journal of clinical pathology\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1093/ajcp/aqae078\",\"RegionNum\":4,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"PATHOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"American journal of clinical pathology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1093/ajcp/aqae078","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PATHOLOGY","Score":null,"Total":0}
Artificial intelligence chatbot vs pathology faculty and residents: Real-world clinical questions from a genitourinary treatment planning conference.
Objectives: Artificial intelligence (AI)-based chatbots have demonstrated accuracy in a variety of fields, including medicine, but research has yet to substantiate their accuracy and clinical relevance. We evaluated an AI chatbot's answers to questions posed during a treatment planning conference.
Methods: Pathology residents, pathology faculty, and an AI chatbot (OpenAI ChatGPT [January 30, 2023, release]) answered a questionnaire curated from a genitourinary subspecialty treatment planning conference. Results were evaluated by 2 blinded adjudicators: a clinician expert and a pathology expert. Scores were based on accuracy and clinical relevance.
Results: Overall, faculty scored highest (4.75), followed by the AI chatbot (4.10), research-prepared residents (3.50), and unprepared residents (2.87). The AI chatbot scored statistically significantly better than unprepared residents (P = .03) but not statistically significantly different from research-prepared residents (P = .33) or faculty (P = .30). Residents did not statistically significantly improve after research (P = .39), and faculty performed statistically significantly better than both resident categories (unprepared, P < .01; research prepared, P = .01).
Conclusions: The AI chatbot gave answers to medical questions that were comparable in accuracy and clinical relevance to pathology faculty, suggesting promise for further development. Serious concerns remain, however, that without the ability to provide support with references, AI will face legitimate scrutiny as to how it can be integrated into medical decision-making.
期刊介绍:
The American Journal of Clinical Pathology (AJCP) is the official journal of the American Society for Clinical Pathology and the Academy of Clinical Laboratory Physicians and Scientists. It is a leading international journal for publication of articles concerning novel anatomic pathology and laboratory medicine observations on human disease. AJCP emphasizes articles that focus on the application of evolving technologies for the diagnosis and characterization of diseases and conditions, as well as those that have a direct link toward improving patient care.