Bing Chat for kidney stone management questions based on the AUA guidelines: a comparison of chatbot conversation style modes.

IF 2.8 · Zone 2 (Medicine) · JCR Q2 (Urology & Nephrology)
Daniel R Hanna, Michael L Creswell, Russell S Terry, Lucas B Vergamini, Mihaela Sardiu, Holly E Du, Amber K McMahon, Wilson R Molina, Bristol B Whiles
World Journal of Urology, volume 43, issue 1, p. 151. Journal Article. Published 2025-03-06.
DOI: https://doi.org/10.1007/s00345-025-05533-4
Citations: 0

Abstract

Bing chat for kidney stone management questions based on the AUA guidelines: a comparison of chatbot conversation style modes.

Purpose: Artificial intelligence (AI) technology will inevitably permeate healthcare. Bing Chat is an AI chatbot with different conversation styles. We evaluated the answers given in each of these response modes to questions about the management of nephrolithiasis.

Methods: A total of 20 questions were created based on the AUA Surgical Management of Stones guidelines. Bing Chat's responses were evaluated across Precise, Balanced, and Creative conversation style chat modes by three physicians using the Brief DISCERN tool. Consensus scoring was employed to assess appropriateness, guideline adherence, empathy, recommendation for physician consultation, and inability to answer the inquiry. Responses were also assessed for their directness and the presence of superfluous information. Chat modes were compared using descriptive statistics as well as ANOVA, Chi-Squared tests, and Fisher exact tests.
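
As an illustration of the kind of categorical comparison named above, the following is a minimal sketch of a Pearson chi-squared test on a 3x2 table (three chat modes by a yes/no rating such as guideline adherence). The function name `chi_squared_3x2` and the counts are hypothetical placeholders, not the study's data or the authors' code. The sketch relies on the fact that a 3x2 table has 2 degrees of freedom, for which the chi-squared tail probability is exactly exp(-x/2), so only the standard library is needed.

```python
import math


def chi_squared_3x2(table):
    """Pearson chi-squared test of independence for a 3x2 table.

    table: three rows of [yes_count, no_count], one row per chat mode.
    Returns (statistic, p_value). With (3-1)*(2-1) = 2 degrees of freedom,
    the chi-squared survival function is exactly exp(-x / 2).
    """
    if len(table) != 3 or any(len(row) != 2 for row in table):
        raise ValueError("expected a 3x2 table")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a nonzero total")

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of mode and rating.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat, math.exp(-stat / 2)


# HYPOTHETICAL counts for 20 questions per mode: [rated yes, rated no].
# These numbers are placeholders for illustration only.
hypothetical_counts = {
    "Precise": [12, 8],
    "Balanced": [14, 6],
    "Creative": [10, 10],
}

stat, p = chi_squared_3x2(list(hypothetical_counts.values()))
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

When expected cell counts are small (a common rule of thumb is below 5), the chi-squared approximation becomes unreliable and an exact test is generally preferred, which is consistent with the Fisher exact tests listed alongside it.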

Results: The median Brief DISCERN scores in Precise, Balanced, and Creative modes were 22, 21, and 21, respectively. There was no significant difference in Brief DISCERN scores between the three chat modes (p = 0.68). Guideline adherence was similar across conversation styles (p = 0.37), as were response appropriateness (p = 0.62), directly answering the question asked (p = 0.26), and recommending consultation with a healthcare provider (p = 0.07). Creative and Balanced modes outperformed Precise mode on response empathy. Creative mode was more likely to include superfluous information and less likely to answer the question.

Conclusion: In its current iteration, Bing Chat provides low-quality urologic healthcare information for nephrolithiasis queries, regardless of the conversation style used.

Source journal
World Journal of Urology (Medicine: Urology & Nephrology)
CiteScore: 6.80
Self-citation rate: 8.80%
Articles published: 317
Review time: 4-8 weeks
About the journal: The World Journal of Urology regularly conveys the essential results of urological research and their practical and clinical relevance to a broad audience of urologists in research and clinical practice. To guarantee a balanced program, articles are published to reflect developments in all fields of urology at an internationally advanced level. Each issue treats a main topic in review articles by invited international experts. Free papers are articles unrelated to the main topic.