Yining Hua, Winna Xia, David W. Bates, Luke Hartstein, Hyungjin Tom Kim, Michael Lingzhi Li, Benjamin W Nelson, Charles Stromeyer, Darlene King, Jina Suh, Li Zhou, John Torous
{"title":"Standardizing and Scaffolding Healthcare AI-Chatbot Evaluation","authors":"Yining Hua, Winna Xia, David W. Bates, Luke Hartstein, Hyungjin Tom Kim, Michael Lingzhi Li, Benjamin W Nelson, Charles Stromeyer, Darlene King, Jina Suh, Li Zhou, John Torous","doi":"10.1101/2024.07.21.24310774","DOIUrl":null,"url":null,"abstract":"The rapid rise of healthcare chatbots, valued at $787.1 million in 2022 and projected to grow at 23.9% annually through 2030, underscores the need for robust evaluation frameworks. Despite their potential, the absence of standardized evaluation criteria and rapid AI advancements complicate assessments. This study addresses these challenges by developing a the first comprehensive evaluation framework inspired by health app regulations and integrating insights from diverse stakeholders. Following PRISMA guidelines, we reviewed 11 existing frameworks, refining 271 questions into a structured framework encompassing three priority constructs, 18 second-level constructs, and 60 third-level constructs. Our framework emphasizes safety, privacy, trustworthiness, and usefulness, aligning with recent concerns about AI in healthcare. This adaptable framework aims to serve as the initial step in facilitating the responsible integration of chatbots into healthcare settings.","PeriodicalId":501386,"journal":{"name":"medRxiv - Health Policy","volume":"2013 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"medRxiv - Health Policy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1101/2024.07.21.24310774","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
The rapid rise of healthcare chatbots, valued at $787.1 million in 2022 and projected to grow at 23.9% annually through 2030, underscores the need for robust evaluation frameworks. Despite their potential, the absence of standardized evaluation criteria and rapid AI advancements complicate assessments. This study addresses these challenges by developing a the first comprehensive evaluation framework inspired by health app regulations and integrating insights from diverse stakeholders. Following PRISMA guidelines, we reviewed 11 existing frameworks, refining 271 questions into a structured framework encompassing three priority constructs, 18 second-level constructs, and 60 third-level constructs. Our framework emphasizes safety, privacy, trustworthiness, and usefulness, aligning with recent concerns about AI in healthcare. This adaptable framework aims to serve as the initial step in facilitating the responsible integration of chatbots into healthcare settings.