Leveraging AI to Drive Timely Improvements in Patient Experience Feedback: Algorithm Validation.

IF 3.8 · CAS Zone 3 (Medicine) · JCR Q2 (Medical Informatics)

JMIR Medical Informatics Pub Date : 2025-07-10 DOI:10.2196/60900

Mustafa Khanbhai, Catalina Carenzo, Sarindi Aryasinghe, David Manton, Erik Mayer

JMIR Medical Informatics, volume 13, article e60900. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12270031/pdf/

Citations: 0

Abstract

Background: Understanding and improving patient care is pivotal for health care providers. With increasing volumes of Friends and Family Test (FFT) data in England, manual analysis of this patient feedback poses challenges for many health care organizations. This underscores the importance of automated text analysis, particularly for predicting sentiments and themes in real time.

Objective: Using machine learning and natural language processing, this study systematically tests and refines the cross-contextual performance of a supervised algorithm in diverse health care settings, accounting for variations in population characteristics, geographical locations, and care settings, with the ultimate aim of driving improvements based on patient feedback.

Methods: The text analytics algorithm, initially developed in a large acute trust in London, was further tested in 9 health care organizations with diverse care settings across England. These trusts varied in technical capacity and resources, population demographics, and FFT free-text datasets. Testing and validation of the algorithm were performed, including manual coding of a subset of retrospective comments. Technical infrastructure, including coding environments and packages for algorithm testing and deployment, was optimized. The algorithm was iteratively trained using a bag-of-words representation of anonymized data, tailored to accommodate contextual variations, and tested for changes in performance while the issues identified were rectified.
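The supervised bag-of-words training described in the Methods can be illustrated with a minimal sketch. The abstract does not name the classifier or the packages used, so the multinomial Naive Bayes model, the `BagOfWordsNB` class, the `tokenize` helper, and the example comments below are all assumptions made for illustration and are not the authors' implementation or data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a comment and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class BagOfWordsNB:
    """Multinomial Naive Bayes over word counts (an assumed classifier choice)."""

    def fit(self, comments, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of comments
        self.vocab = set()
        for text, label in zip(comments, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            # Laplace (add-one) smoothing so unseen words do not zero the score.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical manually coded comments (invented for illustration, not study data).
train_comments = [
    "The nurses were kind and caring",
    "Staff were friendly and helpful",
    "Waited hours and nobody told us anything",
    "Rude receptionist and a long wait",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = BagOfWordsNB().fit(train_comments, train_labels)
prediction = model.predict("Friendly nurses, very helpful")
print(prediction)
```

Under this sketch, adapting the model to a new care setting would amount to refitting with manually coded comments from that setting, so that its vocabulary enters the word counts.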

Results: The algorithm demonstrated satisfactory overall accuracy (>75%) in predicting themes and sentiments embedded within free-text responses across a variety of care settings and population demographics. While the algorithm yielded strong and reusable models in relatively stable environments, such as adult inpatient care settings, the initial accuracy was notably lower in organizations providing services such as pediatrics and mental health. However, the accuracy of our algorithm significantly improved when individual trust coding templates were applied. Thematic saturation was reached after the fifth organization was recruited, and no further coding was required for the last 4 organizations. Subsequently, a framework and pipeline for deployment of the algorithm were developed to provide a standardized approach for implementation and analysis of FFT free text, ensuring ease of use.
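The per-setting comparison reported in the Results can be sketched as a check of predicted codes against manual codes. The abstract gives only the >75% overall figure, so the record layout, the setting names, the function names, and the use of 75% as a per-setting cut-off are assumptions for illustration; the records are invented and are not study data.

```python
from collections import defaultdict

ACCURACY_THRESHOLD = 0.75  # the abstract describes accuracy above 75% as satisfactory


def accuracy_by_setting(records):
    """Return {setting: share of comments whose predicted code matches the manual code}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for setting, manual_code, predicted_code in records:
        totals[setting] += 1
        if manual_code == predicted_code:
            hits[setting] += 1
    return {setting: hits[setting] / totals[setting] for setting in totals}


def settings_needing_recoding(accuracies, threshold=ACCURACY_THRESHOLD):
    """List settings whose accuracy does not exceed the threshold."""
    return sorted(s for s, acc in accuracies.items() if acc <= threshold)


# Hypothetical (setting, manual code, predicted code) triples.
records = [
    ("adult_inpatient", "positive", "positive"),
    ("adult_inpatient", "negative", "negative"),
    ("adult_inpatient", "positive", "positive"),
    ("adult_inpatient", "negative", "positive"),
    ("adult_inpatient", "positive", "positive"),
    ("paediatrics", "positive", "negative"),
    ("paediatrics", "negative", "negative"),
    ("paediatrics", "positive", "negative"),
    ("paediatrics", "positive", "positive"),
]

accuracies = accuracy_by_setting(records)
flagged = settings_needing_recoding(accuracies)
print(accuracies, flagged)
```

In this sketch, a flagged setting is one where further manual coding with a trust-specific template would be the next step before retesting.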

Conclusions: This study represents a significant step forward in leveraging free-text FFT data for valuable insights in diverse health care settings through the testing and development of a robust supervised learning text analytics algorithm. The disparity in some care settings was anticipated, given that the lexicon and phraseology used were inherently different from those prevalent in adult inpatient care (where the algorithm was developed). However, these challenges were addressed with further coding and testing. This approach enhanced the accuracy and reliability of the algorithm and encouraged inter- and intraorganizational collaboration and shared learning.


Source journal

JMIR Medical Informatics (Medicine: Health Informatics)

CiteScore: 7.90

Self-citation rate: 3.10%

Articles published: 173

Review time: 12 weeks

About the journal: JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation. It emphasizes applied, translational research and has a broad readership that includes clinicians, CIOs, engineers, industry, and health informatics professionals. It is published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175). JMIR Med Inform has a slightly different scope (placing more emphasis on applications for clinicians and health professionals than on consumers/citizens, which is the focus of JMIR), publishes even faster, and also accepts papers that are more technical or more formative than those published in the Journal of Medical Internet Research.