Identifying Patient-Reported Care Experiences in Free-Text Survey Comments: Topic Modeling Study.


JMIR Medical Informatics | Published: 2025-02-24 | DOI: 10.2196/63466

Brian Steele, Paul Fairie, Kyle Kemp, Adam G D'Souza, Matthias Wilms, Maria Jose Santana

Citation: JMIR Medical Informatics. 2025;13:e63466. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11875393/pdf/


Abstract

Background: Patient-reported experience surveys allow administrators, clinicians, and researchers to quantify and improve health care by receiving feedback directly from patients. Existing research has focused primarily on quantitative analysis of survey items, but these measures may also collect optional free-text comments. These comments can provide insights for health systems but may not be analyzed due to limited resources and the complexity of traditional textual analysis. However, advances in machine learning-based natural language processing provide opportunities to learn from this traditionally underused data source.

Objective: This study aimed to apply natural language processing to model topics found in free-text comments of patient-reported experience surveys.

Methods: Consumer Assessment of Healthcare Providers and Systems-derived patient experience surveys were collected and linked to administrative inpatient records by the provincial health services organization responsible for inpatient care. Unsupervised topic modeling with automated labeling was performed with BERTopic. Sentiment analysis was performed to further assist in topic description.
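The topic modeling step can be illustrated with a minimal sketch. It assumes the third-party `bertopic` package with near-default settings; the helper names `clean_comments` and `fit_topics`, the `min_topic_size` value, and the cleaning rule are hypothetical and are not taken from the study, whose exact configuration is not given in the abstract. The sentiment analysis step is not covered here because the abstract does not name the tool used.

```python
from typing import Iterable, List


def clean_comments(raw_comments: Iterable[object]) -> List[str]:
    """Keep non-empty text comments, with surrounding whitespace removed.

    Hypothetical preprocessing: the survey comment field is optional,
    so blank or missing entries are dropped before modeling.
    """
    cleaned = []
    for comment in raw_comments:
        if isinstance(comment, str) and comment.strip():
            cleaned.append(comment.strip())
    return cleaned


def fit_topics(comments: List[str], min_topic_size: int = 10):
    """Fit a BERTopic model with mostly default settings (an assumption).

    Requires the third-party `bertopic` package and a realistically sized
    corpus (thousands of comments). The import is deferred so that
    clean_comments() can be used without the package installed.
    """
    from bertopic import BERTopic

    topic_model = BERTopic(language="english", min_topic_size=min_topic_size)
    topics, _probabilities = topic_model.fit_transform(comments)
    # get_topic_info() returns one row per topic with its size and a
    # keyword-based name, which can serve as an automated label.
    return topic_model, topics, topic_model.get_topic_info()
```

In use, one would pass the cleaned adult and pediatric comments to `fit_topics` separately, since the abstract reports separate topic counts for the two survey populations.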

Results: Between April 2016 and February 2020, 43.4% (43,522/100,272) of adult patients and 46.9% (3501/7464) of pediatric caregivers included free-text responses on completed patient experience surveys. Topic models identified 86 topics among adult survey responses and 35 topics among pediatric responses that included elements of care not currently surveyed by existing questionnaires. Frequent topics were generally positive.
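As a quick check, the reported response rates can be recomputed from the counts given in the Results. The function name `response_rate` is illustrative only.

```python
def response_rate(with_comment: int, completed: int) -> float:
    """Percentage of completed surveys that included a free-text comment."""
    return round(100 * with_comment / completed, 1)


# Counts as reported in the Results paragraph.
adult_rate = response_rate(43_522, 100_272)
pediatric_rate = response_rate(3_501, 7_464)

print(f"Adult: {adult_rate}%  Pediatric: {pediatric_rate}%")
# prints "Adult: 43.4%  Pediatric: 46.9%"
```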

Conclusions: We found that with limited tuning, BERTopic identified care experience topics with interpretable automated labeling. Results are discussed in the context of person-centered care, patient safety, and health care quality improvement. Furthermore, we note the opportunity to track temporal and site-specific trends as a way to identify patient care and safety concerns. As the use of patient experience measurement increases in health care, we discuss how machine learning can be leveraged to provide additional insight into patient experiences.

