Brian Steele, Paul Fairie, Kyle Kemp, Adam G D'Souza, Matthias Wilms, Maria Jose Santana
Identifying Patient-Reported Care Experiences in Free-Text Survey Comments: Topic Modeling Study
JMIR Medical Informatics, vol. 13, e63466. Journal Article. Published 2025-02-24.
DOI: 10.2196/63466 (https://doi.org/10.2196/63466)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11875393/pdf/
Abstract
Background: Patient-reported experience surveys allow administrators, clinicians, and researchers to quantify and improve health care by receiving feedback directly from patients. Existing research has focused primarily on quantitative analysis of survey items, but these measures may collect optional free-text comments. These comments can provide insights for health systems but may not be analyzed due to limited resources and the complexity of traditional textual analysis. However, advances in machine learning-based natural language processing provide opportunities to learn from this traditionally underused data source.
Objective: This study aimed to apply natural language processing to model topics found in free-text comments of patient-reported experience surveys.
Methods: Consumer Assessment of Healthcare Providers and Systems-derived patient experience surveys were collected and linked to administrative inpatient records by the provincial health services organization responsible for inpatient care. Unsupervised topic modeling with automated labeling was performed with BERTopic. Sentiment analysis was performed to further assist in topic description.
Results: Between April 2016 and February 2020, 43.4% (43,522/100,272) of adult patients and 46.9% (3501/7464) of pediatric caregivers included free-text responses on completed patient experience surveys. Topic models identified 86 topics among adult survey responses and 35 topics among pediatric responses that included elements of care not currently surveyed by existing questionnaires. Frequent topics were generally positive.
Conclusions: We found that with limited tuning, BERTopic identified care experience topics with interpretable automated labeling. Results are discussed in the context of person-centered care, patient safety, and health care quality improvement. Furthermore, we note the opportunity for the identification of temporal and site-specific trends as a method to identify patient care and safety concerns. As the use of patient experience measurement increases in health care, we discuss how machine learning can be leveraged to provide additional insight on patient experiences.
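The free-text response rates reported in the Results can be reproduced from the stated counts. The following is a minimal sketch of that arithmetic only; the function name `response_rate` is a hypothetical helper introduced here for illustration and is not taken from the study.

```python
def response_rate(with_comment: int, completed: int) -> float:
    """Percentage of completed surveys that included a free-text comment.

    Hypothetical helper for illustration; rounds to one decimal place,
    matching the precision used in the abstract.
    """
    if completed <= 0:
        raise ValueError("completed must be a positive count")
    return round(100 * with_comment / completed, 1)


# Counts as reported in the Results section of the abstract.
adult_rate = response_rate(43_522, 100_272)
pediatric_rate = response_rate(3_501, 7_464)

print(f"Adult patients: {adult_rate}%")
print(f"Pediatric caregivers: {pediatric_rate}%")
```

Both values agree with the percentages given in the abstract (43.4% and 46.9%).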
About the journal:
JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation. It emphasizes applied, translational research and has a broad readership that includes clinicians, CIOs, engineers, industry, and health informatics professionals.
Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175), JMIR Med Inform has a slightly different scope: it emphasizes applications for clinicians and health professionals rather than consumers/citizens (the focus of JMIR), publishes even faster, and also accepts papers that are more technical or more formative than those published in the Journal of Medical Internet Research.