Analyzing Patient Complaints in Web-Based Reviews of Private Hospitals in Selangor, Malaysia, Using Large Language Model-Assisted Content Analysis: Mixed Methods Study.

Impact factor: 2.0; JCR quartile: Q3 (Health Care Sciences & Services)

JMIR Formative Research, vol. 9, e69075; Pub Date: 2025-06-27; DOI: 10.2196/69075

Muhammad Hafiz Sulaiman, Nora Muda, Fatimah Abdul Razak


Citations: 0

Abstract

Background: Large language model (LLM)-assisted content analysis (LACA) is a modification of traditional content analysis that uses an LLM to codevelop codebooks and to automatically assign thematic codes to a dataset of web-based reviews.

Objective: This study aims to develop and validate the use of LACA for analyzing hospital web-based reviews and to identify themes of issues from web-based reviews using this method.

Methods: Web-based reviews for 53 private hospitals in Selangor, Malaysia, were acquired. Fake reviews were filtered out using natural language processing and machine learning algorithms trained on validated yelp.com datasets. The GPT-4o mini model application programming interface (API) was then used to filter out reviews that did not mention any quality issues. Of the remaining reviews, 200 were randomly sampled and fed into the GPT-4o mini model API to produce a codebook, which was validated through parallel human-LLM coding to establish interrater reliability. The codebook was then used to code (label) all reviews in the dataset. Finally, the thematic codes were summarized into themes using factor analysis to increase interpretability.
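The sampling and coding steps described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `llm_label` callable stands in for a GPT-4o mini API call, and the codebook entries, function names, and sample reviews are all hypothetical. The sketch shows how per-review code assignments might be turned into the binary review-by-code matrix that a factor analysis would take as input.

```python
import random
from typing import Callable, Dict, List

# Hypothetical codebook: code name -> definition.
# (The study reports 41 codes; these three are invented for illustration.)
CODEBOOK: Dict[str, str] = {
    "long_waiting_time": "Patient waited too long to be seen.",
    "rude_staff": "Staff were impolite or dismissive.",
    "billing_dispute": "Charges were unclear or higher than expected.",
}

Labeler = Callable[[str, Dict[str, str]], List[str]]


def sample_for_codebook(reviews: List[str], n: int, seed: int = 0) -> List[str]:
    """Randomly draw up to n reviews to use for codebook development."""
    rng = random.Random(seed)
    return rng.sample(reviews, min(n, len(reviews)))


def code_reviews(
    reviews: List[str], llm_label: Labeler, codebook: Dict[str, str]
) -> List[List[int]]:
    """Build a binary matrix: one row per review, one column per code."""
    codes = list(codebook)
    matrix = []
    for text in reviews:
        assigned = set(llm_label(text, codebook))
        matrix.append([1 if code in assigned else 0 for code in codes])
    return matrix


def keyword_stub(text: str, codebook: Dict[str, str]) -> List[str]:
    """Stand-in for the LLM: a trivial keyword matcher, for demonstration only."""
    keywords = {
        "long_waiting_time": "wait",
        "rude_staff": "rude",
        "billing_dispute": "bill",
    }
    lowered = text.lower()
    return [c for c, kw in keywords.items() if c in codebook and kw in lowered]


reviews = [
    "Waited three hours and the nurse was rude.",
    "The bill was far higher than quoted.",
    "Parking was fine.",
]
matrix = code_reviews(reviews, keyword_stub, CODEBOOK)
for row in matrix:
    print(row)
```

Passing the labeler in as a callable keeps the sketch runnable offline; in practice the stub would be replaced by a function that prompts the model with the codebook definitions and parses the codes it returns.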

Results: A total of 14,938 web-based reviews were acquired, of which 1121 (9.3%) were fake, 1279 (12%) contained negative sentiments, and 9635 (88%) did not contain any negative sentiment. The GPT-4o mini model subsequently induced 41 thematic codes together with their definitions. Average human-GPT interrater reliability was almost perfect (κ=0.81). Factor analysis identified 6 interpretable latent factors: "Service and Communication Effectiveness," "Clinical Care and Patient Experience," "Facilities and Amenities Quality," "Appointment and Patient Flow," "Financial and Insurance Management," and "Patient Rights and Accessibility." The cumulative explained variance for the 6 factors was 0.74, and Cronbach α ranged from 0.88 to 0.97 (good to excellent) for all factors except factor 6 (0.61: questionable). The factors identified follow a global pattern of issues reported in the literature.
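For readers unfamiliar with the two reliability statistics reported in the Results, the following is a minimal sketch of how Cohen κ and Cronbach α are conventionally computed. It uses only the Python standard library; the function names and toy data are hypothetical and are not taken from the study, which does not state which software computed these values.

```python
from statistics import pvariance
from typing import List, Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (list(rater_a).count(lab) / n) * (list(rater_b).count(lab) / n)
        for lab in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


def cronbach_alpha(rows: List[List[float]]) -> float:
    """Cronbach's alpha; rows are observations, columns are items."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Population variance is used for both terms; the ratio is the same
    # as with sample variance, provided one is used consistently.
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total score has zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy example: human vs LLM decisions on whether one code applies to 10 reviews.
human = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
llm = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
kappa = cohen_kappa(human, llm)

# Toy example: two items that always move together.
alpha = cronbach_alpha([[1, 1], [0, 0], [1, 1], [0, 0]])
print(round(kappa, 3), round(alpha, 3))
```

In the toy data the raters agree on 9 of 10 items (observed agreement 0.90) with chance agreement 0.54, so κ = 0.36/0.46; the two perfectly matching items give α = 1.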

Conclusions: A data collection and processing pipeline consisting of Python Selenium, the GPT-4o mini model API, and a factor analysis module can support valid and reliable thematic analysis. Despite the potential for collection and information bias in web-based reviews, LACA of web-based reviews is cost-effective and time-efficient and can be performed in real time, helping hospital managers promptly develop hypotheses for further investigation.
