Reviews Analysis of Korean Clinics Using LDA Topic Modeling

Cho-Myong Kim, Agneta Jo, Yang-Kyun Kim
Journal of Korean Medicine for Obesity Research, volume 10 1, published 2022-03-01. DOI: 10.13048/jkm.22007 (https://doi.org/10.13048/jkm.22007)

Abstract

Objectives: The influence of online reviews is growing in the health care industry. Because medical services are delivered mainly by providers, those services have been managed by hospitals and clinics. However, providers are legally forbidden from promoting medical services directly. For this reason, consumers such as patients and clients search many online reviews for information about hospitals, treatments, prices, and so on. Online reviews can therefore be taken as an indicator of hospital quality, and they should be analyzed to support sustainable hospital marketing.

Method: Using a Python-based crawler, we collected more than 14,000 reviews written by real patients who had experienced Korean medicine. To extract the most representative words, the reviews were divided into positive and negative groups and then pre-processed to keep only nouns and adjectives, from which TF (Term Frequency), DF (Document Frequency), and TF-IDF (Term Frequency-Inverse Document Frequency) were computed. Finally, to identify topics in the reviews, the extracted words were analyzed using LDA (Latent Dirichlet Allocation). To avoid overlap between topics, the number of topics was set using Davis visualization.

Results and Conclusions: The LDA topic model extracted six and three topics from the positive and negative reviews, respectively. The main factors making up the topics were 1) response to patients and customers, 2) customized treatment (consultation) and management, and 3) the hospital or clinic environment.
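
The word-statistics and topic-extraction steps described in the Method can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English tokens, the function names `term_stats` and `lda_gibbs`, the corpus-level TF-IDF variant (tf × log(N/df)), the hyperparameters, and the use of a small collapsed Gibbs sampler are all assumptions made for illustration. The study worked on Korean reviews reduced to nouns and adjectives, and chose the number of topics by visualization rather than passing it in directly.

```python
import math
import random
from collections import Counter

# Toy, already-tokenized reviews (hypothetical data; the study used Korean
# reviews reduced to nouns and adjectives during pre-processing).
docs = [
    ["kind", "staff", "kind", "doctor"],
    ["clean", "clinic", "kind", "staff"],
    ["long", "wait", "expensive", "treatment"],
    ["expensive", "treatment", "long", "wait"],
]

def term_stats(docs):
    """Return corpus-wide TF, DF, and TF-IDF for every term."""
    tf = Counter(w for d in docs for w in d)        # total occurrences
    df = Counter(w for d in docs for w in set(d))   # documents containing term
    n = len(docs)
    # One common TF-IDF variant (an assumption; other smoothings exist).
    tfidf = {w: tf[w] * math.log(n / df[w]) for w in tf}
    return tf, df, tfidf

def lda_gibbs(docs, k, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Minimal collapsed Gibbs sampler for LDA; returns topic-word counts."""
    rng = random.Random(seed)
    v = len({w for d in docs for w in d})           # vocabulary size
    topic_word = [Counter() for _ in range(k)]      # n(topic, word)
    topic_total = [0] * k                           # n(topic)
    doc_topic = [[0] * k for _ in docs]             # n(doc, topic)
    assign = []
    for i, d in enumerate(docs):                    # random initial topics
        zs = []
        for w in d:
            z = rng.randrange(k)
            zs.append(z)
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[i][z] += 1
        assign.append(zs)
    for _ in range(iters):
        for i, d in enumerate(docs):
            for j, w in enumerate(d):
                z = assign[i][j]                    # remove current assignment
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[i][z] -= 1
                # Conditional p(z = t | all other assignments), unnormalized.
                weights = [
                    (doc_topic[i][t] + alpha)
                    * (topic_word[t][w] + beta) / (topic_total[t] + v * beta)
                    for t in range(k)
                ]
                z = rng.choices(range(k), weights=weights)[0]
                assign[i][j] = z                    # record the new assignment
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[i][z] += 1
    return topic_word

tf, df, tfidf = term_stats(docs)
topics = lda_gibbs(docs, k=2)
for t, counts in enumerate(topics):
    # Show up to three of the most frequent words assigned to each topic.
    print(t, [w for w, c in counts.most_common(3) if c > 0])
```

In practice a library implementation of LDA would normally replace the hand-written sampler, and the number of topics `k` would be varied and inspected for overlap, as the abstract describes, rather than fixed in advance.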