An improved approach based on dynamic mixed sampling and transfer learning for topic recognition: a case study on online patient reviews

Yaotan Xie, Fei Xiang
Online Inf. Rev. Journal Article. Published 2022-02-01. DOI: 10.1108/oir-01-2021-0059 (https://doi.org/10.1108/oir-01-2021-0059)
Citations: 2

Abstract

Purpose: This study aimed to adapt existing text-mining techniques and propose a novel topic recognition approach for textual patient reviews.

Design/methodology/approach: The authors first transformed multilabel samples into a form suitable for model training. They then proposed an improved method based on dynamic mixed sampling and transfer learning to mitigate the learning problem caused by imbalanced samples. Specifically, the model was trained within a convolutional neural network (CNN) framework, using Word2Vector embeddings self-trained on large-scale corpora.

Findings: Compared with the SVM and other CNN-based models, the CNN + DMS + TL model proposed in this study achieved a significant improvement in F1 score.

Originality/value: The improved methods based on dynamic mixed sampling and transfer learning can adequately manage the learning problem caused by the skewed distribution of samples and achieve effective, automatic topic recognition of textual patient reviews.

Peer review: The peer-review history for this article is available at: https://publons.com/publon/10.1108/OIR-01-2021-0059.
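The abstract does not spell out how the multilabel transformation or the mixed sampling is performed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each multilabel review is split into one single-label sample per topic, and that "mixed sampling" means undersampling large classes and oversampling small ones toward a common size. The function names `split_multilabel` and `mixed_resample`, the `target` parameter, and the example reviews and topic labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def split_multilabel(samples):
    """Turn each (text, [labels]) pair into one (text, label) pair per label."""
    return [(text, label) for text, labels in samples for label in labels]


def mixed_resample(pairs, target, seed=0):
    """Resample every class to exactly `target` examples.

    Classes with at least `target` examples are undersampled without
    replacement; smaller classes keep all their examples and are topped up
    by drawing extra ones with replacement (assumed scheme, not from the paper).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in pairs:
        by_label[label].append(text)

    balanced = []
    for label, texts in by_label.items():
        if len(texts) >= target:
            chosen = rng.sample(texts, target)
        else:
            chosen = texts + rng.choices(texts, k=target - len(texts))
        balanced.extend((text, label) for text in chosen)

    rng.shuffle(balanced)
    return balanced


# Hypothetical patient reviews with hypothetical topic labels.
reviews = [
    ("The doctor explained everything clearly.", ["attitude", "skill"]),
    ("Waited two hours to be seen.", ["waiting_time"]),
    ("Very patient and kind.", ["attitude"]),
    ("Friendly staff at the front desk.", ["attitude"]),
]

pairs = split_multilabel(reviews)
balanced = mixed_resample(pairs, target=2)
print(Counter(label for _, label in balanced))
```

One way to make such a scheme "dynamic" would be to call `mixed_resample` with a different `seed` (or a different `target`) at each training epoch, so the classifier sees a fresh balanced draw each time; whether the paper does this is not stated in the abstract.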