An improved approach based on dynamic mixed sampling and transfer learning for topic recognition: a case study on online patient reviews

Online Inf. Rev. | Pub Date: 2022-02-01 | DOI: 10.1108/oir-01-2021-0059

Yaotan Xie, Fei Xiang

Citations: 2

Abstract

Purpose: This study aimed to adapt existing text-mining techniques and propose a novel topic recognition approach for textual patient reviews.

Design/methodology/approach: The authors first transformed multilabel samples into a form suitable for model training. Then, an improved method based on dynamic mixed sampling and transfer learning was proposed to mitigate the learning problem caused by imbalanced samples. Specifically, the model was trained using a convolutional neural network framework and Word2Vector embeddings self-trained on large-scale corpora.

Findings: Compared with the SVM and other CNN-based models, the CNN + DMS + TL model proposed in this study achieved a significant improvement in F1 score.

Originality/value: The improved methods based on dynamic mixed sampling and transfer learning can adequately manage the learning problem caused by the skewed distribution of samples and achieve effective, automatic topic recognition of textual patient reviews.

Peer review: The peer-review history for this article is available at: https://publons.com/publon/10.1108/OIR-01-2021-0059.
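The abstract does not spell out how dynamic mixed sampling works, so the following is only a minimal sketch of the general idea of mixed sampling for imbalanced classes: classes above a target size are randomly undersampled and classes below it are randomly oversampled. The function name `mixed_resample`, the `target` parameter, and the choice of the mean class size as the default target are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict


def mixed_resample(samples, labels, target=None, seed=0):
    """Return a class-balanced list of (sample, label) pairs.

    Hypothetical helper, not the authors' method: large classes are
    undersampled and small classes are oversampled to `target` items each.
    """
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    if target is None:
        # Assumption: balance every class to the mean class size.
        total = sum(len(items) for items in by_class.values())
        target = max(1, round(total / len(by_class)))

    balanced = []
    for y, items in by_class.items():
        if len(items) >= target:
            # Undersample: draw `target` items without replacement.
            chosen = rng.sample(items, target)
        else:
            # Oversample: keep every original, then add random repeats.
            chosen = items + rng.choices(items, k=target - len(items))
        balanced.extend((x, y) for x in chosen)

    rng.shuffle(balanced)
    return balanced
```

One possible reading of "dynamic" is that the resampled set is redrawn during training, for example by calling `mixed_resample(samples, labels, seed=epoch)` at the start of each epoch so that the model sees a different balanced subset each time. That reading is an assumption; the abstract does not confirm it.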



