{"title":"An improved approach based on dynamic mixed sampling and transfer learning for topic recognition: a case study on online patient reviews","authors":"Yaotan Xie, Fei Xiang","doi":"10.1108/oir-01-2021-0059","DOIUrl":null,"url":null,"abstract":"PurposeThis study aimed to adapt existing text-mining techniques and propose a novel topic recognition approach for textual patient reviews.Design/methodology/approachThe authors first transformed multilabel samples for adapting model training forms. Then, an improved method was proposed based on dynamic mixed sampling and transfer learning to improve the learning problem caused by imbalanced samples. Specifically, the training of our model was based on the framework of a convolutional neural network and self-trained Word2Vector on large-scale corpora.FindingsCompared with the SVM and other CNN-based models, the CNN+ DMS + TL model proposed in this study has made significant improvement in F1 score.Originality/valueThe improved methods based on dynamic mixed sampling and transfer learning can adequately manage the learning problem caused by the skewed distribution of samples and achieve the effective and automatic topic recognition of textual patient reviews.Peer reviewThe peer-review history for this article is available at: https://publons.com/publon/10.1108/OIR-01-2021-0059.","PeriodicalId":143302,"journal":{"name":"Online Inf. Rev.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Online Inf. Rev.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1108/oir-01-2021-0059","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
PurposeThis study aimed to adapt existing text-mining techniques and propose a novel topic recognition approach for textual patient reviews.Design/methodology/approachThe authors first transformed multilabel samples for adapting model training forms. Then, an improved method was proposed based on dynamic mixed sampling and transfer learning to improve the learning problem caused by imbalanced samples. Specifically, the training of our model was based on the framework of a convolutional neural network and self-trained Word2Vector on large-scale corpora.FindingsCompared with the SVM and other CNN-based models, the CNN+ DMS + TL model proposed in this study has made significant improvement in F1 score.Originality/valueThe improved methods based on dynamic mixed sampling and transfer learning can adequately manage the learning problem caused by the skewed distribution of samples and achieve the effective and automatic topic recognition of textual patient reviews.Peer reviewThe peer-review history for this article is available at: https://publons.com/publon/10.1108/OIR-01-2021-0059.