An improved approach based on dynamic mixed sampling and transfer learning for topic recognition: a case study on online patient reviews
Yaotan Xie, Fei Xiang
Online Inf. Rev., published 2022-02-01. DOI: https://doi.org/10.1108/oir-01-2021-0059
Purpose: This study aimed to adapt existing text-mining techniques and propose a novel topic recognition approach for textual patient reviews.

Design/methodology/approach: The authors first transformed multilabel samples into a form suitable for model training. They then proposed an improved method based on dynamic mixed sampling (DMS) and transfer learning (TL) to mitigate the learning problem caused by imbalanced samples. Specifically, the model was trained within a convolutional neural network (CNN) framework, using self-trained Word2Vec embeddings learned on large-scale corpora.

Findings: Compared with the SVM and other CNN-based models, the CNN + DMS + TL model proposed in this study achieved a significant improvement in F1 score.

Originality/value: The improved methods based on dynamic mixed sampling and transfer learning can adequately manage the learning problem caused by the skewed distribution of samples and achieve effective, automatic topic recognition of textual patient reviews.

Peer review: The peer-review history for this article is available at: https://publons.com/publon/10.1108/OIR-01-2021-0059.
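The abstract does not spell out how the multilabel transformation or the dynamic mixed sampling works, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each multilabel review is expanded into one single-label pair per topic, and that "mixed sampling" means undersampling frequent topics while oversampling rare ones, with the sample redrawn every epoch to make it "dynamic". The function names, the `target` parameter, and the toy reviews are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def expand_multilabel(samples):
    """Turn each (text, [labels]) sample into one (text, label) pair per label."""
    return [(text, label) for text, labels in samples for label in labels]


def mixed_resample(pairs, target, rng):
    """Return a shuffled list with exactly `target` pairs for every label.

    Labels with at least `target` pairs are undersampled without replacement;
    labels with fewer are kept whole and topped up by sampling with replacement.
    """
    by_label = defaultdict(list)
    for text, label in pairs:
        by_label[label].append((text, label))

    balanced = []
    for items in by_label.values():
        if len(items) >= target:
            balanced.extend(rng.sample(items, target))
        else:
            extra = rng.choices(items, k=target - len(items))
            balanced.extend(items + extra)
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: "attitude" is the majority topic.
reviews = [
    ("The doctor explained everything clearly", ["attitude"]),
    ("Waited two hours, but the treatment worked", ["waiting_time", "efficacy"]),
    ("Very patient and kind", ["attitude"]),
    ("Friendly nurse", ["attitude"]),
    ("Listened carefully", ["attitude"]),
]

pairs = expand_multilabel(reviews)
rng = random.Random(0)

# Redraw the balanced training set at the start of every epoch (assumed
# meaning of "dynamic"); a real pipeline would feed each draw to the model.
epoch_counts = []
for epoch in range(2):
    epoch_pairs = mixed_resample(pairs, target=3, rng=rng)
    epoch_counts.append(Counter(label for _, label in epoch_pairs))
```

With this toy data, every epoch sees each of the three topics exactly three times, even though the raw data contains four "attitude" pairs and only one pair each for the other two topics. Redrawing per epoch means the majority class contributes different examples over time instead of losing the same ones permanently.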