Improving Multi-model Hybrid Chinese Long-text Classification through BERT Optimisation
Yu Wang, He Huang, Yunni Xia
2022 IEEE International Conference on Networking, Sensing and Control (ICNSC), published 2022-12-15
DOI: 10.1109/ICNSC55942.2022.10004130
Citations: 0
Abstract
Text classification is an almost unavoidable step in natural language processing and has a wide range of industrial applications. Although many existing methods achieve strong classification results, further improving classification performance remains both a significant challenge and a worthwhile direction for continued technical study. Building on the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model and on deep-learning research, we propose a multi-model hybrid Chinese classification model (MCCM) based on BERT (MCCM-BERT) for Chinese text-classification tasks. Experimental results show that the proposed MCCM-BERT model outperforms BERT on text classification, especially on Chinese long texts, with an accuracy improvement of up to 2.28%.