Chinese-Korean Weibo Sentiment Classification Based on Pre-trained Language Model and Transfer Learning
Authors: Hengxuan Wang, Zhenguo Zhang, Xu Cui, Rong-yi Cui
Venue: 2022 IEEE 2nd International Conference on Computer Communication and Artificial Intelligence (CCAI)
Published: 2022-05-06
DOI: https://doi.org/10.1109/CCAI55564.2022.9807755

Korean is the native and official language spoken by Chinese-Korean people, and Weibo is a social media platform with a huge number of users in China. Currently, there are few studies on sentiment analysis of Korean-language Weibo texts posted by Chinese-Korean users. In this paper, we propose a sentiment classification method for Chinese-Korean Weibo based on a pre-trained language model and transfer learning. First, we crawl Chinese-Korean Weibo data from Sina Weibo and label it with sentiment to build the Chinese-Korean Weibo sentiment analysis (CKWSA) dataset. Second, to address the small number of training samples in CKWSA, we fine-tune a classifier built on a pre-trained Korean language model on a Korean Twitter sentiment analysis dataset to obtain a Korean Twitter sentiment classification model, and then further fine-tune that model on the CKWSA dataset to obtain the Chinese-Korean Weibo sentiment classification model. Experiments show that the proposed method, which combines a pre-trained language model with transfer learning, performs well and improves on the other baselines on the CKWSA dataset.
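
The two-stage order described in the abstract (train on a larger source dataset first, then continue training on the small target dataset) can be sketched in a few lines. This is only a toy illustration under loud assumptions: a bag-of-words logistic classifier stands in for the pre-trained Korean language model, which the abstract does not name; the tokens, the tiny datasets, and the helper names `predict_proba`, `fine_tune`, and `accuracy` are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Tuple

# (tokens, label) with label 1 = positive sentiment, 0 = negative sentiment
Example = Tuple[List[str], int]


def predict_proba(weights: Dict[str, float], tokens: List[str]) -> float:
    """Probability of the positive class under a bag-of-words logistic model."""
    score = sum(weights.get(tok, 0.0) for tok in tokens)
    return 1.0 / (1.0 + math.exp(-score))


def fine_tune(weights: Dict[str, float], data: List[Example],
              epochs: int = 50, lr: float = 0.5) -> Dict[str, float]:
    """Continue training from `weights` with plain SGD and return a new dict."""
    weights = dict(weights)  # copy so the starting model is left untouched
    for _ in range(epochs):
        for tokens, label in data:
            error = label - predict_proba(weights, tokens)
            for tok in tokens:
                weights[tok] = weights.get(tok, 0.0) + lr * error
    return weights


def accuracy(weights: Dict[str, float], data: List[Example]) -> float:
    """Fraction of examples whose thresholded prediction matches the label."""
    correct = sum((predict_proba(weights, toks) >= 0.5) == bool(label)
                  for toks, label in data)
    return correct / len(data)


# Hypothetical stand-in for the larger source dataset (Korean Twitter).
source_data: List[Example] = [
    (["good", "love"], 1),
    (["love", "happy"], 1),
    (["bad", "hate"], 0),
    (["hate", "sad"], 0),
]

# Hypothetical stand-in for the small target dataset (CKWSA).
target_data: List[Example] = [
    (["good", "slang_pos"], 1),
    (["bad", "slang_neg"], 0),
]

# Stage 1: train on the source dataset, starting from empty weights.
source_model = fine_tune({}, source_data)

# Stage 2: continue training the stage-1 model on the small target dataset.
transferred_model = fine_tune(source_model, target_data)

# Baseline for comparison: train on the target dataset only.
target_only_model = fine_tune({}, target_data)

# Held-out examples that use tokens seen only in the source dataset.
held_out: List[Example] = [(["happy"], 1), (["sad"], 0)]

print("transferred:", accuracy(transferred_model, held_out))
print("target only:", accuracy(target_only_model, held_out))
```

In this toy setup the second stage only updates weights for tokens that occur in the target data, so whatever was learned from the source data about other tokens carries over unchanged. A real pipeline would instead update a shared neural encoder in both stages, which this sketch does not attempt to model.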