Research on Multi-granularity Ensemble Learning Based on Korean
Authors: Jingxuan Jin, Yahui Zhao, Rong-yi Cui
Venue: The 2nd International Conference on Computing and Data Science
Published: 2021-01-28
DOI: https://doi.org/10.1145/3448734.3450777
Ensemble learning trains and combines multiple classifiers, using their predictions as new features to train a meta-classifier, which improves the accuracy of the model. This paper proposes a multi-granularity model based on Stacking ensemble learning for Korean text classification. First, eojeol and subeojeol granularities are proposed according to the composition of the Korean language. Because different feature granularities carry different semantic information, six granularities (phoneme, syllable, subword, word, subeojeol, and eojeol) are compared on a Korean text classification task. Second, suffix words are constructed based on Korean grammatical morphology, and the effects of the different granularities are compared after suffix preprocessing. Finally, a multi-granularity ensemble learning model for Korean, called MGEL-K, is proposed. It uses the different granularities to enrich the diversity of the ensemble, so that the individual learners differ from one another. The results show that the proposed MGEL-K model performs best on the Korean text classification task, with an accuracy of 92.33%.
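
To make the notion of feature granularity concrete, the sketch below splits one Korean string at three of the six levels named above: eojeol, syllable, and phoneme. It is only an illustration under stated assumptions, not the authors' preprocessing. The function names and the sample string are hypothetical, an eojeol is taken to be a whitespace-delimited unit, and phonemes are obtained from the standard Unicode arithmetic for Hangul syllable blocks. The word and subword levels would need a morphological analyzer or a subword model, and the subeojeol level is defined in the paper itself, so all three are left out.

```python
HANGUL_BASE = 0xAC00  # first precomposed Hangul syllable block
HANGUL_LAST = 0xD7A3  # last precomposed Hangul syllable block


def eojeol_tokens(text):
    """Split on whitespace (assumption: one eojeol per space-delimited unit)."""
    return text.split()


def syllable_tokens(text):
    """Return one token per Hangul syllable block, skipping everything else."""
    return [ch for ch in text if HANGUL_BASE <= ord(ch) <= HANGUL_LAST]


def phoneme_tokens(text):
    """Decompose each syllable block into its conjoining jamo."""
    jamo = []
    for ch in syllable_tokens(text):
        index = ord(ch) - HANGUL_BASE
        lead = index // 588           # 588 = 21 vowels * 28 final slots
        vowel = (index % 588) // 28
        tail = index % 28             # 0 means the syllable has no final consonant
        jamo.append(chr(0x1100 + lead))
        jamo.append(chr(0x1161 + vowel))
        if tail:
            jamo.append(chr(0x11A7 + tail))
    return jamo


sample = "한국어 문서"  # hypothetical input, not taken from the paper's corpus
views = {
    "eojeol": eojeol_tokens(sample),
    "syllable": syllable_tokens(sample),
    "phoneme": phoneme_tokens(sample),
}
for name, tokens in views.items():
    print(name, len(tokens), tokens)
```

In a stacking setup of the kind the abstract describes, each such view could feed its own base classifier, and the base predictions would then become the input features of the meta-classifier. How MGEL-K actually pairs granularities with learners is not stated in the abstract.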