V. K. Agbesi, Chen Wenyu, Abush S. Ameneshewa, E. Odame, Koffi Dumor, Judith Ayekai Browne
Title: Efficient Adaptive Convolutional Model Based on Label Embedding for Text Classification Using Low Resource Languages
DOI: 10.1145/3596947.3596962
Published in: Proceedings of the 2023 7th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence
Publication date: 2023-04-23
Citation count: 0
Abstract
Text classification technology has been deployed efficiently in numerous organizational applications, including subject tagging, intent and event detection, spam filtering, and email routing. It helps organizations streamline processes, enhance data-driven operations, and evaluate and analyze textual resources quickly and economically. This progress results from numerous studies on text classification for high-resource languages. However, research on low-resource languages, including Ewe, Arabic, Filipino, and Kazakh, lags behind that on high-resource languages such as English. Moreover, the most difficult aspect of text classification in low-resource languages is identifying the optimal set of filters for feature extraction, owing to their complex morphology, linguistic diversity, multilingualism, and syntax. Studies that have explored these problems have failed to use label information efficiently to improve the performance of their methods; as a result, the label information for these languages remains inadequately utilized for enhancing classification results. To solve this problem, this study proposes an efficient adaptive convolutional model based on label embedding (EAdaCLE) that efficiently represents label information and utilizes the learned label representations for various text classification tasks. EAdaCLE adaptively engineers convolutional filters trained on inputs based on label embeddings generated in the same network as the text vectors. EAdaCLE preserves the flexibility of adaptive convolution and fully exploits label information as a supporting signal to enhance classification results. Extensive experiments on four low-resource public datasets indicate that our technique is more reliable than competing methods.
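The core idea — convolutional filters derived from label embeddings, so each label's filter response over the token sequence drives classification — can be illustrated with a minimal sketch. This is not the paper's EAdaCLE architecture: the function name, the windowed averaging, and the use of each label embedding directly as a filter are all simplifying assumptions made purely for illustration.

```python
def label_adaptive_scores(text_vecs, label_embs, kernel_size=2):
    """Hypothetical sketch of label-embedding-driven convolution.

    Each label embedding is treated as a 1-D convolutional filter that
    slides over the token vectors; the maximum response per label
    (max-over-time pooling) is returned as that label's score.
    """
    dim = len(label_embs[0])
    scores = []
    for lab in label_embs:
        responses = []
        for i in range(len(text_vecs) - kernel_size + 1):
            # Average the token vectors in the window, then take the
            # dot product with the label-derived filter.
            window = [
                sum(text_vecs[i + k][d] for k in range(kernel_size)) / kernel_size
                for d in range(dim)
            ]
            responses.append(sum(w * l for w, l in zip(window, lab)))
        scores.append(max(responses))
    return scores
```

In a trained model the filters would be generated by a learned network conditioned on the label embeddings rather than used verbatim, but the sketch shows how label information can shape feature extraction directly instead of appearing only in the loss.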