Multi-Label Arabic Text Classification Based on Deep Learning

Batool Alsukhni
2021 12th International Conference on Information and Communication Systems (ICICS)
Publication date: 2021-05-24
DOI: 10.1109/ICICS52457.2021.9464538 (https://doi.org/10.1109/ICICS52457.2021.9464538)
Citations: 6

Abstract

Multi-label text classification is a natural extension of text classification in which each document can be assigned several labels drawn from a potentially large label set. Natural Language Processing (NLP) uses computers to understand and manipulate natural-language text. Arabic text classification remains challenging because Arabic is under-resourced despite its many users. This paper aims to build a model that classifies Arabic news and helps users find and display the news most relevant to their interests. We demonstrate the efficiency of deep learning models in solving the Arabic multi-label text classification problem. Two models were built in Python: a Multilayer Perceptron (MLP) and a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). All data was cleaned to improve the quality of the experimental data. Both models were evaluated using the F1 score: on the test data, the LSTM model scored 82.03 and the MLP model scored 80.37.
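To make the multi-label setup and the F1 evaluation concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the label names, the per-label scores, the 0.5 cutoff, and the choice of micro-averaging are all assumptions made for illustration, since the abstract does not say which categories were used or how F1 was averaged.

```python
from typing import List, Sequence, Set

# Hypothetical label set; the paper's actual news categories are not listed in the abstract.
LABELS: List[str] = ["politics", "sports", "economy", "culture"]


def to_multi_hot(doc_labels: Set[str], labels: Sequence[str] = LABELS) -> List[int]:
    """Encode a document's label set as a 0/1 vector, one slot per label."""
    return [1 if name in doc_labels else 0 for name in labels]


def threshold(scores: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Turn independent per-label scores (e.g. sigmoid outputs) into 0/1 decisions.

    The 0.5 cutoff is an assumption, not a value taken from the paper.
    """
    return [1 if s >= cutoff else 0 for s in scores]


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over every (document, label) pair."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Three toy documents, each with one or more gold labels.
y_true = [
    to_multi_hot({"politics", "economy"}),
    to_multi_hot({"sports"}),
    to_multi_hot({"culture", "politics"}),
]
# Made-up model scores, one per label, thresholded into predictions.
y_pred = [
    threshold([0.9, 0.1, 0.7, 0.2]),
    threshold([0.2, 0.8, 0.6, 0.1]),
    threshold([0.4, 0.1, 0.3, 0.9]),
]
score = micro_f1(y_true, y_pred)
print(f"micro-F1 = {score:.4f}")  # prints "micro-F1 = 0.8000"
```

Unlike single-label classification, each label is decided independently here, so a document can receive zero, one, or several labels. Macro-averaged or per-sample F1 would give different numbers on the same predictions, which is why the averaging mode matters when comparing reported scores.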