Is Arabic text categorization a solved task?
Anoual El Kah, Imad Zeroual
2022 International Conference on Intelligent Systems and Computer Vision (ISCV), 2022-05-18
DOI: 10.1109/ISCV54655.2022.9806076 (https://doi.org/10.1109/ISCV54655.2022.9806076)

The amount of Arabic content on the Internet is growing day by day, and Arabic has shown the highest growth among the top online languages over the last two decades. Arabic is the fourth most used language on the World Wide Web, which makes producing reliable data mining applications and information retrieval engines significantly challenging. Arabic text categorization has therefore attracted researchers from various fields, and a considerable body of work has addressed it. Some of these works reported accuracy rates very close to 100%. This raises the question: is Arabic text categorization a solved task, or is it still under-studied? To answer this question, we screened 262 related papers drawn from eight earlier surveys and reviews, following the PRISMA-ScR guidelines. We then focused on the top-ranked results, namely those above 95%. In this paper, we present the outcomes of our investigation by addressing several research questions regarding the datasets, the preprocessing techniques, the dimensionality reduction methods, and the classifiers that have been used.