Team MohammadHabash at Mowjaz Multi-Topic Labelling Task
Mohammad Habash
2021 12th International Conference on Information and Communication Systems (ICICS), published 2021-05-24
DOI: 10.1109/ICICS52457.2021.9464614 (https://doi.org/10.1109/ICICS52457.2021.9464614)
Multi-label text classification is growing in importance as data volumes increase and as internet users routinely attach several labels to documents, emails, posts, and similar content, which makes it difficult to assign a single label to each text sample. Our goal is to predict the categories (topics) of an article given its text. The dataset used in this work consists of articles from Mowjaz, an Arabic topical content aggregation mobile application that lets users follow news, sport, entertainment, and other topics from top publishers. This paper describes an approach that classifies articles using a Bi-directional Gated Recurrent Unit (Bi-GRU) with AraVec embeddings. The system achieves an F1-score of 0.8344, a significant improvement over the baseline models.
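To make the multi-label setting and the reported metric concrete, the following is a minimal sketch of how per-topic scores could be turned into a set of labels and then scored with F1. It is not the paper's implementation: the topic list, the 0.5 threshold, the example scores, and the choice of micro-averaging are all assumptions made for illustration, since the abstract does not state the label set or the F1 averaging scheme.

```python
from typing import List, Sequence, Set


def scores_to_labels(
    scores: Sequence[float],
    topics: Sequence[str],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Set[str]:
    """Keep every topic whose independent score reaches the threshold."""
    return {topic for topic, s in zip(topics, scores) if s >= threshold}


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives over all articles."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # correctly predicted labels
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # predicted but not gold
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # gold but not predicted
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical topic names and classifier outputs, for illustration only.
TOPICS = ["news", "sport", "entertainment"]
gold = [{"news"}, {"sport", "entertainment"}, {"news", "sport"}]
model_scores = [
    [0.90, 0.20, 0.10],
    [0.10, 0.80, 0.40],
    [0.70, 0.60, 0.55],
]

pred = [scores_to_labels(s, TOPICS) for s in model_scores]
score = micro_f1(gold, pred)
print(pred, round(score, 4))
```

Unlike single-label classification, each topic is decided independently here, so an article can receive zero, one, or several labels; this is why a set-based metric such as F1 is used rather than plain accuracy.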