Taoufiq Zarra, R. Chiheb, Rajae Moumen, R. Faizi, A. E. Afia
Title: Topic and sentiment model applied to the colloquial Arabic: a case study of Maghrebi Arabic
Published in: Proceedings of the 2017 International Conference on Smart Digital Environment
Publication date: 2017-07-21
DOI: 10.1145/3128128.3128155 (https://doi.org/10.1145/3128128.3128155)
Citations: 12
Abstract
Recently, the proliferation of communication and sharing platforms such as social networks, personal blogs, and forums has facilitated the expression of views and opinions about products, personalities, and public policy. However, gathering these points of view is a complex task that requires solving many problems across different disciplines, especially issues related to language. Among these research areas, topic modeling and sentiment analysis have attracted particular interest from the scientific community. Lately, current economic, geopolitical, and geostrategic trends have made researchers more interested in the Arabic language. However, the majority of these studies focus on classical Arabic, a language of the elites that differs from what is mainly used on the Web. Our paper focuses on Maghrebi colloquial Arabic, since the little research that exists on colloquial Arabic is limited to Eastern colloquial Arabic. On a corpus extracted from different Facebook pages, we implemented a supervised approach to extract sentiments and an unsupervised approach to extract topics. We then proposed a new semi-supervised approach for the Arabic language that combines topic and sentiment in a single model, in order to associate each topic with a specific sentiment.