Reda A. Zayed, Mohamed Farouk Abdel Hady, H. Hefny
{"title":"Islamic Fatwa Request Routing via Hierarchical Multi-label Arabic Text Categorization","authors":"Reda A. Zayed, Mohamed Farouk Abdel Hady, H. Hefny","doi":"10.1109/ACLING.2015.28","DOIUrl":"https://doi.org/10.1109/ACLING.2015.28","url":null,"abstract":"Multi-label classification (MLC) is concerned with learning from examples where each example is associated with a set of labels, in contrast to traditional single-label classification, where an example is typically assigned a single label. MLC problems appear in many areas, including text categorization, protein function classification, and semantic annotation of multimedia. The religious domain has become an interesting and challenging area for machine learning and natural language processing. A \"fatwa\" in the Islamic religion represents the legal opinion or interpretation that a qualified scholar (mufti) can give on issues related to Islamic law. It is similar to the issuing of legal opinions by courts in common-law systems. In this paper, a hierarchical classification system is introduced to automatically route incoming fatwa requests to the most relevant mufti. Each fatwa is associated with multiple categories by a mufti, where the categories can be organized in a hierarchy. The results on fatwa request routing confirm the effective and efficient predictive performance of hierarchical ensembles of multi-label classifiers trained using the HOMER method and its variations, compared to binary relevance, which simply trains a classifier for each label independently.","PeriodicalId":404268,"journal":{"name":"2015 First International Conference on Arabic Computational Linguistics (ACLing)","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128081390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hao Wang, Vijay R. Bommireddipalli, Ayman Hanafy, Mohamed Bahgat, Sara Noeman, O. Emam
{"title":"A System for Extracting Sentiment from Large-Scale Arabic Social Data","authors":"Hao Wang, Vijay R. Bommireddipalli, Ayman Hanafy, Mohamed Bahgat, Sara Noeman, O. Emam","doi":"10.1109/ACLING.2015.17","DOIUrl":"https://doi.org/10.1109/ACLING.2015.17","url":null,"abstract":"Social media data in the Arabic language is becoming more and more abundant. There is a consensus that valuable information lies in social media data. Mining this data and making the process easier are gaining momentum in industry. This paper describes an enterprise system we developed for extracting sentiment from large volumes of social data in Arabic dialects. First, we give an overview of the Big Data system for information extraction from multilingual social data from a variety of sources. Then, we focus on the Arabic sentiment analysis capability that was built on top of the system, including normalizing written Arabic dialects, building sentiment lexicons, sentiment classification, and performance evaluation. Lastly, we demonstrate the value of enriching sentiment results with user profiles in understanding the sentiments of a specific user group.","PeriodicalId":404268,"journal":{"name":"2015 First International Conference on Arabic Computational Linguistics (ACLing)","volume":"148 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122690478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying the Topic-Specific Influential Users Using SLM","authors":"M. Shalaby, Ahmed Rafea","doi":"10.1109/ACLING.2015.24","DOIUrl":"https://doi.org/10.1109/ACLING.2015.24","url":null,"abstract":"Social influence can be described as the ability to have an effect on the thoughts or actions of others. The objective of this research is to investigate the use of language in detecting the influential users in a specific topic on Twitter. From a collection of tweets matching a specified query, we want to detect the influential users from the tweets' text. The study investigates the Egyptian Arabic dialect and whether it can be used for detecting the author's influence. Using a Statistical Language Model (SLM), we found a correlation between the users' average retweet counts and their tweets' perplexity, supporting the hypothesis that an SLM can be trained to detect highly retweeted tweets. However, the use of perplexity for identifying influential users resulted in low precision values. The simplistic approach carried out did not produce good results. There is still work to be done before the SLM can be used for identifying influential users.","PeriodicalId":404268,"journal":{"name":"2015 First International Conference on Arabic Computational Linguistics (ACLing)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122120993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fériel Ben Fraj Trabelsi, C. Ben Othmane Zribi, Wiem Kouki
{"title":"Combined Classification for Extracting Named Entities from Arabic Texts","authors":"Fériel Ben Fraj Trabelsi, C. Ben Othmane Zribi, Wiem Kouki","doi":"10.1109/ACLING.2015.15","DOIUrl":"https://doi.org/10.1109/ACLING.2015.15","url":null,"abstract":"In this paper, we describe an approach for extracting named entities (NE) from Arabic texts. The Arabic language is hard to process because of characteristics that affect even NE extraction. In our case, we consider that named entity extraction can be treated as a typical classification problem. Indeed, this extraction consists of searching for text portions that can be classified into an NE class (Person, Locality or Organization). Thus, we choose to use a supervised learning approach and employ the BIO tagging format, which can solve the twin problems of segmentation and categorization. In addition, a single classifier cannot give good results for all types of contexts. Thus, we adopt a set of weighted classifiers, which we combine through a voting procedure. In order to properly assess the performance of our system, we perform two types of tests: with and without morphological attributes. We consider the results highly satisfactory, especially with an accuracy that exceeds 89% for both the Person and Locality classes.","PeriodicalId":404268,"journal":{"name":"2015 First International Conference on Arabic Computational Linguistics (ACLing)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127606808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enrichment of the Arabic Treebank ATB with Syntactic Properties","authors":"R. Bahloul, K. Haddar, P. Blache","doi":"10.1109/ACLING.2015.9","DOIUrl":"https://doi.org/10.1109/ACLING.2015.9","url":null,"abstract":"Enriching the Arabic Treebank with syntactic properties increases its use in different applications, supports the acquisition of new linguistic resources, and eases the probabilistic parsing process by using statistics to limit the properties to satisfied ones. This enrichment method requires two steps: first inducing a Property Grammar from a source treebank, and then generating the new syntactic property-based representation.","PeriodicalId":404268,"journal":{"name":"2015 First International Conference on Arabic Computational Linguistics (ACLing)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123094330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hybrid Approach for Sentiment Classification of Egyptian Dialect Tweets","authors":"A. Shoukry, Ahmed Rafea","doi":"10.1109/ACLING.2015.18","DOIUrl":"https://doi.org/10.1109/ACLING.2015.18","url":null,"abstract":"Sentiment analysis has recently become one of the growing areas of research related to text mining and natural language processing. The main task of sentiment classification is to classify a sentence (e.g. a tweet, review, blog, comment, or news item) as holding an overall positive, negative or neutral sentiment. Most of the current studies related to this topic focus mainly on English texts, with very limited resources available for other languages like Arabic, especially for the Egyptian dialect. In this research work, we aim to improve the performance measures of Egyptian dialect sentence-level sentiment analysis by proposing a hybrid approach that combines the machine learning approach, using support vector machines, with the semantic orientation approach. Two methodologies were proposed, one for each approach, which were then joined to create the proposed hybrid approach. The results obtained show significant improvements in terms of accuracy, precision, recall and F-measure, indicating that our proposed hybrid approach is effective in sentence-level sentiment classification. The results are also very promising, which encourages continuing in this line of research.","PeriodicalId":404268,"journal":{"name":"2015 First International Conference on Arabic Computational Linguistics (ACLing)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129143096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adnan Souri, Mohammed Al Achhab, Badr Eddine El Mouhajir
{"title":"A Proposed Approach for Arabic Language Segmentation","authors":"Adnan Souri, Mohammed Al Achhab, Badr Eddine El Mouhajir","doi":"10.1109/ACLING.2015.13","DOIUrl":"https://doi.org/10.1109/ACLING.2015.13","url":null,"abstract":"This paper presents research on natural language processing (NLP). Our area of interest is the process of Arabic text segmentation. Text segmentation is an important step in any NLP process. In this paper, we discuss several methods dealing mainly with cases of ambiguity in Arabic text segmentation. Several conclusions are drawn. These conclusions lead to a proposal for text segmentation. A vision based on connectors is developed.","PeriodicalId":404268,"journal":{"name":"2015 First International Conference on Arabic Computational Linguistics (ACLing)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127098625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tunisian Arabic aeb Wordnet: Current State and Future Extensions","authors":"Nadia Karmani Ben Moussa, Hsan Soussou, A. Alimi","doi":"10.1109/ACLING.2015.7","DOIUrl":"https://doi.org/10.1109/ACLING.2015.7","url":null,"abstract":"Nowadays, Internet communication, and especially informal Internet communication such as social networks and blogs, is shaping political, economic, financial and social environments all over the world. Consequently, Internet monitoring is growing in scale, particularly in Tunisia, which has suffered from instability since the political revolution in 2011. In the Tunisian context, Internet communication is characterized by the increasing use of the aeb language (i.e. an Arabic dialect called Tunisian Arabic). Therefore, Tunisian Internet monitoring primarily needs aeb language processing tools, especially an aeb lexicon. However, few aeb lexicons have been developed, given the lack of written resources. Some of these lexicons are created from Arabic lexicons. They cover the aeb vocabulary of Arabic origin and ignore the large borrowed aeb vocabulary. Others are built using the informal Web and therefore need rigorous linguistic verification, correction and validation. In this case, we suggest building a standard, large and robust Wordnet that takes phonetics into account. Our Wordnet is created by the expand approach used for building EuroWordnet, as in [12], based on the bilingual English-Tunisian Arabic Peace Corps dictionary prepared by the linguists R. Ben abdelkader, A. Ayed and A. Naouar [13], and the latest version of Princeton Wordnet, PWN 3.1. Moreover, it is modeled according to ISO-LMF using a suitable Wordnet-LMF model for the aeb language. In this paper, we present the aeb Wordnet building approach, describe its current state and propose extensions.","PeriodicalId":404268,"journal":{"name":"2015 First International Conference on Arabic Computational Linguistics (ACLing)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115053181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Named Entities Recognition System for Modern Standard Arabic using Rule-Based Approach","authors":"Hala Elsayed, T. Elghazaly","doi":"10.1109/ACLING.2015.14","DOIUrl":"https://doi.org/10.1109/ACLING.2015.14","url":null,"abstract":"Named Entity Recognition (NER) is a task in Information Extraction (IE). Named Entity Recognition has become very important for Natural Language Processing (NLP). In this paper, we design a system that enhances named entity recognition for the Arabic language; the system was developed for Arabic noun and entity extraction. The noun extraction system is based on Arabic morphology and Arabic grammar rules, many of which have not been used before. Noun extraction in the system uses no gazetteers, and it is combined with an entity extraction system that depends on gazetteers. The system extracts nouns according to Arabic morphology and classifies them into proper noun entities, title entities, currency entities, percentage entities, country entities, city entities, nationality entities, number entities, place entities, date entities and time entities. The system applies algorithms to generate nationality entities from country entities, and applies regular expressions (RE) to extract numbers in digit format. The system does not require the text to be normalized before the extraction process. The system was tested on text in Modern Standard Arabic (MSA); the corpus is open text. The system achieves an average recall of 85%.","PeriodicalId":404268,"journal":{"name":"2015 First International Conference on Arabic Computational Linguistics (ACLing)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116857028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lexicon Based and Multi-Criteria Decision Making (MCDM) Approach for Detecting Emotions from Arabic Microblog Text","authors":"Ahmad M. Abd Al-Aziz, M. Gheith, A. Eldin","doi":"10.1109/ACLING.2015.21","DOIUrl":"https://doi.org/10.1109/ACLING.2015.21","url":null,"abstract":"Emotions serve a communicative function both within the brain and within the social group. Most previous opinion mining studies on Arabic microblog text identify positive, negative or neutral polarity. This paper studies the problem of detecting multiple emotion classes in Arabic microblog text (e.g. Twitter). Incoming Arabic microblog text is classified into one of the fine-grained emotion classes {happiness, sadness, fear, anger, disgust or none}, or as a mixed emotion if the text contains multiple emotions, e.g. {Happiness/Fear} or {Anger/Disgust}. We apply a combination of a lexicon-based approach and a Multi-Criteria Decision Making approach. We use a conditioned plot to classify and analyze the text by generating a two-dimensional graphical analysis space in which one dimension represents observations (tweets) and the other represents our variables (5 emotional scores). The experimental results show that our proposed approach, using the conditioned plot, is able to classify text into different fine-grained emotions and also to classify Arabic text with mixed emotions.","PeriodicalId":404268,"journal":{"name":"2015 First International Conference on Arabic Computational Linguistics (ACLing)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131028983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}