{"title":"Towards a Psycholinguistic Database of Arabic","authors":"N. Fathy, S. Alansary","doi":"10.1109/ESOLEC54569.2022.10009144","DOIUrl":"https://doi.org/10.1109/ESOLEC54569.2022.10009144","url":null,"abstract":"Psycholinguistic databases are indispensable resources for psycholinguistic and computational research. Many languages, such as English, Croatian, Dutch, French, and Chinese, have such valuable resources. Unfortunately, Arabic doesn't have such databases. This research aims at introducing the guidelines for building a psycholinguistic database of Arabic. The database will be available in two phases: the first is a psycholinguistic phase in which subjective ratings are collected for several variables such as concreteness, imageability, subjective frequency, and number of meanings; the second is a computational phase in which ratings are stacked with other linguistic information obtained from corpora, such as root, stem, objective frequency, number of syllables, and word length. This phase is meant to provide an online searchable release that can be used by psycholinguists and computational linguists for building cognitive-based artificial intelligence models. This survey is meant to introduce the building process of the psycholinguistic phase in detail.","PeriodicalId":179850,"journal":{"name":"2022 20th International Conference on Language Engineering (ESOLEC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115570494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning in Arabic Text Summarization: Approaches, Datasets, and Evaluation Metrics","authors":"Yasmin Einieh, Amal Almansour","doi":"10.1109/ESOLEC54569.2022.10009528","DOIUrl":"https://doi.org/10.1109/ESOLEC54569.2022.10009528","url":null,"abstract":"There is now a massive amount of data available on the internet. Hence, it is quite difficult for users to go through all the available online information to generate a precise summary manually. Automatic Text Summarization (ATS) systems provide a solution to this problem as they produce a shorter and more manageable version of the input text while keeping the most important information. Deep learning has achieved good results in Natural Language Processing (NLP) tasks, and the use of deep learning techniques specifically in ATS has increased for the English language. However, there is still a shortage of studies evaluating these techniques for the Arabic language. In this research work, we review several articles that address the usage of deep learning with the Arabic language. Specifically, we study the available models, datasets, and evaluation metrics for extractive and abstractive Arabic text summarization. We reviewed 12 research papers and found that most of the studies employed deep learning for the abstractive summarization type.","PeriodicalId":179850,"journal":{"name":"2022 20th International Conference on Language Engineering (ESOLEC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117239958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Arabic Sentences Semantic Similarity Based on Word Embedding","authors":"Badrya Dahy, M. Farouk, Khaled Fathy","doi":"10.1109/ESOLEC54569.2022.10009099","DOIUrl":"https://doi.org/10.1109/ESOLEC54569.2022.10009099","url":null,"abstract":"Natural language processing pays significant attention to semantic textual similarity. It's useful in a variety of NLP applications, including information retrieval, plagiarism detection, data extraction, and machine translation. Sentence similarity in the Arabic language has not been investigated deeply because of the lack of Arabic language resources. Moreover, it's critical to calculate the degree of similarity between Arabic sentences accurately. A method for determining the semantic similarity of Arabic sentences is proposed in this research. The proposed strategy uses word embedding to measure the similarity between words. Moreover, more than one similarity measure is combined to calculate the final similarity. Furthermore, due to the lack of Arabic resources, a new dataset for evaluating similarity techniques has been constructed. The new dataset is available for public use. An experiment has been conducted to show the efficiency of the proposed strategy. Two datasets are used for comparison with other approaches. Experiments reveal that the proposed methods outperform alternative approaches to measuring sentence similarity in the Arabic language.","PeriodicalId":179850,"journal":{"name":"2022 20th International Conference on Language Engineering (ESOLEC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116849632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new framework for an eKYC system","authors":"Abdallah Gomaa, Omar Rashed, Abdelkarim Refaey, Abdel-rahman Mohamed, M. Sayed, M. Rashwan","doi":"10.1109/ESOLEC54569.2022.10009253","DOIUrl":"https://doi.org/10.1109/ESOLEC54569.2022.10009253","url":null,"abstract":"Identity verification has long been a crucial problem to solve in order to automate financial operations that require user authentication and to detect fraud. Until recently, this task was nearly impossible to perform with considerable accuracy; thanks to advancements in machine learning over the past few years, it is now achievable. This paper discusses a proposed solution for a high-accuracy, high-performance eKYC system. In an eKYC system, we need to verify the client's identity against their identity documents, with the constraint that they pass a liveness detection test to ensure they are performing the financial operation in person. In our proposed system, verification is done in three main stages: face detection, face verification, and face antispoofing detection. We employed an AI model to perform each task: MTCNN [1] for face detection and FaceNet [12] for face verification. For face antispoofing, we implemented the state-of-the-art model PatchNet [15].","PeriodicalId":179850,"journal":{"name":"2022 20th International Conference on Language Engineering (ESOLEC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128327939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recommender Diagnosis System with Fuzzy Logic in Cloud Environment","authors":"Maie Aboghazalah, Rasha Elnemr, Nedaa Elsayed, Ayman El-Sayed, Passant El-Kafrawy","doi":"10.1109/ESOLEC54569.2022.10009214","DOIUrl":"https://doi.org/10.1109/ESOLEC54569.2022.10009214","url":null,"abstract":"Recommendation systems are now widely used in many fields. In the medical field, recommendation systems are highly valued by both doctors and patients for their accurate predictions. They can reduce the time and effort spent by doctors and patients. The present work introduces a simple and effective methodology for a medical recommendation system based on fuzzy logic. Fuzzy logic is an important method to use when the input data are fuzzy. The input data differ from patient to patient, so the recommendations can differ as well. This work aims to develop techniques for handling the patient data to produce accurate lifestyle recommendations for the patient. Fuzzy logic is utilized to form different recommendations for the patient, such as lifestyle recommendations, medicine recommendations, and sports recommendations, based on different patient factors such as age, gender, and patient diseases. When evaluated, the system reached an efficiency of 94%. This experiment is the final module in a four-module recommendation system. The first one is responsible for diagnosing chest diseases using ECG signals. The second one makes diagnoses using X-ray images. The third secures the whole system through encryption when sending user data over the cloud.","PeriodicalId":179850,"journal":{"name":"2022 20th International Conference on Language Engineering (ESOLEC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127782064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Implementation of a Dockerized, Cross Platform, Multi-Purpose Cryptography as a Service Framework Featuring Scalability, Extendibility and Ease of Integration","authors":"A. Merdan, H. Aslan, Nashwa Abdelbaki","doi":"10.1109/ESOLEC54569.2022.10009317","DOIUrl":"https://doi.org/10.1109/ESOLEC54569.2022.10009317","url":null,"abstract":"Following cybersecurity standards is becoming one of the highest priorities for digital specialists. Due to the global direction to apply digital transformation, data security is a concern. It becomes crucial to ensure data confidentiality, integrity, and availability, whether in transit, at rest, or during processing. The difficulty organizations face is applying the needed security measures, as well as implementing and maintaining the cryptographic algorithms that ensure sound data encryption. Having a crypto library or a server that can fit multiple use-cases is either too costly to implement or expensive to buy (including licensing options, per user/server/year, etc.). The goal of our work is to identify the data protection challenges by implementing a solution that could match a theoretical hypothesis of having a cryptography-as-a-service framework. The term “as a service” has been promoted lately due to its capabilities to provide a ready-made solution by the vendors to satisfy their customer base. In this paper, we propose a framework that works cross-platform with ease. It is a scalable, extendible solution with multiple hosting options, from on-premises hosting to cloud hosting. The proposed framework is implemented and evaluated. The results show that the proposed framework can efficiently process enormous amounts of data. In addition, it can be easily accessed by standard HTTPS requests using JSON format. Also, to validate the deployment technique used, we evaluated it on-premises and in the cloud with the same allocated resources, getting matching results.","PeriodicalId":179850,"journal":{"name":"2022 20th International Conference on Language Engineering (ESOLEC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134263651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Knowledge Graph Embeddings in Embedding Based Recommender Systems","authors":"Ahmed Hussein Ragab, Passant El-Kafrawy","doi":"10.1109/ESOLEC54569.2022.10009491","DOIUrl":"https://doi.org/10.1109/ESOLEC54569.2022.10009491","url":null,"abstract":"This paper proposes using entity2rec [1], which utilizes knowledge graph-based embeddings (node2vec), instead of traditional embedding layers in embedding-based recommender systems. This opens the door to increasing the accuracy of some of the most implemented recommender systems running in production in many companies by just replacing the traditional embedding layer with node2vec graph embedding, without the risk of completely migrating to newer SOTA systems and risking unexpected performance issues. Also, graph embeddings will be able to incorporate user and item features, which can help in solving the well-known cold-start problem in recommender systems. Both embedding methods are compared on the MovieLens 100K dataset in an item-item collaborative filtering recommender, and we show that the suggested replacement improves the representation learning of the embedding layer by adding a semantic layer that can increase the overall performance of normal embedding-based recommenders. First, normal recommender systems are introduced, and a brief explanation of both traditional and graph-based embeddings is presented. Then, the proposed approach is presented along with related work. Finally, results are presented along with future work.","PeriodicalId":179850,"journal":{"name":"2022 20th International Conference on Language Engineering (ESOLEC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133733125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gold Price Prediction using Sentiment Analysis","authors":"Mariam Abdou, Menna Shaltout, Alaa Godah, Karim Sobh, Yomna Eid, Walaa Medhat","doi":"10.1109/ESOLEC54569.2022.10009529","DOIUrl":"https://doi.org/10.1109/ESOLEC54569.2022.10009529","url":null,"abstract":"Gold is one of the valuable materials that is used for funding trading purchases. Nowadays, more investors are interested in gold investments due to the sudden increase in gold prices. However, transactions involving gold are risky; the price of gold fluctuates wildly due to the unpredictability of the gold market. Hence, there is a need for the development of a gold price prediction scheme to assist and support investors, marketers, and financial institutions in making effective economic and monetary decisions. This paper analyzes the correlation between gold price movements and sentiments of Arabic tweets in Egypt. After performing sentiment analysis on these tweets, three supervised machine learning algorithms were used for predicting the gold price: multiple linear regression, Ridge regression, and Lasso regression. The results of this work show that the Lasso regression model performs better than the other two models. However, it is concluded that there is a weak correlation between gold prices and Twitter data. Therefore, gold prices cannot be accurately predicted using Twitter data alone.","PeriodicalId":179850,"journal":{"name":"2022 20th International Conference on Language Engineering (ESOLEC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126514827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fine-tuning Arabic Pre-Trained Transformer Models for Egyptian-Arabic Dialect Offensive Language and Hate Speech Detection and Classification","authors":"Ibrahim Ahmed, Mostafa Abbas, Rany Hatem, Andrew Ihab, Mohamed Waleed Fahkr","doi":"10.1109/ESOLEC54569.2022.10009167","DOIUrl":"https://doi.org/10.1109/ESOLEC54569.2022.10009167","url":null,"abstract":"Offensive language and hate speech have been rampant on social media platforms (Facebook, Twitter, etc.) in Egypt for quite a while now, appearing in tweets, Facebook posts, comments, etc. It is a growing problem that needs immediate attention. This paper focuses on the problem of detecting and classifying both offensive language and hate speech using state-of-the-art techniques in text classification. Pre-trained transformer models have gained a reputation for astounding general language understanding that can be fine-tuned for language-specific tasks like text classification. We collected a custom Egyptian-Arabic dialect dataset of about 8,000 text samples manually labelled into 5 distinct classes (Neutral, Offensive, Sexism, Religious Discrimination, Racism). It was used to fine-tune and evaluate multiple Arabic pre-trained transformer models, based on different transformer architectures and pre-training approaches, for the Natural Language Processing downstream task of text classification. We achieved an average accuracy of about 96% across all fine-tuned transformer models.","PeriodicalId":179850,"journal":{"name":"2022 20th International Conference on Language Engineering (ESOLEC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124073685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural Networks for Bilingual Machine Translation Model","authors":"Hassanin M. Al-Barhamtoshy, Ashraf Said Qutb Metwalli","doi":"10.1109/ESOLEC54569.2022.10009266","DOIUrl":"https://doi.org/10.1109/ESOLEC54569.2022.10009266","url":null,"abstract":"Machine translation can be involved in statistical-based, corpus-based or dataset-based machine translation systems, in addition to linguistic systems. This paper aims to develop a bilingual English-to-Arabic translation model whose quality can be continuously improved and that is flexible enough to be extended to other language pairs. In addition, it aims to create an integrated translation environment that incorporates computer-assisted facilities to enhance the quality of automatically produced texts, increase translators' productivity, and support their professional capabilities. Therefore, a machine translation model based on neural networks will be developed. Bilingual dictionaries will be involved, after cleaning and removing non-alphanumeric texts using linguistic modification tasks for the proposed machine translation model. Encoder and decoder models are involved for such machine translation. Finally, the trained model is used for inference on new input to translate, and the testing phase of the proposed machine translation model is evaluated.","PeriodicalId":179850,"journal":{"name":"2022 20th International Conference on Language Engineering (ESOLEC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115819951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}