{"title":"Legal Judgment Prediction for Canadian Appeal Cases","authors":"Intisar Almuslim, D. Inkpen","doi":"10.1109/CDMA54072.2022.00032","DOIUrl":"https://doi.org/10.1109/CDMA54072.2022.00032","url":null,"abstract":"Law is one of the knowledge domains that are most reliant on textual material. Nowadays, however, it is very difficult and time-consuming for legal professionals to read, understand, and analyze all the available documents, due to the vast volume of case law that is published every day. In this age of legal big data, and with the increased availability of legal text online, many researchers have given more focus to the development of legal intelligent systems and applications. These intelligent systems can provide great services and solve many problems in the legal domain. Over recent years, researchers have focused on predicting judicial case outcomes using Natural Language Processing (NLP) and Machine Learning (ML) methods over case documents. Legal Judgment Prediction (LJP) is the task of automatically predicting the outcome of a court case given only the text of the case. To the best of our knowledge, no prior research with this intention has been conducted in English for appeal courts in Canada, as of 2021. The NLP application to legal judgments that our proposed methodology focuses on is predicting the outcomes of cases by looking only at the description of cases written by the court. Because appeal court decisions are often binary, as in accept or reject, the task is defined as a binary classification problem between 'Allow' and 'Dismiss'. This is the general approach in the literature as well. We employ various classification methods, including classical classifiers and Deep Learning (DL) models, and compare their performance. Our best results are obtained using DL models with accuracy values reaching 93.46% and F1-scores reaching 0.92, which are on par with the best results in the literature. Through this study, we hope to establish the basis for future research on the legal system of Canada and offer a baseline for future work.","PeriodicalId":313042,"journal":{"name":"2022 7th International Conference on Data Science and Machine Learning Applications (CDMA)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126773688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Classification of Accessibility User Reviews in Android Apps","authors":"Wajdi Aljedaani, Mohamed Wiem Mkaouer, S. Ludi, Yasir Javed","doi":"10.1109/CDMA54072.2022.00027","DOIUrl":"https://doi.org/10.1109/CDMA54072.2022.00027","url":null,"abstract":"In recent years, mobile applications have gained popularity for providing information, digital services, and content to users, including users with disabilities. However, recent studies have shown that even popular mobile apps face accessibility issues, which hinder the usability experience for people with disabilities. To discover these issues in new app releases, developers consider user reviews published on the official app stores. However, it is a challenging and time-consuming task to identify the type of accessibility-related reviews manually. Therefore, in this study, we have used supervised learning techniques, namely, Extra Tree Classifier (ETC), Random Forest, Support Vector Classification, Decision Tree, K-Nearest Neighbors (KNN), and Logistic Regression for automated classification of 2,663 Android app reviews based on four types of accessibility guidelines, i.e., Principles, Audio/Images, Design and Focus. Results have shown that the ETC classifier produces the best results in the automated classification of accessibility app reviews with 93% accuracy.","PeriodicalId":313042,"journal":{"name":"2022 7th International Conference on Data Science and Machine Learning Applications (CDMA)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132274150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Phishing Attacks Detection using Machine Learning and Deep Learning Models","authors":"M. Aljabri, Samiha Mirza","doi":"10.1109/CDMA54072.2022.00034","DOIUrl":"https://doi.org/10.1109/CDMA54072.2022.00034","url":null,"abstract":"Because of the fast expansion of internet users, phishing attacks have become a significant menace where the attacker poses as a trusted entity in order to steal sensitive data, causing reputational damage, loss of money, ransomware, or other malware infections. Intelligent techniques, mainly Machine Learning (ML) and Deep Learning (DL), are increasingly applied in the field of cybersecurity due to their ability to learn from available data in order to extract useful insight and predict future events. The effectiveness of applying such intelligent approaches in detecting phishing websites is investigated in this paper. We used two separate datasets and selected the highest correlated features, which comprised a combination of content-based, URL lexical-based, and domain-based features. A set of ML models was then applied, and a comparative performance evaluation was conducted. Results proved the importance of feature selection in improving the models' performance. Furthermore, the results were also used to identify the features that most influence the model in identifying phishing websites. For classification performance, the Random Forest (RF) algorithm achieved the highest accuracy for both datasets.","PeriodicalId":313042,"journal":{"name":"2022 7th International Conference on Data Science and Machine Learning Applications (CDMA)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116640498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of Machine Learning to Early Detection of Highly Cited Papers","authors":"G. M. Binmakhashen, Hamdi A. Al-Jamimi","doi":"10.1109/CDMA54072.2022.00006","DOIUrl":"https://doi.org/10.1109/CDMA54072.2022.00006","url":null,"abstract":"As one of the fastest-growing topics, machine learning has many applications that span different domains including image and signal recognition, text mining, information retrieval, robotics, etc. It enables information extraction and analysis for better insights and decision-based systems. The Web of Science (WoS) citation database is a leading organization that provides citation data of high-quality published research. WoS has its metrics to label published articles as Highly Cited Paper (HCP). Machine learning (ML) can help researchers in identifying the key characteristics of HCP. Moreover, it can allow research evaluation units to forecast significant scientific articles. In other words, it may allow researchers and/or research evaluators to detect potential scientific breakthrough ideas and stay current. In this study, more than 26 thousand records of published articles indexed by WoS were analyzed. All the records are drawn from the Technology research area as defined by WoS. Four ML algorithms are evaluated to verify the influence of common HCP factors in raising citations and interest in scientific articles. The ensemble algorithms show promising results in identifying HCP articles using only four factors.","PeriodicalId":313042,"journal":{"name":"2022 7th International Conference on Data Science and Machine Learning Applications (CDMA)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116951519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative analysis of Graph Neural Networks and commonly used machine learning algorithms on fake news detection","authors":"Fahim Mahmud, Mahi Md. Sadek Rayhan, Mahdi Hasan Shuvo, Islam Sadia, Md. Kishor Morol","doi":"10.1109/CDMA54072.2022.00021","DOIUrl":"https://doi.org/10.1109/CDMA54072.2022.00021","url":null,"abstract":"Fake news on social media is increasingly regarded as one of the most concerning issues. Low cost, simple accessibility via social platforms, and a plethora of low-budget online news sources are some of the factors that contribute to the spread of false news. Most of the existing fake news detection algorithms focus solely on the news content, but engaged users' prior posts or social activities provide a wealth of information about their views on news and have significant ability to improve fake news identification. Graph Neural Networks are a form of deep learning approach that conducts prediction on graph-described data. Social media platforms follow a graph structure in their representation, and Graph Neural Networks are a special type of neural network that can be applied to graphs, making it much easier to execute edge-, node- and graph-level prediction. Therefore, in this paper, we present a comparative analysis among some commonly used machine learning algorithms and Graph Neural Networks for detecting the spread of false news on social media platforms. In this study, we take the UPFD dataset and implement several existing machine learning algorithms on text data only. Besides this, we create different GNN layers for fusing graph-structured news propagation data and the text data as the node feature in our GNN models. GNNs provide the best solutions to the dilemma of identifying false news in our research.","PeriodicalId":313042,"journal":{"name":"2022 7th International Conference on Data Science and Machine Learning Applications (CDMA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131173958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Investigation of Forecasting Tadawul All Share Index (TASI) Using Machine Learning","authors":"G. M. Binmakhashen, A. Bakather, A. Bin-Salem","doi":"10.1109/CDMA54072.2022.00009","DOIUrl":"https://doi.org/10.1109/CDMA54072.2022.00009","url":null,"abstract":"Stock markets are one of the most complex and dynamic environments. To make predictions about stock prices, we may need to combine several sources of market information. Another possibility is to attempt to monitor and predict the stock index prices of a target market. In this study, we investigated several machine learning algorithms to predict the Saudi stock price index by utilizing Bloomberg's most used indicators. The collected data represents 26 years of Tadawul All Share Index (TASI) prices. Several machine learning algorithms were investigated for forecasting midterm TASI index pricing. Two Recurrent Neural Network (RNN) architectures (a deeper and a shallower architecture) were created, trained, and tested, and their performances in forecasting TASI index prices were contrasted. Furthermore, several traditional machine learning methods such as linear regression, decision trees, and random forests were also studied for index price prediction. The experiments suggested that, with 26 years of TASI index transactions, simple machine learning (ML) models generally produce better midterm index price forecasts than more complex ML models.","PeriodicalId":313042,"journal":{"name":"2022 7th International Conference on Data Science and Machine Learning Applications (CDMA)","volume":"11 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124184796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Face-Swap-Verification Using PRNU","authors":"Ali Hassani, H. Malik","doi":"10.1109/CDMA54072.2022.00012","DOIUrl":"https://doi.org/10.1109/CDMA54072.2022.00012","url":null,"abstract":"Facial recognition is becoming the go-to method of identifying users for convenience applications. While great advances have occurred in achieving strong false acceptance and false rejection rates on authentic images, these systems can be vulnerable to face-swap-attacks. This research addresses face-swap-attacks via camera forensics. Whenever an image is modified, there is necessarily an impact on the noise profile (in this case Photo Response Non-Uniformity). Hence, a framework is proposed to enroll the facial recognition camera's “noiseprint” and assess authenticity on future images based on deviation from expected value. This is done using down-sampling compression to improve run time, where images are further segmented into sub-zones to retain local sensitivity. Framework performance is evaluated by recording identical facial-images using multiple cameras of the same make. Next, a subset is modified via hand-crafted and AI-tool face-swaps. 100% of images are correctly identified as authentic or tampered when using full-image analysis at full-scale. Efficiency is then optimized by dividing the image into sub-zones and applying compression. Run-time is improved to 4.6 msec on CPU, a 99.1% reduction, by applying quarter-scale down-sampling with 16 sub-zones (this retains 93.5% verification accuracy). These results are validated against three existing state-of-the-art algorithms, which in comparison show over-fitting when compressed. This demonstrates that compressed PRNU can be used to efficiently verify facial-images, including against AI facial manipulation tools.","PeriodicalId":313042,"journal":{"name":"2022 7th International Conference on Data Science and Machine Learning Applications (CDMA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122161412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Deep Learning Framework for Temperature Forecasting","authors":"Patil Malini, B. Qureshi","doi":"10.1109/CDMA54072.2022.00016","DOIUrl":"https://doi.org/10.1109/CDMA54072.2022.00016","url":null,"abstract":"Among many global warming issues, the increase in global temperatures causing summer heatwaves has triggered heatstrokes leading to the untimely deaths of thousands of people. Heatwaves are meteorological events with prolonged periods of excessive heat. Machine learning algorithms such as Auto-Regressive Integrated Moving Average (ARIMA), ensemble learning, and Long Short-term Memory Network (LSTM) have recently been used to forecast weather conditions. Optimizing the hyperparameters for accurate temperature forecasting is challenging. This paper presents a Cauchy Particle-swarm optimization (CPSO) technique for finding the hyperparameters of the LSTM. The proposed technique minimizes the validation mean square error rate (MSER) to improve accuracy. We test the proposed technique on 30-year Riyadh city temperature datasets. In our experimental evaluation, the proposed CPSO-LSTM outperforms LSTM and Grid-search LSTM by 50% and 55% respectively.","PeriodicalId":313042,"journal":{"name":"2022 7th International Conference on Data Science and Machine Learning Applications (CDMA)","volume":"117 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126939890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Capabilities of Quantum Machine Learning","authors":"Sarah Alghamdi, Sultan Almuhammadi","doi":"10.1109/CDMA54072.2022.00035","DOIUrl":"https://doi.org/10.1109/CDMA54072.2022.00035","url":null,"abstract":"Machine learning techniques give impressive results in many areas. However, due to the physical limitations of integrated circuits, which restrict the growth of their computational power, and the rapid advances in quantum computing, many research studies on quantum machine learning (QML) have been conducted recently. QML is a technique that uses quantum algorithms as parts of the implementation. Quantum algorithms use quantum mechanics and have the potential to outperform classical algorithms for a given problem. In this paper, three widely used machine learning algorithms are discussed and their quantum versions are presented, namely: quantum neural network, quantum autoencoder, and quantum kernel method. In addition, we discuss the potential capabilities of these QML algorithms and review recent work employing them. Moreover, a quantum neural network prototype is implemented using Qiskit as a proof of concept and tested on a real quantum computer. Empirical results show that quantum neural networks can be trained efficiently.","PeriodicalId":313042,"journal":{"name":"2022 7th International Conference on Data Science and Machine Learning Applications (CDMA)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114948105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Robot-based Arabic Sign Language Translating System","authors":"Dina A. Alabbad, Nouha O. Alsaleh, Naimah A. Alaqeel, Yara A. Alshehri, Nashwa A. Alzahrani, Maha K. Alhobaishi","doi":"10.1109/CDMA54072.2022.00030","DOIUrl":"https://doi.org/10.1109/CDMA54072.2022.00030","url":null,"abstract":"Services provided to deaf people in the Eastern province of Saudi Arabia were evaluated, which confirmed a high need to support the deaf community. This paper proposes utilizing the Pepper robot in the task of recognizing and translating Arabic sign language (ArSL), by which the robot recognizes static hand gestures of the letters in ArSL from each keyframe extracted from the input video and translates them into written text and vice versa. This project aims to conduct a two-way translation of the Arabic sign language in a way that bridges the communication gap found in Saudi Arabia between deaf and non-deaf people. The methods proposed in this paper are computer vision to use the Pepper robot's camera and sensors, natural language processing to convert natural speech to sign language, and deep learning to build a convolutional neural network model that classifies the sign language gestures and converts them into their corresponding written and spoken form. Moreover, two datasets were used: the first is a collection of hand gestures for training the model, and the other is a set of 39 animated signs of all the Arabic letters and special letters.","PeriodicalId":313042,"journal":{"name":"2022 7th International Conference on Data Science and Machine Learning Applications (CDMA)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127570816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}