{"title":"Mitigating Web Scrapers using Markup Randomization","authors":"Noor Bolbol, T. Barhoom","doi":"10.1109/PICICT53635.2021.00038","DOIUrl":"https://doi.org/10.1109/PICICT53635.2021.00038","url":null,"abstract":"Web scraping is the technique of extracting desired data in an automated way by scanning the internal links and content of a website; this activity is usually performed by systematically programmed bots. This paper explains our proposed solution to protect blog content from theft and from being copied to other destinations by mitigating scraping bots. To achieve this, we applied two steps at two levels. The first step, at the main blog page level, hinders crawler bots by adding extra empty article anchors among the real articles. The second step, at the article page level, adds a random number of empty and hidden spans with randomly generated text within the article's body. To assess this solution, we applied it to a local project developed in PHP with the Laravel framework and defined four criteria to measure its effectiveness. The results show that the file size is not significantly affected by applying the solution, and that the processing time increased by a few milliseconds, which is still within the acceptable range. Using the HTML-similarity tool, we obtained very good results that show similarity in style, with slight changes in structure. Finally, to assess the effect on bots, a scraper bot was rerun and returned the results expected from the programmed middleware. These results show that the solution is feasible to adopt and use to protect blog content.","PeriodicalId":308869,"journal":{"name":"2021 Palestinian International Conference on Information and Communication Technology (PICICT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121241529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Binary Harris Hawks Optimisation Filter Based Approach for Feature Selection","authors":"Ruba Abu Khurma, M. Awadallah, Ibrahim Aljarah","doi":"10.1109/PICICT53635.2021.00022","DOIUrl":"https://doi.org/10.1109/PICICT53635.2021.00022","url":null,"abstract":"Feature Selection (FS) is a technique to reduce the dimensionality of datasets by eliminating irrelevant and redundant features to enhance the performance of data mining tasks. Meta-heuristic algorithms are promising search engines to traverse the feature space to find a (near) optimal feature subset. The Harris hawks optimization (HHO) algorithm is a recently developed meta-heuristic algorithm inspired by the hunting strategy of hawks in nature. The main contribution of this paper is that it proposes two new filter-based methods for applying FS in classification problems. The methods integrate information theory with the HHO algorithm. The first method applies HHO with the mutual information between any two features. The second method applies HHO with the entropy of each group of features. The adopted fitness function enhances the performance based on both the number of selected features and the classification accuracy. It gives different weights to relevance and redundancy. The results of the experiments show that, with proper weights, the two proposed methods can significantly reduce the number of selected features and achieve a higher classification accuracy in most of the datasets. The first method usually selects a smaller feature subset, while the second method can achieve higher classification accuracy.","PeriodicalId":308869,"journal":{"name":"2021 Palestinian International Conference on Information and Communication Technology (PICICT)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126037457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Power and Subcarrier Allocation in Downlink NOMA Systems: Equal Power Allocation and DC Programing Approach","authors":"Hafeezul Haq, N. Taspinar","doi":"10.1109/PICICT53635.2021.00035","DOIUrl":"https://doi.org/10.1109/PICICT53635.2021.00035","url":null,"abstract":"Recently, Non-Orthogonal Multiple Access (NOMA) has emerged as a suitable candidate for 5G systems due to its high spectral efficiency and potential to support the highly demanded massive connectivity for Future Radio Access networks. The main concept that distinguishes NOMA is to transmit multiple users' signals on the same resources, such as frequency or time, and to differentiate the users' signals in the power domain. Power allocation and subcarrier assignment are two main issues in NOMA. In this article, a joint subcarrier assignment and power allocation problem is considered to maximize the total system sum rate. Our aim is to improve the total system throughput while maintaining a high level of fairness between the users. The total bandwidth of the system is split into subcarriers, and only two users are allocated to every subcarrier to minimize the complexity of the system. Two methods are investigated for power assignment to users. In the first method, the total power is distributed equally among the subchannels, and the power on each subchannel is divided equally among the users on that subchannel, whereas in the second method, an algorithm based on Difference of Convex (DC) programming is proposed for power allocation. Finally, simulation results are presented and the performance of the proposed methods is compared.","PeriodicalId":308869,"journal":{"name":"2021 Palestinian International Conference on Information and Communication Technology (PICICT)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129142541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Capacity of FBMC/OQAM Transceiver System with SRRC Filter and Intrinsic Interference for 5G Wireless Communication System","authors":"Imad A. Shaheen, Loai Afana, A. AbuZaiter","doi":"10.1109/PICICT53635.2021.00036","DOIUrl":"https://doi.org/10.1109/PICICT53635.2021.00036","url":null,"abstract":"Filter Bank Multicarrier with Offset Quadrature Amplitude Modulation (FBMC/OQAM) has become the most widely adopted technology in next generation wireless communication systems (5G). Moreover, the FBMC/OQAM system supports high data rate, low impulse noise and high bandwidth efficiency. In this paper, the capacity of an FBMC/OQAM transceiver system using square root raised cosine (SRRC) filter pulse shaping is analyzed from an information-theoretic perspective. FBMC systems adopt proper pulse shaping with good time and frequency localization properties to avoid interference and maintain orthogonality in the real field among subcarriers. Moreover, our analytical model is further extended in order to gain insight into the effect of the intrinsic interference on the performance of our system. Furthermore, the spectral efficiency of the FBMC/OQAM system is analyzed when the effect of Intersymbol Interference (ISI) and Inter-Carrier Interference (ICI) is considered.","PeriodicalId":308869,"journal":{"name":"2021 Palestinian International Conference on Information and Communication Technology (PICICT)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115079867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Classification Performance for Diabetes with Linear Discriminant Analysis and Genetic Algorithm","authors":"A. Alharan, Zahraa M. Algelal, Nabeel Salih Ali, Nora Al-Garaawi","doi":"10.1109/PICICT53635.2021.00019","DOIUrl":"https://doi.org/10.1109/PICICT53635.2021.00019","url":null,"abstract":"Today, diabetes is one of the most chronic and appalling diseases humanity faces. According to the International Diabetes Federation (IDF) Diabetes Atlas, Ninth Edition (2019), 463 million people worldwide had diabetes, and it caused approximately 4.2 million deaths. Therefore, diabetic patients need state-of-the-art healthcare, and early prediction can help decrease the risks related to the disease. In this context, this research proposes a diabetes diagnosis system and analyzes two different diabetes datasets, namely PIMA Indian Diabetes and the data of Dr. John Schorling. Linear Discriminant Analysis (LDA) and Genetic Algorithm (GA) methods are used for feature selection, and four techniques are implemented to evaluate the classification: the Bagging algorithm, Random Forest, Logistic Model Tree (LMT), and the JRip algorithm. The results show that a Random Forest classifier using LDA and GA obtained better accuracy (90.89%) on DatasetI. At the same time, DatasetII is better than GA in Random forest, random forest-LDA, JRip-LDA classifiers (91.44%).","PeriodicalId":308869,"journal":{"name":"2021 Palestinian International Conference on Information and Communication Technology (PICICT)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125373227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-Performance Printed Arabic Optical Character Recognition System Using ANN Classifier","authors":"Basheer Al-Sadawi, Ahmed Hussain, Nabeel Salih Ali","doi":"10.1109/PICICT53635.2021.00013","DOIUrl":"https://doi.org/10.1109/PICICT53635.2021.00013","url":null,"abstract":"Optical Character Recognition (OCR) systems were developed with high accuracy to facilitate transactions and increase human-computer interaction at most levels of the government and commerce sectors. OCR has been adopted for diverse languages, but only a few efforts have addressed Arabic characters, and OCR performance remains weak for the Arabic language. Therefore, a new Arabic Optical Character Recognition (AOCR) system is proposed to achieve a high-performance recognition system for printed images. Several steps are conducted to achieve the proposed AOCR system, such as image preprocessing, segmentation (lines, words, and characters), feature extraction, and classification. After evaluating the recognition system against multiple criteria, the AOCR results show an accuracy of 95% on document images of different quality (spatial resolution); in addition, the system was robust to document degradation compared to other commercial systems in the literature.","PeriodicalId":308869,"journal":{"name":"2021 Palestinian International Conference on Information and Communication Technology (PICICT)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121505917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Message from the Program Chair","authors":"Wen-mei W. Hwu","doi":"10.1145/1394608.1382167","DOIUrl":"https://doi.org/10.1145/1394608.1382167","url":null,"abstract":"On behalf of the organizing committee and program committee, I welcome you to the 12th International Symposium on High Assurance Systems Engineering (HASE). The symposium is a forum for discussion of systems and software engineering approaches to achieving high assurance systems. It focuses on integrated approaches for assuring reliability, availability, integrity, privacy, confidentiality, safety, and real-time performance of complex systems; and methods for assessing assurance levels of these systems to a high degree of confidence. HASE 2010 is co-located with the 21st IEEE International Symposium on Software Reliability Engineering (ISSRE 2010), offering participants the opportunity to contribute to and benefit from two strong technical programs.","PeriodicalId":308869,"journal":{"name":"2021 Palestinian International Conference on Information and Communication Technology (PICICT)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126502471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Document Classification Based on Metadata and Keywords Extraction","authors":"Eman Y. Rezqa, R. Baraka","doi":"10.1109/PICICT53635.2021.00016","DOIUrl":"https://doi.org/10.1109/PICICT53635.2021.00016","url":null,"abstract":"We present a model for automatic extraction of metadata and keywords to be used in the classification of scientific documents. The model mainly consists of metadata extraction, keywords extraction and document classification. At the metadata extraction stage, various metadata items are extracted from research documents in the domain of commerce, such as the title of the thesis/research article, author/s, advisor/s, year, publisher, type, and abstract. At the keywords extraction stage, Latent Semantic Indexing (LSI) is used to extract the underlying topics from these documents. At the classification stage, which depends on the metadata and keywords extraction stages, three classification algorithms are used: Stochastic Gradient Descent (SGD), Linear Support Vector (LSVC) and K-Nearest Neighbor (KNN). SGD achieved the highest classification accuracy (80.5%) compared to LSVC and KNN when applied to the Arabic document corpus. LSVC achieved the highest classification accuracy (81.5%) compared to SGD and KNN when applied to the English document corpus.","PeriodicalId":308869,"journal":{"name":"2021 Palestinian International Conference on Information and Communication Technology (PICICT)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125665118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Recruitment Fraud Detection using ANN","authors":"Ibrahim Nasser, Amjad H. Alzaanin, Ashraf Y. A. Maghari","doi":"10.1109/PICICT53635.2021.00015","DOIUrl":"https://doi.org/10.1109/PICICT53635.2021.00015","url":null,"abstract":"Online recruitment gives job-seekers an efficient way to search for and reach jobs. It also helps recruiters search for qualified candidates, which improves the recruitment process. However, employment scams have emerged as a critical issue. Some job posts are legitimate, and others are fraudulent. In this paper, an Artificial Neural Network based model is proposed to detect fraudulent job posts. The public Employment Scam Aegean Dataset (EMSCAD) is used with proper text preprocessing techniques for training and testing the proposed model. Our model has precision, recall, and f-measure of 91.84%, 96.02%, and 93.88% respectively. The results show that the proposed ANN-based model outperforms similar existing models in detecting fraudulent jobs.","PeriodicalId":308869,"journal":{"name":"2021 Palestinian International Conference on Information and Communication Technology (PICICT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130604928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pointing Error Angle Effect on the Performance of 10 Gbps Ultra-Long Satellite Optical Wireless Communication","authors":"Fodil Ghali, B. Fassi, S. Driz","doi":"10.1109/PICICT53635.2021.00027","DOIUrl":"https://doi.org/10.1109/PICICT53635.2021.00027","url":null,"abstract":"This paper presents the performance analysis of a 10 Gbps Low Earth Orbit (LEO) Inter-satellite Optical Wireless Communication (Is-OWC) system. In such systems, the propagation of light (visible or infrared) takes place in Free Space Optics (FSO), in order to transmit data between satellites in the same or different orbits. It is one of the important applications of FSO technology, which will be widely deployed in the world in the future due to its many advantages: high bandwidth, high data rate, small weight, and low power and cost compared to existing microwave satellite communications. However, this type of link is limited by the difficulty of precise pointing between the transmitters and receivers due to satellite vibrations, which can cause a failure of reception of the laser beam. Here, the effect of pointing error angle on LEO satellite transmission performance as a function of transmitter output power, line coding techniques (Non Return to Zero, NRZ; Return to Zero, RZ) and inter-satellite distance was analyzed using OptiSystem software simulation. The outcomes showed that the proposed system can accomplish successful transmission up to 4500 km with an acceptable Bit Error Rate (BER) threshold.","PeriodicalId":308869,"journal":{"name":"2021 Palestinian International Conference on Information and Communication Technology (PICICT)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127618593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}