{"title":"Improving the Classification Effectiveness of Network Intrusion Detection Using Ensemble Machine Learning Techniques and Deep Neural Networks","authors":"Yunpeng Zhang, Yash Gandhi, Zhixia Li, Zhiwen Xiao","doi":"10.1109/IDSTA55301.2022.9923205","DOIUrl":"https://doi.org/10.1109/IDSTA55301.2022.9923205","url":null,"abstract":"Sophisticated cyber-attacks and ever-evolving threats, together with the advent of big data and connected systems and the inaccuracy and inadequacy of current Network Intrusion Detection Systems (NIDS), have made securing networks highly complex. This creates a need for better network intrusion detection models to enhance network security and secure communication channels in the future. Over the years, machine learning and deep learning models have proven to be effective in detecting network intrusions and classifying attacks on networks. In this paper, we present our proposed NIDS based on machine learning and deep learning techniques to enhance the performance of current network intrusion detection systems. Decision tree, ensemble machine learning techniques like Random Forest and XGBoost, and Deep Neural Networks (DNN) have been used on the modern substitutes of the benchmark KDD CUP 99 dataset, the NSL KDD and the UNSW NB15. We apply unique feature selection methods and achieve competitive results. For Binary Classification, the results show that our models achieve high accuracies of more than 99.25% for the NSL KDD dataset and above 93% for the UNSW NB15 dataset. 
For Multiclass Classification, our models achieve accuracies of more than 97.70% for NSL KDD and above 82.50% for the UNSW NB15 dataset.","PeriodicalId":268343,"journal":{"name":"2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA)","volume":"188 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122879410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Hate Speech Against Athletes in Social Media","authors":"Dana Alsagheer, Hadi Mansourifar, Mohammad Mahdi Dehshibi, W. Shi","doi":"10.1109/IDSTA55301.2022.9923132","DOIUrl":"https://doi.org/10.1109/IDSTA55301.2022.9923132","url":null,"abstract":"When English clubs and the game’s governing bodies and organizations turned off their Facebook, Twitter, and Instagram accounts from April 30 to May 1, 2021, the fight against online racism gained new momentum. However, the Tokyo Olympics revealed new aspects of online bullying that athletes may face during major sporting events. Despite the significant effort put into online hate speech detection research in general, hate speech detection against athletes requires a separate investigation. We show in this paper that abusive language directed at athletes is more varied and difficult to detect. We begin by introducing the data collected from online comments aimed at three athletes competing in the Tokyo Olympics 2020. We then conduct extensive classification experiments on the collected data to demonstrate its diversity in comparison to other hate speech datasets and to show that Active Learning outperforms Supervised Learning in hate speech detection against athletes.","PeriodicalId":268343,"journal":{"name":"2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133468726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved YOLOv3-tiny Object Detector with Dilated CNN for Drone-Captured Images","authors":"Naresh Kumar, Abdul Khadar Jilani, Pavan Kumar, Anastasija Nikiforova","doi":"10.1109/IDSTA55301.2022.9923041","DOIUrl":"https://doi.org/10.1109/IDSTA55301.2022.9923041","url":null,"abstract":"Object detection poses major research problems in the computer vision domain. Object detection based on images from unmanned aerial vehicles (UAVs), or drones, has versatile applications in defence security, agriculture and GIS. However, real-time object detection in UAV scenarios remains a challenging problem due to environmental obstructions such as occlusion and view-invariant conditions, despite the large number of solutions proposed for this task. This paper proposes an improved YOLOv3-tiny object detector by introducing a multi-dilated module between the convolution unit and the receptive field, where the problem of a small number of positive training samples is solved by a larger predicted feature map, thereby reducing the rate of label rewriting in YOLOv3-tiny. We find that the fusion of multi-scale receptive fields is effective in detecting even tiny objects. We introduce a path aggregation module that merges the semantic information in a deeper layer and detailed information in an earlier layer. 
The analysis of the proposed solution shows that on the VisDrone2019-Det test set, our proposed model is more efficient and effective, running 2.96% times faster and achieving a 4.0% higher AP50 than YOLOv3.","PeriodicalId":268343,"journal":{"name":"2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA)","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116091753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EEG-based Image Feature Extraction for Visual Classification using Deep Learning","authors":"Alankrit Mishra, N. Raj, Garima Bajwa","doi":"10.1109/IDSTA55301.2022.9923087","DOIUrl":"https://doi.org/10.1109/IDSTA55301.2022.9923087","url":null,"abstract":"While capable of segregating visual data, humans take time to examine a single piece, let alone thousands or millions of samples. Deep learning models efficiently process sizeable information with the help of modern-day computing. However, their questionable decision-making process has raised considerable concerns. Recent studies have identified a new approach to extract image features from EEG signals and combine them with standard image features. These approaches make deep learning models more interpretable and also enable faster convergence of models with fewer samples. Inspired by recent studies, we developed an efficient way of encoding EEG signals as images to facilitate a more subtle understanding of brain signals with deep learning models. Using two variations in such encoding methods, we classified the encoded EEG signals corresponding to 39 image classes with a benchmark accuracy of 70% on the layered dataset of six subjects, which is significantly higher than the existing work. 
Our image classification approach with combined EEG features achieved an accuracy of 82% compared to the slightly better accuracy of a pure deep learning approach; nevertheless, it demonstrates the viability of the theory.","PeriodicalId":268343,"journal":{"name":"2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116142075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Repurposing Knowledge Graph Embeddings for Triple Representation via Weak Supervision","authors":"Alexander Kalinowski, Yuan An","doi":"10.1109/IDSTA55301.2022.9923036","DOIUrl":"https://doi.org/10.1109/IDSTA55301.2022.9923036","url":null,"abstract":"The majority of knowledge graph embedding techniques treat entities and predicates as separate embedding matrices, using aggregation functions to build a representation of the input triple. However, these aggregations are lossy, i.e. they do not capture the semantics of the original triples, such as information contained in the predicates. To combat these shortcomings, current methods learn triple embeddings from scratch without utilizing entity and predicate embeddings from pre-trained models. In this paper, we design a novel fine-tuning approach for learning triple embeddings by creating weak supervision signals from pre-trained knowledge graph embeddings. We develop a method for automatically sampling triples from a knowledge graph and estimating their pairwise similarities from pre-trained embedding models. These pairwise similarity scores are then fed to a Siamese-like neural architecture to fine-tune triple representations. 
We evaluate the proposed method on two widely studied knowledge graphs and show consistent improvement over other state-of-the-art triple embedding methods on triple classification and triple clustering tasks.","PeriodicalId":268343,"journal":{"name":"2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126100549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital health shopping assistant with React Native: a simple technological solution to a complex health problem","authors":"A.A. Govoruhina, Anastasija Nikiforova","doi":"10.1109/IDSTA55301.2022.9923047","DOIUrl":"https://doi.org/10.1109/IDSTA55301.2022.9923047","url":null,"abstract":"Today, more and more people are reporting allergies, which can range from simple reactions / discomfort to anaphylactic shocks. Other people may not be allergic but avoid certain foods for personal reasons. Daily food shopping for these people is hampered by the fact that unwanted ingredients can be hidden in any food, and it is difficult to find them all. The paper presents a digital health shopping assistant called “Diet Helper”, which aims to make life easier for such people by making it easy to determine whether a product is suitable for consumption, according to dietary requirements of both types: existing diets and self-defined ones. The app takes a captured ingredient label as input, converts it to text, and filters out the unwanted ingredients that the user has marked to avoid, either as allergens or as products to which the consumer is intolerant, helping the user decide whether the product is suitable for consumption. This should make daily grocery shopping easier by providing the user with more accurate and simplified product selection in seconds, reducing the total time spent in grocery stores, which is especially relevant in light of COVID-19, although it was relevant before and will remain so afterwards due to the busy schedules and active rhythm of life of modern society. 
The app is developed using the React Native framework and Google Firebase platform, which makes it easy to develop, use and extend such solutions, thereby encouraging the active development of solutions that could improve wellbeing.","PeriodicalId":268343,"journal":{"name":"2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134029717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating Human-in-the-loop into Swarm Learning for Decentralized Fake News Detection","authors":"Xishuang Dong, Lijun Qian","doi":"10.1109/IDSTA55301.2022.9923043","DOIUrl":"https://doi.org/10.1109/IDSTA55301.2022.9923043","url":null,"abstract":"Social media has become an effective platform to generate and spread fake news that can mislead people and even distort public opinion. Centralized methods for fake news detection, however, cannot effectively protect user privacy during the process of centralized data collection for training models. Moreover, they cannot fully involve user feedback in the loop of learning detection models for further enhancing fake news detection. To overcome these challenges, this paper proposes a novel decentralized method, Human-in-the-loop Based Swarm Learning (HBSL), to integrate user feedback into the loop of learning and inference for recognizing fake news without violating user privacy in a decentralized manner. It consists of distributed nodes that are able to independently learn and detect fake news on local data. Furthermore, detection models trained on these nodes can be enhanced through decentralized model merging. 
Experimental results demonstrate that the proposed method outperforms the state-of-the-art decentralized method in detecting fake news on a benchmark dataset.","PeriodicalId":268343,"journal":{"name":"2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128747814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of Tree Based Regression over Multiple Linear Regression for Non-normally Distributed Data in Battery Performance","authors":"S. Chowdhury, Yuxiao Lin, Bor-Shuang Liaw, L. Kerby","doi":"10.1109/IDSTA55301.2022.9923169","DOIUrl":"https://doi.org/10.1109/IDSTA55301.2022.9923169","url":null,"abstract":"Battery performance datasets are typically non-normal and multicollinear. Extrapolating such datasets for model predictions requires attention to these characteristics. This study explores the impact of data normality in building machine learning models. In this work, tree-based regression models and multiple linear regression models are each built from a highly skewed non-normal dataset with multicollinearity and compared. Several techniques, such as data transformation, are necessary to achieve a good multiple linear regression model with this dataset; the most useful techniques are discussed. With these techniques, the best multiple linear regression model achieved an $R^{2} = 81.23%$ and exhibited no multicollinearity effect for the dataset used in this study. Tree-based models perform better on this dataset, as they are non-parametric, capable of handling complex relationships among variables and not affected by multicollinearity. We show that bagging, in the use of Random Forests, reduces overfitting. Our best tree-based model achieved an $R^{2} = 97.73%$. 
This study explains why tree-based regression is promising as a machine learning model for non-normally distributed, multicollinear data.","PeriodicalId":268343,"journal":{"name":"2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115371220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint SCSP-LROM: A novel approach to detect Cerebrovascular Anomalies from EEG signals","authors":"Debojyoti Seth","doi":"10.1109/IDSTA55301.2022.9923032","DOIUrl":"https://doi.org/10.1109/IDSTA55301.2022.9923032","url":null,"abstract":"Electroencephalography (EEG) has gained popularity over similar modalities like Functional Magnetic Resonance Imaging (fMRI) or Functional Near-Infrared Spectroscopy (fNIRS) for being simple and non-invasive. One of the biggest challenges of any Brain Computer Interfacing (BCI) technique is recovering maximum information from minimal input channels for realistic predictions. To choose the EEG channels with the highest accuracy, this paper introduces a novel concept of introducing sparsity in a Convolutional Neural Network (CNN) induced modified Common Spatial Pattern (CSP) algorithm. This approach helps develop optimized confusion matrices, which can extensively label the feature map in a significantly lower number of iterations, to predict trends in the growth of symptoms. The concept of compressed sensing is utilized to develop an optimization model for recovering the cosparse signal and retaining maximum information. 
The state-of-the-art Joint Sparsity Induced Modified Common Spatial Pattern Algorithm and Low Rank Optimization Model (SCSP-LROM) can detect the stage and extent of growth of malignant cells, hemorrhages and lesions.","PeriodicalId":268343,"journal":{"name":"2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134447633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}