{"title":"Restricted Randomness DBSCAN : A faster DBSCAN Algorithm","authors":"Sashakt Pathak, Arushi Agarwal, Ankita Ankita, M. Gurve","doi":"10.1145/3474124.3474204","DOIUrl":"https://doi.org/10.1145/3474124.3474204","url":null,"abstract":"Data Mining is the process of extracting useful and accurate information or patterns from large databases using different algorithms and methods of machine learning. Clustering is one method of analyzing data, in which similar data points are grouped together, and the DBSCAN clustering algorithm is widely used in numerous practical applications. This paper presents a more efficient density-based clustering algorithm that discovers clusters faster than the existing DBSCAN algorithm. The efficiency is achieved by restricting the randomness of choosing points from the dataset. Our proposed algorithm, named Restricted Randomness DBSCAN (RR DBSCAN), is compared with the conventional DBSCAN algorithm over 9 datasets on the basis of Silhouette Coefficient, time taken to form clusters, and accuracy. The results show that RR DBSCAN performs better than traditional DBSCAN in terms of accuracy and time taken to form clusters.","PeriodicalId":144611,"journal":{"name":"2021 Thirteenth International Conference on Contemporary Computing (IC3-2021)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125317004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Harris Hawk Whale Optimization Algorithm for Enhancing Neural Networks","authors":"Parul Agarwal, Naima Farooqi, Aditya Gupta, S. Mehta, Saransh Khandelwal","doi":"10.1145/3474124.3474149","DOIUrl":"https://doi.org/10.1145/3474124.3474149","url":null,"abstract":"Training artificial neural networks is considered one of the more burdensome challenges for researchers. The main difficulty in training neural networks is their nonlinear nature and their unknown controlling parameters, such as weights and biases. Slow convergence and entrapment in local optima are drawbacks of neural network training algorithms. To overcome these drawbacks, this work proposes a hybrid of Harris hawk optimization with a whale optimization algorithm to train the neural network. Harris hawk optimization is a metaheuristic evolutionary algorithm and is used here to optimize the weights and biases of neural networks. The efficacy of the proposed algorithm is assessed by evaluating it on different kinds of cancer datasets and on other datasets such as fraud and banknote authentication. The experimental results demonstrate that the proposed algorithm performs better than its contemporary counterparts.","PeriodicalId":144611,"journal":{"name":"2021 Thirteenth International Conference on Contemporary Computing (IC3-2021)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125960586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MABTriage: Multi Armed Bandit Triaging Model Approach","authors":"Neetu Singh, S. Singh","doi":"10.1145/3474124.3474194","DOIUrl":"https://doi.org/10.1145/3474124.3474194","url":null,"abstract":"Recommending bugs to appropriate developers about whom we have little or no information is a challenging problem faced in many open-source developer communities. In most reported work, this bug-triaging problem is handled through popular machine learning algorithms. However, in the absence of sufficient information about either a developer or a bug, it is difficult to build, train and test a conventional machine-learning model. One possible solution in such a scenario is a reinforcement-learning model. In this paper, we propose an approach called MABTriage to help a triager assign bugs to developers under uncertainty. To the best of our knowledge, it is the first work that formulates the bug-triaging process as a multi-armed bandit (MAB) problem. Experiments conducted on five publicly available open-source datasets show that the MABTriage approach performs better than random selection. We have also evaluated the performance of six MAB algorithms (ε-Greedy, ε-Decay, Softmax, Thompson Sampling, Optimistic Agent and UCB) based on cumulative rewards. Results have shown that all five performed well in comparison to random selection.","PeriodicalId":144611,"journal":{"name":"2021 Thirteenth International Conference on Contemporary Computing (IC3-2021)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123718225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GATT: A Genetic Algorithm-based Tool for Automating Timetable Scheduling at Netaji Subhas University of Technology","authors":"Harnirvair Singh, R. Sibal, Sulabh Tyagi","doi":"10.1145/3474124.3474170","DOIUrl":"https://doi.org/10.1145/3474124.3474170","url":null,"abstract":"This paper describes the design of a Genetic Algorithm based Time Table Scheduling Tool for Netaji Subhas University of Technology (NSUT) based on courses, faculties, classrooms, and slots. The tool has an integrated database for storing data. The tool is named the Genetic Algorithm (GA) based TimeTabling (GATT) tool and is web-based. Both hard and soft constraints are incorporated. The hard constraints are enforced as mandatory so that all hard conflicts are avoided. Then, out of all feasible solutions, the goal is to maximize soft fitness scores by minimizing the number of soft conflicts. The GATT tool schedules courses after a series of iterations and the results are stored in a database. The final output is openly accessible from the web portal, while modifications, if any, can be made only by authorized personnel.","PeriodicalId":144611,"journal":{"name":"2021 Thirteenth International Conference on Contemporary Computing (IC3-2021)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124788226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effect of stationarity on traditional machine learning models: Time series analysis","authors":"Ankit Dixit, Shikhar Jain","doi":"10.1145/3474124.3474167","DOIUrl":"https://doi.org/10.1145/3474124.3474167","url":null,"abstract":"Recently, researchers have started the analysis of time series data. In time series data, it is difficult to apply prediction and forecasting techniques effectively. This research work examines how the stationarity of time series data affects accuracy and forecasting errors. Here, we first categorize the datasets by their stationarity type. Then some state-of-the-art models are applied to these datasets. Results show that the accuracy and forecasting error of traditional models become extremely vulnerable when datasets belong to the non-stationary category. Stationarity tests and experiments are performed on different kinds of benchmark datasets and the results are analyzed.","PeriodicalId":144611,"journal":{"name":"2021 Thirteenth International Conference on Contemporary Computing (IC3-2021)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125038254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Platforms for Edge Computing and Internet of Things applications: A survey","authors":"Daniel Balouek-Thomert, M. Parashar","doi":"10.1145/3474124.3474143","DOIUrl":"https://doi.org/10.1145/3474124.3474143","url":null,"abstract":"The Internet of Things fosters an emerging class of analytics that connects sensors, vehicles, industries, and consumers through the internet to enable scientific and industrial applications. These services require a large computing capacity to perform well, while often being under the constraints to move data from the edge of the network to the cloud. Also, they require system support to program reactions that occur at runtime, especially when the target infrastructure capacities and capabilities are unknown during the design. The core of this survey is a comprehensive review of existing components for Edge Computing and Internet of Things applications. In recent years, the landscape of the edge middleware platforms has grown exponentially with more than a hundred available solutions in academia and industry. Such platforms are required to provide necessary components for sensor registration, resource discovery, workflow composition, and data processing. In this regard, we surveyed existing solutions through the lens of a simplified three-layer architecture for the design of edge-based middleware, along with design goals for each of the proposed layers. The paper concludes with some open challenges and possible future research directions.","PeriodicalId":144611,"journal":{"name":"2021 Thirteenth International Conference on Contemporary Computing (IC3-2021)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131791496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging Spatial Information in Smart Grids using STGCN for Short-Term Load Forecasting","authors":"C. Cheung, S. Kuppannagari, R. Kannan, V. Prasanna","doi":"10.1145/3474124.3474145","DOIUrl":"https://doi.org/10.1145/3474124.3474145","url":null,"abstract":"The problem of predicting the behaviour of energy consumers (loads) in the next few intervals, known as Short-Term Load Forecasting (STLF), is critical to the success of several grid operations. Prediction at lower aggregation levels is difficult due to the high volatility of the data. Smart grid operations, and in turn any data generated as a result of them, exhibit high spatial correlations imposed by the topology of the power distribution network as well as other latent factors such as similarity in neighborhood, socio-economic status, etc. While temporal information is usually leveraged in neural network structures like Recurrent or Convolutional Layers, the use of spatial information in load forecasting has not been explored. In this paper, we develop a Spatial-Temporal Graph Convolutional Network (STGCN) model for the problem of Short-Term Load Forecasting in Smart Grids. STGCNs specialize in capturing both spatial and temporal correlations in the data to obtain more accurate predictions. We also show that our model, by capturing both spatial and temporal correlations, is more robust to missing data than state-of-the-art prediction models. We perform a detailed evaluation on a dataset based in Iowa, US with real power at a low aggregation level (5 to 10 customers per datapoint) and show that our model predicts real load consumption 3 hours ahead with a Mean Absolute Error 7.54% lower than the best-performing baseline model, and a Root Mean Squared Error (RMSE) as much as 38.72% lower if the data has missing entries.","PeriodicalId":144611,"journal":{"name":"2021 Thirteenth International Conference on Contemporary Computing (IC3-2021)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132732734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Video Watermarking in Frequency Domain for Copyright Protection","authors":"Roop Singh, Alaknanda Ashok, M. Saraswat","doi":"10.1145/3474124.3474148","DOIUrl":"https://doi.org/10.1145/3474124.3474148","url":null,"abstract":"The problem of video piracy can be addressed by video watermarking. Therefore, this paper presents a robust video watermarking technique for copyright protection based on redundant discrete wavelet transform (RDWT) and Schur transform. A grayscale watermark image is concealed into the 1-level LL sub-band of the video frame’s luminance channel (Y). The metrics peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and normalized cross-correlation (NC) are used to assess imperceptibility and robustness. The experimental results validate the performance of the proposed method.","PeriodicalId":144611,"journal":{"name":"2021 Thirteenth International Conference on Contemporary Computing (IC3-2021)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133746951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting Machine Learning Training Process for Enhanced Data Privacy","authors":"Adit Goyal, Vikas Hassija, V. Albuquerque","doi":"10.1145/3474124.3474208","DOIUrl":"https://doi.org/10.1145/3474124.3474208","url":null,"abstract":"The increasing use of machine learning algorithms for nearly every aspect of our lives has brought a new challenge to the forefront, one of user-privacy. Once the data has been shared by the user online, it is difficult to revoke the access of that data if it has already been used to train the model. For any personal data, every user should reserve the right for the data to be forgotten. To solve the above-mentioned problem, a few frameworks have been introduced recently to achieve machine unlearning or inverse learning. Although there is no specific definition of forgetting in DNNs (deep neural networks) yet, our focus will be on selectively forgetting a subset of data belonging to a class, which was initially used to train the model, without the need of re-training from scratch, nor using the initial training data. This method scrubs the weights clean of the data that needs to be forgotten. Concepts for the stability of stochastic gradient descent and differential privacy are exploited in this approach to address the problem of selective forgetting in DNNs.","PeriodicalId":144611,"journal":{"name":"2021 Thirteenth International Conference on Contemporary Computing (IC3-2021)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123788474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hide and Seek Game: A Machine Learning Approach for Detecting Malicious Samples in Analysis Environment","authors":"S. Anandaram, Ashik Mathew, A. Jyothish, P. Vinod, F. Mercaldo","doi":"10.1145/3474124.3474211","DOIUrl":"https://doi.org/10.1145/3474124.3474211","url":null,"abstract":"In this work, we investigate whether malware understands the analysis environment. This analysis is carried out by executing a set of real malicious programs and benign samples on virtual and native machines. The result of execution is an API sequence collected independently from virtual machines and host systems. In order to enhance the detection rate and accuracy, we have introduced four feature selection techniques. We thus found that feature reduction methods enhance the detection rate to a considerable extent. The experimental study showed that, when classifying malware and benign samples in virtual machines, most of the samples are misclassified, giving a clear indication that many malware samples remain dormant on identifying a sandbox environment.","PeriodicalId":144611,"journal":{"name":"2021 Thirteenth International Conference on Contemporary Computing (IC3-2021)","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124768354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}