Big Data, Pub Date: 2024-10-23, DOI: 10.1089/big.2023.0131
Qi Ouyang, Hongchang Chen, Shuxin Liu, Liming Pu, Dongdong Ge, Ke Fan
{"title":"DMHANT: DropMessage Hypergraph Attention Network for Information Propagation Prediction.","authors":"Qi Ouyang, Hongchang Chen, Shuxin Liu, Liming Pu, Dongdong Ge, Ke Fan","doi":"10.1089/big.2023.0131","DOIUrl":"https://doi.org/10.1089/big.2023.0131","url":null,"abstract":"<p><p>Predicting propagation cascades is crucial for understanding information propagation in social networks. Existing methods typically focus on the structure or order of infected users in a single cascade sequence, ignoring the global dependencies of cascades and users, which is insufficient to characterize their dynamic interaction preferences. Moreover, existing methods are poor at addressing the problem of model robustness. To address these issues, we propose a prediction model named DropMessage Hypergraph Attention Networks, which constructs a hypergraph based on the cascade sequence. Specifically, to dynamically obtain user preferences, we divide the diffusion hypergraph into multiple subgraphs according to the time stamps, develop hypergraph attention networks to explicitly learn complete interactions, and adopt a gated fusion strategy to connect them for user cascade prediction. In addition, a new dropping method, DropMessage, is added to increase the robustness of the model. 
Experimental results on three real-world datasets indicate that the proposed model significantly outperforms the most advanced information propagation prediction model in both MAP@k and Hits@K metrics, and the experiments also show that the model achieves better prediction performance than the existing model under data perturbation.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142512575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Maximizing Influence in Social Networks Using Combined Local Features and Deep Learning-Based Node Embedding.","authors":"Asgarali Bouyer, Hamid Ahmadi Beni, Amin Golzari Oskouei, Alireza Rouhi, Bahman Arasteh, Xiaoyang Liu","doi":"10.1089/big.2023.0117","DOIUrl":"https://doi.org/10.1089/big.2023.0117","url":null,"abstract":"<p><p>The influence maximization problem has several issues, including low infection rates and high time complexity. Many proposed methods are not suitable for large-scale networks due to their time complexity or free parameter usage. To address these challenges, this article proposes a local heuristic called Embedding Technique for Influence Maximization (ETIM) that uses shell decomposition, graph embedding, and reduction, as well as combined local structural features. The algorithm selects candidate nodes based on their connections among network shells and topological features, reducing the search space and computational overhead. It uses a deep learning-based node embedding technique to create a multidimensional vector of candidate nodes and calculates the dependency on spreading for each node based on local topological features. Finally, influential nodes are identified using the results of the previous phases and newly defined local features. The proposed algorithm is evaluated using the independent cascade model, showing its competitiveness and ability to achieve the best performance in terms of solution quality. 
Compared with the collective influence global algorithm, ETIM is significantly faster and improves the infection rate by an average of 12%.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142480288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data, Pub Date: 2024-10-01, Epub Date: 2021-12-13, DOI: 10.1089/big.2021.0176
R Thenmozhi, S Shridevi, Sachi Nandan Mohanty, Vicente García-Díaz, Deepak Gupta, Prayag Tiwari, Mohammad Shorfuzzaman
{"title":"Attribute-Based Adaptive Homomorphic Encryption for Big Data Security.","authors":"R Thenmozhi, S Shridevi, Sachi Nandan Mohanty, Vicente García-Díaz, Deepak Gupta, Prayag Tiwari, Mohammad Shorfuzzaman","doi":"10.1089/big.2021.0176","DOIUrl":"10.1089/big.2021.0176","url":null,"abstract":"<p><p>There is a drastic increase in Internet usage across the globe, thanks to mobile phone penetration. This extreme Internet usage generates huge volumes of data, in other terms, big data. Security and privacy are the main issues to be considered in big data management. Hence, in this article, Attribute-based Adaptive Homomorphic Encryption (AAHE) is developed to enhance the security of big data. In the proposed methodology, Oppositional Based Black Widow Optimization (OBWO) is introduced to select the optimal key parameters by following the AAHE method. By considering oppositional function, Black Widow Optimization (BWO) convergence analysis was enhanced. The proposed methodology has different processes, namely, process setup, encryption, and decryption processes. The researcher evaluated the proposed methodology with non-abelian rings and the homomorphism process in ciphertext format. Further, it is also utilized in improving one-way security related to the conjugacy examination issue. Afterward, homomorphic encryption is developed to secure the big data. The study considered two types of big data such as adult datasets and anonymous Microsoft web datasets to validate the proposed methodology. With the help of performance metrics such as encryption time, decryption time, key size, processing time, downloading, and uploading time, the proposed method was evaluated and compared against conventional cryptography techniques such as Rivest-Shamir-Adleman (RSA) and Elliptic Curve Cryptography (ECC). Further, the key generation process was also compared against conventional methods such as BWO, Particle Swarm Optimization (PSO), and Firefly Algorithm (FA). 
The results established that the proposed method is superior to the compared methods and can be applied in real time in the near future.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39718084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data, Pub Date: 2024-10-01, Epub Date: 2022-02-02, DOI: 10.1089/big.2021.0251
Fei Dai, Pengfei Cao, Penggui Huang, Qi Mo, Bi Huang
{"title":"Hybrid Deep Learning Approach for Traffic Speed Prediction.","authors":"Fei Dai, Pengfei Cao, Penggui Huang, Qi Mo, Bi Huang","doi":"10.1089/big.2021.0251","DOIUrl":"10.1089/big.2021.0251","url":null,"abstract":"<p><p>Traffic speed prediction plays a fundamental role in traffic management and driving route planning. However, timely and accurate traffic speed prediction is challenging as it is affected by complex spatial and temporal correlations. Most existing works cannot simultaneously model spatial and temporal correlations in traffic data, resulting in unsatisfactory prediction performance. In this article, we propose a novel hybrid deep learning approach, named HDL4TSP, to predict traffic speed in each region of a city, which consists of an input layer, a spatial layer, a temporal layer, a fusion layer, and an output layer. Specifically, first, the spatial layer employs graph convolutional networks to capture spatial near dependencies and spatial distant dependencies in the spatial dimension. Second, the temporal layer employs convolutional long short-term memory (ConvLSTM) networks to model closeness, daily periodicity, and weekly periodicity in the temporal dimension. Third, the fusion layer designs a fusion component to merge the outputs of ConvLSTM networks. 
Finally, we conduct extensive experiments, and the results show that HDL4TSP outperforms four baselines on two real-world data sets.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39880866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data, Pub Date: 2024-10-01, Epub Date: 2022-06-14, DOI: 10.1089/big.2021.0268
Muhammad Basit Umair, Zeshan Iqbal, Muhammad Ahmad Faraz, Muhammad Attique Khan, Yu-Dong Zhang, Navid Razmjooy, Sefedine Kadry
{"title":"A Network Intrusion Detection System Using Hybrid Multilayer Deep Learning Model.","authors":"Muhammad Basit Umair, Zeshan Iqbal, Muhammad Ahmad Faraz, Muhammad Attique Khan, Yu-Dong Zhang, Navid Razmjooy, Sefedine Kadry","doi":"10.1089/big.2021.0268","DOIUrl":"10.1089/big.2021.0268","url":null,"abstract":"<p><p>An intrusion detection system (IDS) is designed to detect and analyze network traffic for suspicious activity. Several methods have been introduced in the literature for IDSs; however, due to a large amount of data, these models have failed to achieve high accuracy. A statistical approach is proposed in this research due to the unsatisfactory results of traditional intrusion detection methods. The features are extracted and selected using a multilayer convolutional neural network, and a softmax classifier is employed to classify the network intrusions. To perform further analysis, a multilayer deep neural network is also applied to classify network intrusions. Furthermore, the experiments are performed using two commonly used benchmark intrusion detection datasets: NSL-KDD and KDDCUP'99. The performance of the proposed model is evaluated using four performance metrics: accuracy, recall, F1-score, and precision. The experimental results show that the proposed approach achieved better accuracy (99%) compared with other IDSs.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47174126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data, Pub Date: 2024-10-01, Epub Date: 2022-04-11, DOI: 10.1089/big.2021.0301
Juyan Li, Jialiang Peng, Zhiqi Qiao
{"title":"A Ring Learning with Errors-Based Ciphertext-Policy Attribute-Based Proxy Re-Encryption Scheme for Secure Big Data Sharing in Cloud Environment.","authors":"Juyan Li, Jialiang Peng, Zhiqi Qiao","doi":"10.1089/big.2021.0301","DOIUrl":"10.1089/big.2021.0301","url":null,"abstract":"<p><p>Owing to the huge volume of big data, users generally use the cloud to store big data. However, because the data are out of the control of users, sensitive data need to be protected. The ciphertext-policy attribute-based encryption scheme can not only effectively control the access of big data, but also decrypt the ciphertext as long as the user's attributes satisfy the access structure of ciphertext, so as to realize one-to-many big data sharing. When the user's attributes do not satisfy the access structure of ciphertext, the attribute-based proxy re-encryption scheme can be used for big data sharing. The ciphertext-policy attribute-based proxy re-encryption (CP-ABPRE) scheme combines the characteristics of the ciphertext-policy attribute-based encryption scheme and proxy re-encryption scheme. In a CP-ABPRE scheme, on the one hand, the data owner can use the ciphertext-policy attribute-based encryption scheme to encrypt the big data for cloud storage, to realize the access control of the big data. On the other hand, the proxy (cloud service provider) can convert ciphertext under one access structure into ciphertext under another access structure, thus realizing big data sharing between users of different attribute sets. In this article, we modify the existing attribute-based encryption scheme based on Ring Learning With Errors (RLWE), add a re-encryption key generation algorithm, a re-encryption ciphertext generation algorithm, and a re-encryption ciphertext decryption algorithm, and construct a CP-ABPRE scheme. In the construction of the re-encryption key, we introduce a random vector and hide the vector in the key by threshold technology. 
Finally, a CP-ABPRE scheme supporting threshold access structure is constructed based on RLWE. Compared with the existing attribute-based proxy re-encryption schemes, our scheme has smaller public parameters, can encrypt multiple plaintext bits at a time, and can resist selective access structure and chosen plaintext attack, so it is more suitable for big data sharing in cloud environment.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45347288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data, Pub Date: 2024-10-01, Epub Date: 2023-08-01, DOI: 10.1089/big.2021.0473
Dibin Shan, Xuehui Du, Wenjuan Wang, Aodi Liu, Na Wang
{"title":"A Weighted GraphSAGE-Based Context-Aware Approach for Big Data Access Control.","authors":"Dibin Shan, Xuehui Du, Wenjuan Wang, Aodi Liu, Na Wang","doi":"10.1089/big.2021.0473","DOIUrl":"10.1089/big.2021.0473","url":null,"abstract":"<p><p>Context information is the key element to realizing dynamic access control of big data. However, existing context-aware access control (CAAC) methods do not support automatic context awareness and cannot automatically model and reason about context relationships. To solve these problems, this article proposes a weighted GraphSAGE-based context-aware approach for big data access control. First, graph modeling is performed on the access record data set, transforming the access control context-awareness problem into a graph neural network (GNN) node learning problem. Then, a GNN model WGraphSAGE is proposed to achieve automatic context awareness and automatic generation of CAAC rules. Finally, weighted neighbor sampling and weighted aggregation algorithms are designed for the model to realize automatic modeling and reasoning of node relationships and relationship strengths simultaneously in the graph node learning process. The experiment results show that the proposed method has obvious advantages in context awareness and context relationship reasoning compared with similar GNN models. 
Meanwhile, it obtains better results in dynamic access control decisions than the existing CAAC models.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9922924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data, Pub Date: 2024-08-01, Epub Date: 2024-07-31, DOI: 10.1089/big.2024.59218.kpa
Farhad Pourkamali-Anaraki
{"title":"Special Issue: Big Scientific Data and Machine Learning in Science and Engineering.","authors":"Farhad Pourkamali-Anaraki","doi":"10.1089/big.2024.59218.kpa","DOIUrl":"10.1089/big.2024.59218.kpa","url":null,"abstract":"","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141857096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data, Pub Date: 2024-08-01, Epub Date: 2023-03-22, DOI: 10.1089/big.2022.0050
Vijay Srinivas Tida, Sonya Hsu, Xiali Hei
{"title":"A Unified Training Process for Fake News Detection Based on Finetuned Bidirectional Encoder Representation from Transformers Model.","authors":"Vijay Srinivas Tida, Sonya Hsu, Xiali Hei","doi":"10.1089/big.2022.0050","DOIUrl":"10.1089/big.2022.0050","url":null,"abstract":"<p><p>An efficient fake news detector becomes essential as the accessibility of social media platforms increases rapidly. Previous studies mainly focused on designing the models solely based on individual data sets and might suffer from degraded performance. Therefore, developing a robust model for a combined data set with diverse knowledge becomes crucial. However, designing the model with a combined data set requires extensive training time and sequential workload to obtain optimal performance without having some prior knowledge about the model's parameters. The present study helps address these issues by introducing a unified training strategy that derives a base structure for the classifier and all hyperparameters from individual models using a pretrained transformer model. The performance of the proposed model is noted using three publicly available data sets, namely ISOT and others from the Kaggle website. The results indicate that the proposed unified training strategy surpassed the existing models such as Random Forests, convolutional neural networks, and long short-term memory, with 97% accuracy and an F1 score of 0.97. Furthermore, there was a significant reduction in training time, by almost 1.5 to 1.8×, from removing words shorter than three letters from the input samples. We also did extensive performance analysis by varying the number of encoder blocks to build compact models and trained on the combined data set. 
The obtained results show that reducing the number of encoder blocks lowered performance.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9150389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data, Pub Date: 2024-08-01, Epub Date: 2023-09-04, DOI: 10.1089/big.2022.0086
Derya Turfan, Bulent Altunkaynak, Özgür Yeniay
{"title":"A New Filter Approach Based on Effective Ranges for Classification of Gene Expression Data.","authors":"Derya Turfan, Bulent Altunkaynak, Özgür Yeniay","doi":"10.1089/big.2022.0086","DOIUrl":"10.1089/big.2022.0086","url":null,"abstract":"<p><p>Over the years, many studies have been carried out to reduce and eliminate the effects of diseases on human health. Gene expression data sets play a critical role in diagnosing and treating diseases. These data sets consist of thousands of genes and a small number of samples. This situation creates the curse of dimensionality and it becomes problematic to analyze such data sets. One of the most effective strategies to solve this problem is feature selection. Feature selection is a preprocessing step that improves classification accuracy by selecting the most relevant and informative features. In this article, we propose a new statistically based filter method for feature selection, named Effective Range-based Feature Selection Algorithm (FSAER). As an extension of the previous Effective Range based Gene Selection (ERGS) and Improved Feature Selection based on Effective Range (IFSER) algorithms, our novel method includes the advantages of both methods while taking into account the disjoint area. To illustrate the efficacy of the proposed algorithm, the experiments have been conducted on six benchmark gene expression data sets. The results of the FSAER and the other filter methods have been compared in terms of classification accuracies to demonstrate the effectiveness of the proposed method. 
For classification methods, support vector machines, naive Bayes classifier, and k-nearest neighbor algorithms have been used.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10211345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}