Big Data | Pub Date: 2023-10-01 | Epub Date: 2023-01-19 | DOI: 10.1089/big.2021.0365
Jiabing Xu, Jiarui Liu, Tianen Yao, Yang Li
{"title":"Prediction and Big Data Impact Analysis of Telecom Churn by Backpropagation Neural Network Algorithm from the Perspective of Business Model.","authors":"Jiabing Xu, Jiarui Liu, Tianen Yao, Yang Li","doi":"10.1089/big.2021.0365","DOIUrl":"10.1089/big.2021.0365","url":null,"abstract":"This study aims to transform the existing telecom operators from traditional Internet operators to digital-driven services, and improve the overall competitiveness of telecom enterprises. Data mining is applied to telecom user classification to process the existing telecom user data through data integration, cleaning, standardization, and transformation. Although the existing algorithms ensure the accuracy of the algorithm on the telecom user analysis platform under big data, they do not solve the limitations of single-machine computing and cannot effectively improve the training efficiency of the model. To solve this problem, this article establishes a telecom customer churn prediction model with the help of the backpropagation neural network (BPNN) algorithm, and deploys the MapReduce programming framework on the Hadoop platform. Using the data of a telecom company, this article analyzes the loss of telecom customers in the big data environment. The research shows that the accuracy of the BPNN telecom customer churn prediction model is 82.12%. After deploying large data sets, the learning and training time of the model is greatly shortened. When the number of nodes is 8, the acceleration ratio of the model remains at 60 seconds. Under big data, the telecom user analysis platform not only ensures the accuracy of the algorithm, but also solves the limitations of single-machine computing and effectively improves the training efficiency of the model. Compared with that of the existing research, the accuracy of the model is improved by 25.36%, and the running time is shortened by about twice. This business model based on the BPNN algorithm has obvious advantages in processing more data sets, and has great reference value for the digital-driven business model transformation of the telecommunications industry.","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"355-368"},"PeriodicalIF":4.6,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10549823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2023-10-01 | Epub Date: 2022-01-06 | DOI: 10.1089/big.2021.0279
Aisha Batool, Muhammad Wasif Nisar, Jamal Hussain Shah, Muhammad Attique Khan, Ahmed A Abd El-Latif
{"title":"iELMNet: Integrating Novel Improved Extreme Learning Machine and Convolutional Neural Network Model for Traffic Sign Detection.","authors":"Aisha Batool, Muhammad Wasif Nisar, Jamal Hussain Shah, Muhammad Attique Khan, Ahmed A Abd El-Latif","doi":"10.1089/big.2021.0279","DOIUrl":"10.1089/big.2021.0279","url":null,"abstract":"Traffic sign detection (TSD) in a real-time environment holds great importance for applications such as automated-driven vehicles. The large variety of traffic signs, with different appearances and spatial representations, causes huge intraclass variation. In this article, an extreme learning machine (ELM), convolutional neural network (CNN), and scale transformation (ST)-based model, called improved extreme learning machine network, is proposed to detect traffic signs in a real-time environment. The proposed model has a custom DenseNet-based novel CNN architecture, an improved version of region proposal networks called accurate anchor prediction model (A2PM), ST, and an ELM module. The CNN architecture makes use of handcrafted features such as scale-invariant feature transform and Gabor to improve the edges of traffic signs. The A2PM minimizes the redundancy among extracted features to make the model efficient, and ST enables the model to detect traffic signs of different sizes. The ELM module enhances the efficiency by reshaping the features. The proposed model is tested on three publicly available data sets: challenging unreal and real environments for traffic sign recognition, Tsinghua-Tencent 100K, and German traffic sign detection benchmark, and achieves average precisions of 93.31%, 95.22%, and 99.45%, respectively. These results prove that the proposed model is more efficient than state-of-the-art sign detection techniques.","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"323-338"},"PeriodicalIF":4.6,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39655008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2023-10-01 | Epub Date: 2023-01-27 | DOI: 10.1089/big.2021.0343
Chen Tao
{"title":"Applications of Bayesian Neural Networks in Outlier Detection.","authors":"Chen Tao","doi":"10.1089/big.2021.0343","DOIUrl":"10.1089/big.2021.0343","url":null,"abstract":"Anomaly detection is crucial in a variety of domains, such as fraud detection, disease diagnosis, and equipment defect detection. With the development of deep learning, anomaly detection with Bayesian neural networks (BNNs) has become a novel research topic in recent years. This article aims to propose a widely applicable method of outlier detection (a category of anomaly detection) using BNNs based on uncertainty measurement. There are three kinds of uncertainties generated in the prediction of BNNs: epistemic uncertainty, aleatoric uncertainty, and (model) misspecification uncertainty. Although the approaches in previous studies are adopted to measure epistemic and aleatoric uncertainty, a new method of utilizing loss functions to quantify misspecification uncertainty is proposed in this article. Then, these three uncertainty sources are merged together by specific combination models to construct total prediction uncertainty. In this study, the key idea is that the observations with high total prediction uncertainty should correspond to outliers in the data. The method of this research is applied to the experiments on the Modified National Institute of Standards and Technology (MNIST) dataset and the Taxi dataset, respectively. From the results, if the network is appropriately constructed and well-trained and model parameters are carefully tuned, most anomalous images in the MNIST dataset and all the abnormal traffic periods in the Taxi dataset can be nicely detected. In addition, the performance of this method is compared with the BNN anomaly detection methods proposed before and the classical Local Outlier Factor and Density-Based Spatial Clustering of Applications with Noise methods. This study links the classification of uncertainties in essence with anomaly detection and takes the lead to consider combining different uncertainty sources to reform detection outcomes instead of using only a single uncertainty each time.","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"369-386"},"PeriodicalIF":4.6,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10681813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2023-10-01 | DOI: 10.1089/big.2023.29062.editorial
Chinmay Chakraborty, Muhammad Khurram Khan
{"title":"Big Data-Driven Futuristic Fabric System in Societal Digital Transformation.","authors":"Chinmay Chakraborty, Muhammad Khurram Khan","doi":"10.1089/big.2023.29062.editorial","DOIUrl":"10.1089/big.2023.29062.editorial","url":null,"abstract":"","PeriodicalId":51314,"journal":{"name":"Big Data","volume":"11 5","pages":"321-322"},"PeriodicalIF":4.6,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41219740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2023-08-21 | DOI: 10.3390/engproc2023038091
Manying Shi, Fang Luo, Hanping Ke, Shiliang Zhang
{"title":"Design and Analysis of Education Personalized Recommendation System under Vision of System Science Communication","authors":"Manying Shi, Fang Luo, Hanping Ke, Shiliang Zhang","doi":"10.3390/engproc2023038091","DOIUrl":"https://doi.org/10.3390/engproc2023038091","url":null,"abstract":"","PeriodicalId":51314,"journal":{"name":"Big Data","volume":"1 1","pages":""},"PeriodicalIF":4.6,"publicationDate":"2023-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90898197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2023-08-03 | DOI: 10.1109/icABCD59051.2023.10220520
A. Periola, K. Ogudo, A. Alonge
{"title":"Realizing the Potential of Stratosphere Utilization via Stratosphere Data Centers","authors":"A. Periola, K. Ogudo, A. Alonge","doi":"10.1109/icABCD59051.2023.10220520","DOIUrl":"https://doi.org/10.1109/icABCD59051.2023.10220520","url":null,"abstract":"The stratosphere is an aeronautical resource whose use is of benefit to the government in delivering aviation services. It also provides a freely cooling environment making it suitable for hosting non-terrestrial data centers. However, the development of a framework enabling the utilization of the stratosphere requires further research attention. The research presents a multientity architecture that describes the role of a stratosphere-bound airport that supports the deployment and use of future stratosphere-based data centers. The solution being presented is intended to increase the operational duration of future deployed stratosphere-based data centers. The focus here is on enhancing the operational duration of the stratosphere-based data center. This is important for its role in future networks. Analysis shows that the proposed solution improved the operational duration by at least 33% and by up to 76% on average.","PeriodicalId":51314,"journal":{"name":"Big Data","volume":"4 1","pages":"1-6"},"PeriodicalIF":4.6,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80701322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2023-08-03 | DOI: 10.1109/icABCD59051.2023.10220494
M. Okwu, I. Emovon, O. J. Oyejide, Kingsley C. Ezekiel, Olaye Messiah, Perpetua C. Jones-Iwuagwu
{"title":"Performance Analysis of a Light Weight Ground Robotic Vehicle by Implementing Adaptive Neuro-Fuzzy Inference System (ANFIS)","authors":"M. Okwu, I. Emovon, O. J. Oyejide, Kingsley C. Ezekiel, Olaye Messiah, Perpetua C. Jones-Iwuagwu","doi":"10.1109/icABCD59051.2023.10220494","DOIUrl":"https://doi.org/10.1109/icABCD59051.2023.10220494","url":null,"abstract":"Automated Guided Vehicles (AGVs) are widely used as delivery agents and for material transportation in factories, hospital environments, and other facilities. Conducting performance tests on AGVs has the potential to ratify and improve the efficiency and reliability of the system. However, published studies on performance analysis focused on classical metrics for such evaluation. In this study, the emphasis is on the performance evaluation of a developed lightweight AGV using the Adaptive Neuro-fuzzy Inference System (ANFIS). The developed line-following AGV is flexible, intelligent, and nifty, and can be accessed wirelessly and controlled by an operator. It was programmed to avoid collision with the help of an attached proximity sensor. The performance test was conducted by drawing black lines on a plain surface for easy navigation of the AGV. A series of experiments was carried out using realistic test variables such as the navigation pattern of the AGV, test accuracy, energy efficiency, obstacle avoidance, task accomplishment, and others. Sensitivity analysis was done using the ANFIS surface plot. The total system intelligence (TSI) values obtained were 76%, 79%, 80%, 81%, 79%, and 81% for the first, second, third, fourth, fifth, and final trials, respectively. The best observed performance, 81%, was obtained in the fourth and sixth trials. The outcome of the investigation reveals that the ANFIS model is an efficient soft computing technique capable of performing TSI tests of AGVs with a high degree of accuracy. The model is also recommended for AGV platooning.","PeriodicalId":51314,"journal":{"name":"Big Data","volume":"52 1","pages":"1-7"},"PeriodicalIF":4.6,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85157357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2023-08-03 | DOI: 10.1109/icABCD59051.2023.10220515
Bhanu Prakash Reddy Banda, Bianca Govan, K. Roy, Kelvin S. Bryant
{"title":"Malware detection using Explainable ML models based on Feature Extraction using API calls","authors":"Bhanu Prakash Reddy Banda, Bianca Govan, K. Roy, Kelvin S. Bryant","doi":"10.1109/icABCD59051.2023.10220515","DOIUrl":"https://doi.org/10.1109/icABCD59051.2023.10220515","url":null,"abstract":"Malware attacks have become a crucial problem in modern life. From 2015 to 2021, about 56.1 billion malware attacks took place in the world. A malware attack typically costs a business over 2.5 million dollars to remediate. According to Cybersecurity Ventures, during the next five years, the cost of cybercrime would increase by 15% yearly, reaching 10.5 trillion USD annually by 2025 from 3 trillion USD in 2015. There is a global epidemic of malware. Studies imply that malware's effects are deteriorating. The main defense against malware tools is malware detectors. Therefore, it is crucial that we research malware detection methods to better comprehend their advantages and disadvantages. This research focuses on an Application Programming Interface (API) call-based malware detection strategy with Machine Learning to further improve malware detection. The limitations that motivated this project were the lack of datasets with new malware samples and the lack of malware detection with good accuracy. The main goal of this research is to understand malware behavior on the Windows platform, use dynamic analysis to identify various aspects or features that have dangerous code patterns from malware samples, and employ various malware and benign samples to construct and validate machine learning-based malware detection models. The data was gathered from publicly accessible sites and sampled using a sandbox approach. Machine Learning models were built using the new dataset. Supervised learning models and deep learning models were applied to the dataset, and the results were then compared and cross-checked to get the best-fit model. This investigation demonstrated the possibility of establishing a high-precision capability for the detection of malware by combining API calls and Machine Learning models. The strategy yielded a high malware detection accuracy of 88.26% (XGBoost model) and 90.70% (MLP classifier) for Windows-based platforms. We have used Explainable Machine Learning, namely the SHapley Additive exPlanations (SHAP) value methods, to demonstrate the important component or feature responsible for the prediction of the model.","PeriodicalId":51314,"journal":{"name":"Big Data","volume":"3 1","pages":"1-7"},"PeriodicalIF":4.6,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85206168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2023-08-03 | DOI: 10.1109/icABCD59051.2023.10220490
Funmilayo S. Moninuola, E. Adetiba, Anthony A. Atayero, A. Awelewa, A. Adeyeye, Oluwadamilola Oshin, J. Ameh, A. Abayomi, Victor Ezekiel
{"title":"Early Detection of Lung Cancer via Breath Analysis Utilising Electronic Nose","authors":"Funmilayo S. Moninuola, E. Adetiba, Anthony A. Atayero, A. Awelewa, A. Adeyeye, Oluwadamilola Oshin, J. Ameh, A. Abayomi, Victor Ezekiel","doi":"10.1109/icABCD59051.2023.10220490","DOIUrl":"https://doi.org/10.1109/icABCD59051.2023.10220490","url":null,"abstract":"Lung cancer (LC) has the highest mortality rate and the second-highest incidence rate of all cancers combined because of a pathophysiological imbalance in the fundamental mechanism of cell proliferation. For patients with LC, prompt diagnosis and treatment are of utmost importance. The orthodox methods employed for detecting LC are characterised by invasiveness, protracted duration, and high cost, and exhibit reduced efficacy in detecting malignant cells during the initial phases of the ailment. The increasing attention of researchers toward the potential of utilising Volatile Organic Compound (VOC) biomarkers for the non-invasive detection of LC can be attributed to the advancements in techniques and procedures. This study offers a state-of-the-art portable E-nose that has the potential to enhance clinical outcomes associated with the early diagnosis of LC. Three ML models (SVM, AdaBoost, and MLP) were employed to discriminate LC from other respiratory breathprint dataset. The MLP model achieved the highest performance, with an accuracy of 89.05%, a specificity of 95.12%, and a sensitivity of 80%.","PeriodicalId":51314,"journal":{"name":"Big Data","volume":"9 1","pages":"1-6"},"PeriodicalIF":4.6,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82427584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}