Big Data Research; Pub Date: 2024-04-16; DOI: 10.1016/j.bdr.2024.100454
Jörg Raab, Yuting Pang, Joan Baaijens, Honggeng Zhou
{"title":"Big Data in organizations: Exploring the adoption of Big Data applications and their impact on organizations in China and the Netherlands","authors":"Jörg Raab , Yuting Pang , Joan Baaijens , Honggeng Zhou","doi":"10.1016/j.bdr.2024.100454","DOIUrl":"10.1016/j.bdr.2024.100454","url":null,"abstract":"<div><p>Digital technology has rapidly been transforming how organizations operate. However, the literature in management studies has only just started to problematize the fundamental inter-relation of digital technology and organizing and we lack sound data about the actual breadth and depth of these changes. This study therefore explores the state of the implementation of Big Data applications in a wide range of organizations in China and the Netherlands and the impact on organizational structures and processes. Our findings show that most organizations are still in an experimental phase at best. We can therefore observe an evolutionary model of technology adoption</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100454"},"PeriodicalIF":3.3,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140796332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning for Tsunami Waves Forecasting Using Regression Trees","authors":"Eugenio Cesario , Salvatore Giampá , Enrico Baglione , Louise Cordrie , Jacopo Selva , Domenico Talia","doi":"10.1016/j.bdr.2024.100452","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100452","url":null,"abstract":"<div><p>After a seismic event, tsunami early warning systems (TEWSs) try to accurately forecast the maximum height of incident waves at specific target points in front of the coast, so that early warnings can be issued for locations where the impact of tsunami waves can be destructive and aid can be delivered to these locations during immediate post-event management. The uncertainty on the forecast can be quantified with ensembles of alternative scenarios. Similarly, in probabilistic tsunami hazard analysis (PTHA) a large number of simulations is required to cover the natural variability of the source process in each location. To improve the accuracy and computational efficiency of tsunami forecasting methods, scientists have recently started to exploit machine learning techniques to process pre-computed simulation data. However, the approaches proposed in the literature, mainly based on neural networks, suffer from long training times and limited model explainability. To overcome these issues, this paper describes a machine learning approach based on regression trees to model and forecast tsunami evolutions. The algorithm takes as input a set of simulations forming an ensemble that describes the potential regional impact of tsunami source scenarios in a given source area, and it provides predictive models to forecast the tsunami waves for other potential tsunami sources in the same area. The experimental evaluation, performed on the 2003 M6.8 Zemmouri-Boumerdes earthquake and tsunami simulation data, shows that regression trees achieve high forecasting accuracy. Moreover, they provide domain experts with fully explainable and interpretable models, which are a valuable support for environmental scientists because they describe underlying rules and patterns behind the models and allow for an explicit inspection of their functioning. This can enable a full and trustworthy exploration of source uncertainty in tsunami early-warning and urgent computing scenarios, with large ensembles of computationally light tsunami simulations.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100452"},"PeriodicalIF":3.3,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2214579624000285/pdfft?md5=942e994d950c715c0c020e511bc26341&pid=1-s2.0-S2214579624000285-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140559033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research; Pub Date: 2024-04-04; DOI: 10.1016/j.bdr.2024.100453
Helen Karatza
{"title":"Scheduling critical periodic jobs with selective partial computations along with gang jobs","authors":"Helen Karatza","doi":"10.1016/j.bdr.2024.100453","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100453","url":null,"abstract":"<div><p>One of the main issues with distributed systems, like clouds, is scheduling complex workloads, which are made up of various job types with distinct features. Gang jobs are one kind of parallel application that these systems support. This paper examines the scheduling of workloads that comprise gangs and critical periodic jobs that can allow for partial computations when necessary to overcome gang job execution. The simulation results shed light on how gang performance is impacted by partial computations of critical jobs. The results also reveal that, under the proposed scheduling scheme, partial computations that take into account the gangs’ degree of parallelism might lower the average response time of gang jobs while keeping the average result precision of the critical jobs at an acceptable level. Additionally, it is observed that as the deviation from the average partial computation increases, the performance improvement due to partial computations increases, with the aforementioned tradeoff remaining significant.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100453"},"PeriodicalIF":3.3,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140547395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research; Pub Date: 2024-03-26; DOI: 10.1016/j.bdr.2024.100451
Anli Yan, Xiaozhang Liu, Wanman Li, Hongwei Ye, Lang Li
{"title":"Explanation-Guided Adversarial Example Attacks","authors":"Anli Yan , Xiaozhang Liu , Wanman Li , Hongwei Ye , Lang Li","doi":"10.1016/j.bdr.2024.100451","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100451","url":null,"abstract":"<div><p>Neural network-based classifiers are vulnerable to adversarial example attacks even in a black-box setting. Existing adversarial example generation technologies mainly rely on optimization-based attacks, which optimize the objective function by iterative input perturbation. Although they can craft adversarial examples, these techniques require large budgets. Recent transfer-based attacks, though requiring only a limited number of queries, suffer from a low attack success rate. In this paper, we propose an adversarial example attack method called MEAttack, which uses model-agnostic explanation technology and can generate adversarial examples more efficiently in the black-box setting with limited queries. The core idea is to design a novel model-agnostic explanation method for target models, and generate adversarial examples based on model explanations. We experimentally demonstrate that MEAttack outperforms the state-of-the-art attack technology, i.e., AutoZOOM. The success rate of MEAttack is 4.54%-47.42% higher than that of AutoZOOM, and its query count is reduced by a factor of 2.6-4.2. Experimental results show that MEAttack is efficient in terms of both attack success rate and query efficiency.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100451"},"PeriodicalIF":3.3,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140347942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research; Pub Date: 2024-03-21; DOI: 10.1016/j.bdr.2024.100450
Shichen Zhai, Xiaoping Lu, Chao Wang, Zhiyu Hong, Jing Shan, Zongmin Ma
{"title":"Correcting inconsistencies in knowledge graphs with correlated knowledge","authors":"Shichen Zhai , Xiaoping Lu , Chao Wang , Zhiyu Hong , Jing Shan , Zongmin Ma","doi":"10.1016/j.bdr.2024.100450","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100450","url":null,"abstract":"<div><p>Knowledge graphs (KGs) have been widely applied for semantic representation and intelligent decision-making. The usefulness and usability of KGs are often limited by their quality. One common issue is the presence of inconsistent assertions in KGs. Inconsistencies in KGs are often caused by the diverse data used to automatically construct large-scale KGs. To improve the quality of KGs, in this paper we investigate how to detect and correct inconsistent triples in KGs. We first identify entity-related inconsistency, relation-related inconsistency and type-related inconsistency. On this basis, we propose a framework for correcting the identified inconsistencies, which combines candidate generation, link prediction and constraint validation. We evaluate the proposed correction framework on the real-world dataset FB15k (from Freebase). The promising results confirm the capability of our framework in correcting the inconsistencies of knowledge graphs.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100450"},"PeriodicalIF":3.3,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140328544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research; Pub Date: 2024-03-20; DOI: 10.1016/j.bdr.2024.100449
Zehua Liu, Jiuhao Li, Mahmood Ashraf, M.S. Syam, Muhammad Asif, Emad Mahrous Awwad, Muna Al-Razgan, Uzair Aslam Bhatti
{"title":"Remote sensing-enhanced transfer learning approach for agricultural damage and change detection: A deep learning perspective","authors":"Zehua Liu , Jiuhao Li , Mahmood Ashraf , M.S. Syam , Muhammad Asif , Emad Mahrous Awwad , Muna Al-Razgan , Uzair Aslam Bhatti","doi":"10.1016/j.bdr.2024.100449","DOIUrl":"10.1016/j.bdr.2024.100449","url":null,"abstract":"<div><p>With the continuous advancement of science and technology, there has been a growing awareness of safety among people worldwide. Natural disasters such as wildfires, earthquakes, and floods pose persistent threats to both lives and property on our planet, which serves as our fundamental habitat. While it is impossible to prevent or entirely avert these calamities, rapid identification of affected areas and prompt damage assessment post-disaster can significantly aid in the formulation of effective rescue strategies, ultimately saving more lives. This article delves into the application of transfer learning in satellite image damage assessment—a methodology that involves transferring previously acquired knowledge to enhance a model's adaptability to new tasks. Given the limited availability of datasets for satellite image analysis, transfer learning proves to be an effective approach. Specifically, the study proposes a transfer learning method based on YOLOv5 for satellite image damage assessment. Initially, a general convolutional neural network model is trained using a substantial dataset of natural images. Subsequently, the early layers of this model are frozen, while the later layers undergo training to adapt to satellite image data. Fine-tuning is then employed to further enhance the overall model performance. The results demonstrate that this approach yields a high accuracy rate in satellite image damage assessment. Moreover, compared to conventional deep learning methods, the proposed method effectively leverages pre-trained models' knowledge, thereby reducing data dependency. Additionally, it displays robust generalization capabilities across diverse tasks and datasets, underscoring its potential for facilitating transfer learning across various domains.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100449"},"PeriodicalIF":3.3,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140275813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research; Pub Date: 2024-03-20; DOI: 10.1016/j.bdr.2024.100448
Min Peng, Yunxiang Liu, Asad Khan, Bilal Ahmed, Subrata K. Sarker, Yazeed Yasin Ghadi, Uzair Aslam Bhatti, Muna Al-Razgan, Yasser A. Ali
{"title":"Crop monitoring using remote sensing land use and land change data: Comparative analysis of deep learning methods using pre-trained CNN models","authors":"Min Peng , Yunxiang Liu , Asad Khan , Bilal Ahmed , Subrata K. Sarker , Yazeed Yasin Ghadi , Uzair Aslam Bhatti , Muna Al-Razgan , Yasser A. Ali","doi":"10.1016/j.bdr.2024.100448","DOIUrl":"10.1016/j.bdr.2024.100448","url":null,"abstract":"<div><p>In the context of the rapidly evolving climate dynamics of the early twenty-first century, the interplay between climate change and biospheric integrity is becoming increasingly critical. The pervasive impact of climate change on ecosystems is manifested not only through alterations in average environmental conditions and their variability but also through ancillary shifts such as escalated oceanic acidification and heightened atmospheric CO<sub>2</sub> levels. These climatic transformations are further compounded by concurrent ecological stressors, including habitat degradation, defaunation, and fragmentation. Against this backdrop, this study delves into the efficacy of advanced deep learning methodologies for the classification of land cover from satellite imagery, with a particular emphasis on agricultural crop monitoring. The study leverages state-of-the-art pre-trained Convolutional Neural Network (CNN) architectures, namely VGG16, MobileNetV2, DenseNet121, and ResNet50, selected for their architectural sophistication and proven competence in image recognition domains. The research framework encompasses a comprehensive data preparation phase incorporating augmentation techniques, a thorough exploratory data analysis to pinpoint and address class imbalances through the computation of class weights, and the strategic fine-tuning of CNN architectures with tailored classification layers to suit the specificities of land cover classification challenges. The models' performance was rigorously evaluated against benchmarks of accuracy and loss, both during the training phase and on validation datasets, with preventative strategies against overfitting, such as early stopping and adaptive learning rate modifications, being integral to the methodology. The findings illuminate the considerable potential of leveraging pre-trained deep learning models for remote sensing in agriculture, demonstrating that advanced CNN architectures, particularly DenseNet121 and ResNet50, are notably effective in enhancing crop type classification accuracy from satellite imagery. This study contributes valuable insights to the field of precision agriculture, advocating for the integration of sophisticated image recognition technologies to bolster crop monitoring efficacy, thereby enabling more nuanced agricultural decision-making and resource allocation.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100448"},"PeriodicalIF":3.3,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140282143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research; Pub Date: 2024-02-27; DOI: 10.1016/j.bdr.2024.100446
Christian Callegari, Stefano Giordano, Michele Pagano
{"title":"A Real Time Deep Learning Based Approach for Detecting Network Attacks","authors":"Christian Callegari, Stefano Giordano, Michele Pagano","doi":"10.1016/j.bdr.2024.100446","DOIUrl":"10.1016/j.bdr.2024.100446","url":null,"abstract":"<div><p>Anomaly-based Intrusion Detection is a key research topic in network security due to its ability to face unknown attacks and new security threats. For this reason, many works on the topic have been proposed in the last decade. Nonetheless, an ultimate solution, able to provide a high detection rate with an acceptable false alarm rate, has still to be identified. In recent years, major research efforts have focused on the application of Deep Learning techniques to the field, but no work has been able, so far, to propose a system achieving good detection performance while processing raw network traffic in real time. For this reason, in this paper we propose an Intrusion Detection System that, leveraging probabilistic data structures and Deep Learning techniques, is able to process in real time the traffic collected in a backbone network, offering <em>excellent</em> detection performance and a low false alarm rate. Indeed, the extensive experimental tests, run to validate our system and compare different Deep Learning techniques, confirm that, with a proper parameter setting, we can achieve a detection rate of about 92%, with an accuracy of 0.899. Finally, with minimal changes, the proposed system can provide some information about the kind of anomaly, although in the multi-class scenario the detection rate is slightly lower (around 86%).</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100446"},"PeriodicalIF":3.3,"publicationDate":"2024-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2214579624000224/pdfft?md5=bbd19915547bc28f9b5784f2f0ddcb21&pid=1-s2.0-S2214579624000224-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140004622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research; Pub Date: 2024-02-23; DOI: 10.1016/j.bdr.2024.100447
Yandong Li, Bo Jiang, Long Zeng, Chenglong Li
{"title":"An Integration visual navigation algorithm for urban air mobility","authors":"Yandong Li, Bo Jiang, Long Zeng, Chenglong Li","doi":"10.1016/j.bdr.2024.100447","DOIUrl":"10.1016/j.bdr.2024.100447","url":null,"abstract":"<div><p>This paper presents an integrated visual navigation algorithm called PnP-ORBSLAM for UAV position estimation in Urban Air Mobility (UAM). ORBSLAM is a popular benchmark algorithm for vision-based navigation applications. The proposed method improves the performance of ORBSLAM by adding a post-processing marker recognition phase to the model. Based on the features extracted from the markers, the PnP algorithm is introduced to estimate the position of the monocular camera. The position estimation accuracy of the UAV is expected to improve when the position information of the camera is added to the model. Experiments are carried out on the AirSim simulation platform. Results show that the PnP-ORBSLAM algorithm can improve the three-dimensional accuracy by a margin of 5.38% compared with ORBSLAM. In addition, the processing speed of the proposed method can reach about 28 frames per second, which means that the PnP-ORBSLAM algorithm can work in real time.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100447"},"PeriodicalIF":3.3,"publicationDate":"2024-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139949248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research; Pub Date: 2024-02-21; DOI: 10.1016/j.bdr.2024.100443
Ahmad Bilal, Hamid Turab Mirza, Ibrar Hussain, Adnan Ahmad
{"title":"Investigating Influence of Google-Play Application Titles on Success","authors":"Ahmad Bilal , Hamid Turab Mirza , Ibrar Hussain , Adnan Ahmad","doi":"10.1016/j.bdr.2024.100443","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100443","url":null,"abstract":"<div><p>The title (name) is the primary information related to a mobile (smartphone) application, as it describes its functions and services. An eye-catching title can entice customers to choose a certain application over others. Application development companies are well aware of this phenomenon and invest significant effort in crafting their application titles with compelling keywords, phrases and topics in pursuit of higher installs. However, to the best of our knowledge, traditional literature that investigates the impact of application titles on success is limited. There may be only a few instances where scientific (data-analytical) approaches have been used to examine application titles. Moreover, these investigations of titles are dominated by supervised learning, and traditional literature may lack any unsupervised (cluster) data analysis techniques to measure the impact of titles on application success. Therefore, this research work proposes an unsupervised data analysis approach based on multiple layers and algorithms. The initial layer clusters the application titles, the subsequent layer extracts various textual features from these clusters and the final layer refines the extracted attributes. In general, certain textual features in the titles are shown to be positively or negatively linked with application installs. Verification of the results has confirmed that this proposed approach can successfully detect the most prominent features from application titles (textual data) that correlate with success.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100443"},"PeriodicalIF":3.3,"publicationDate":"2024-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139935737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}