Big Data Research, Pub Date: 2024-04-23, DOI: 10.1016/j.bdr.2024.100460
Aiguo Wang, Jun Wang, Haiming Li, Jian Hu, Haiyuan Zhou, Xinyu Zhang, Xuan Liu, Wanying Wang, Wenjin Zhang, Siting Wu, Ningyang Jiao, Yihao Wang
{"title":"Tree parameter extraction method based on new remote sensing technology and terrestrial laser scanning technology","authors":"Aiguo Wang , Jun Wang , Haiming Li , Jian Hu , Haiyuan Zhou , Xinyu Zhang , Xuan Liu , Wanying Wang , Wenjin Zhang , Siting Wu , Ningyang Jiao , Yihao Wang","doi":"10.1016/j.bdr.2024.100460","DOIUrl":"10.1016/j.bdr.2024.100460","url":null,"abstract":"<div><p>Ground LiDAR is a terrestrial LiDAR system that is often used for terrain and geomorphic mapping. Ground-based LiDAR can be used to collect more local and short-range data, making it ideal for mapping smaller areas with high precision. To enable rapid extraction of tree parameters in the national public welfare forest survey, ground-based LiDAR was used to acquire point clouds of trees; the point cloud data were then registered, denoised, normalized, and sliced, and the parameters of individual trees in the forest were extracted. The Bland-Altman consistency test is used to check whether extracting tree parameters from point clouds agrees with the traditional measurement method. The experimental results show that tree parameters can be extracted quickly, conveniently and accurately from the point cloud data obtained by ground-based LiDAR, that the results are consistent with the traditional tree parameter extraction method, and that the approach has advantages over traditional tree parameter measurement, such as point cloud records, imagery and traceability. It has a unique advantage in establishing a tree database. 
It is suggested that LIDAR should be used for forest survey in the future.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100460"},"PeriodicalIF":3.3,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140795530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
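The Bland-Altman consistency test mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the conventional formulation (mean difference as bias, limits of agreement at bias ± 1.96 sample standard deviations of the differences); the abstract does not state which variant the authors used, and the function name and the diameter values below are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Assumes the conventional 95% limits of agreement: bias +/- 1.96 * SD,
    where SD is the sample standard deviation of the paired differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical trunk diameters in cm: LiDAR-derived vs. tape-measured.
lidar = [10.0, 12.0, 14.0]
tape = [9.0, 12.0, 13.0]
bias, lower, upper = bland_altman(lidar, tape)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```

Two methods are usually judged consistent when nearly all paired differences fall inside the limits and the limits themselves are narrow enough for the application.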
Big Data Research, Pub Date: 2024-04-23, DOI: 10.1016/j.bdr.2024.100457
Wei Zhang, Yu Dai
{"title":"A multiscale electricity theft detection model based on feature engineering","authors":"Wei Zhang, Yu Dai","doi":"10.1016/j.bdr.2024.100457","DOIUrl":"10.1016/j.bdr.2024.100457","url":null,"abstract":"<div><p>With the widespread adoption of smart meters and the growing availability of data mining and machine learning algorithms, there is a pressing demand for methods that are both accurate and explainable in identifying electricity theft patterns among end-users. To address this need, this study proposes a multi-scale anomaly detection model based on feature engineering. Specifically, tsfresh is used in feature engineering to extract electricity consumption features from the raw data, and XGBoost is employed to select features that are highly correlated with anomalous behavior and have clear physical interpretations. Multi-scale convolutional neural networks are then used to analyze and process the data at different temporal and frequency scales. Attention mechanisms are applied to assign weights to different feature channels, and all of the extracted information is fused for anomaly detection. 
The combination of feature engineering and multi-scale convolutional neural networks not only enhances the interpretability of the model but also improves its performance, as demonstrated by the experimental results, which show that the proposed method outperforms traditional anomaly detection approaches across multiple evaluation metrics.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100457"},"PeriodicalIF":3.3,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140762245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research, Pub Date: 2024-04-23, DOI: 10.1016/j.bdr.2024.100458
Hongkun Xie, Minghua Huang, Wentao Lei, Yang Wang, Lu Ou
{"title":"Quantitative analysis of big data for land resource classification and zoning at the township level in Northern Shaanxi","authors":"Hongkun Xie , Minghua Huang , Wentao Lei , Yang Wang , Lu Ou","doi":"10.1016/j.bdr.2024.100458","DOIUrl":"10.1016/j.bdr.2024.100458","url":null,"abstract":"<div><p>To analyze and evaluate the conditions and distribution characteristics of rural land resources in northern Shaanxi, the experiment extracts two terrain feature values, namely slope and undulation, which are highly correlated with land resources. Then, the extraction results of all 302 township-level administrative regions in northern Shaanxi are processed, and the scoring results of all township-level units are sorted. Based on this, optimization and adjustment are made to form a classification result. The experimental results show that land resources in primary townships are the most scarce, mainly distributed in the central and western regions of northern Shaanxi, with 53 in Yan'an and 7 in Yulin; land resources in secondary townships are relatively scarce, mainly distributed along the Yellow River in the central and southern parts of northern Shaanxi, with 40 in Yan'an and 53 in Yulin; the land resources of third-level townships are relatively abundant, generally distributed along the Great Wall, and belong to the transitional zone between windblown sand and grassland areas and hilly and gully areas. Except for one third-level township located in Yan'an, all 22 other townships are located in Yulin; the fourth-level townships have abundant land resources and are located in the loess plateau landform area in the southern part of northern Shaanxi. They belong to Yan'an Luochuan and three surrounding counties, totaling 17 townships; the terrain of the fifth- and sixth-level townships is flat, and the land resources are the most abundant. They belong to the sandy and grassy terrain north of the Great Wall in northern Shaanxi. 
A total of 56 townships are located in 7 county-level administrative regions of Yulin City. The experimental results lay the foundation for the research on optimizing the spatial pattern of rural life in northern Shaanxi, and can also provide support for classified guidance and precise policy implementation for rural revitalization, agricultural industry policy formulation, human settlement environment construction, and ecological environment protection.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100458"},"PeriodicalIF":3.3,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140788495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
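The two terrain features named in the abstract above, slope and undulation, can be sketched on a small elevation grid. This is only an illustration under stated assumptions: slope is taken from central differences on a regular grid and undulation is read as relief amplitude (maximum minus minimum elevation in a window). The abstract does not give the authors' exact definitions, and the grid values, cell size, and function names are hypothetical.

```python
import math


def relief(window):
    """Relief amplitude (undulation): max minus min elevation in a window."""
    cells = [z for row in window for z in row]
    return max(cells) - min(cells)


def slope_deg(dem, r, c, cell_size):
    """Slope in degrees at interior cell (r, c) using central differences."""
    dzdx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
    dzdy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dzdx, dzdy)))


# Hypothetical 3x3 elevation grid in metres with 10 m cells:
# a uniform ramp rising 10 m per row.
dem = [
    [0.0, 0.0, 0.0],
    [10.0, 10.0, 10.0],
    [20.0, 20.0, 20.0],
]
print("relief (m):", relief(dem))
print("slope at centre (deg):", round(slope_deg(dem, 1, 1, 10.0), 2))
```

Per-township scores could then be formed by averaging such cell values inside each township boundary, though how the study aggregates and weights them is not described in the abstract.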
Big Data Research, Pub Date: 2024-04-16, DOI: 10.1016/j.bdr.2024.100454
Jörg Raab, Yuting Pang, Joan Baaijens, Honggeng Zhou
{"title":"Big Data in organizations: Exploring the adoption of Big Data applications and their impact on organizations in China and the Netherlands","authors":"Jörg Raab , Yuting Pang , Joan Baaijens , Honggeng Zhou","doi":"10.1016/j.bdr.2024.100454","DOIUrl":"10.1016/j.bdr.2024.100454","url":null,"abstract":"<div><p>Digital technology has rapidly been transforming how organizations operate. However, the literature in management studies has only just started to problematize the fundamental inter-relation of digital technology and organizing and we lack sound data about the actual breadth and depth of these changes. This study therefore explores the state of the implementation of Big Data applications in a wide range of organizations in China and the Netherlands and the impact on organizational structures and processes. Our findings show that most organizations are still in an experimental phase at best. We can therefore observe an evolutionary model of technology adoption</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100454"},"PeriodicalIF":3.3,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140796332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning for Tsunami Waves Forecasting Using Regression Trees","authors":"Eugenio Cesario , Salvatore Giampá , Enrico Baglione , Louise Cordrie , Jacopo Selva , Domenico Talia","doi":"10.1016/j.bdr.2024.100452","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100452","url":null,"abstract":"<div><p>After a seismic event, tsunami early warning systems (TEWSs) try to accurately forecast the maximum height of incident waves at specific target points in front of the coast, so that early warnings can be issued for locations where tsunami waves may be destructive and aid can be delivered there during immediate post-event management. The uncertainty of the forecast can be quantified with ensembles of alternative scenarios. Similarly, in probabilistic tsunami hazard analysis (PTHA) a large number of simulations is required to cover the natural variability of the source process in each location. To improve the accuracy and computational efficiency of tsunami forecasting methods, scientists have recently started to exploit machine learning techniques to process pre-computed simulation data. However, the approaches proposed in the literature, mainly based on neural networks, suffer from high training time and limited model explainability. To overcome these issues, this paper describes a machine learning approach based on regression trees to model and forecast tsunami evolutions. The algorithm takes as input a set of simulations forming an ensemble that describes the potential regional impact of tsunami source scenarios in a given source area, and it provides predictive models to forecast the tsunami waves for other potential tsunami sources in the same area. The experimental evaluation, performed on the 2003 M6.8 Zemmouri-Boumerdes earthquake and tsunami simulation data, shows that regression trees achieve high forecasting accuracy. 
Moreover, they provide domain experts with fully-explainable and interpretable models, which are a valuable support for environmental scientists because they describe underlying rules and patterns behind the models and allow for an explicit inspection of their functioning. This can enable a full and trustable exploration of source uncertainty in tsunami early-warning and urgent computing scenarios, with large ensembles of computationally light tsunami simulations.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100452"},"PeriodicalIF":3.3,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2214579624000285/pdfft?md5=942e994d950c715c0c020e511bc26341&pid=1-s2.0-S2214579624000285-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140559033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
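The interpretability argument in the abstract above rests on how a regression tree is built: each node picks the threshold that minimizes the squared error of the two resulting groups, so every prediction can be traced to an explicit rule. The sketch below shows a single such split (a depth-one tree) on one feature. It is a generic illustration and not the authors' algorithm; the feature, the wave heights, and the function names are made up.

```python
def sse(values):
    """Sum of squared errors of values around their mean."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) minimizing total SSE."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold,
                    sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict(x, split):
    """Apply the learned rule: 'if x <= threshold then left_mean else right_mean'."""
    threshold, left_mean, right_mean = split
    return left_mean if x <= threshold else right_mean


# Made-up ensemble: a source parameter (e.g. slip in metres) against the
# simulated maximum wave height in metres at one target point.
slip = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
height = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
rule = best_split(slip, height)
print("learned rule:", rule)
print("forecast for slip=11:", predict(11.0, rule))
```

A full tree repeats this search recursively on each side of the split, which is why the final model reads as a set of nested if/else rules that a domain expert can inspect.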
Big Data Research, Pub Date: 2024-04-04, DOI: 10.1016/j.bdr.2024.100453
Helen Karatza
{"title":"Scheduling critical periodic jobs with selective partial computations along with gang jobs","authors":"Helen Karatza","doi":"10.1016/j.bdr.2024.100453","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100453","url":null,"abstract":"<div><p>One of the main issues with distributed systems, like clouds, is scheduling complex workloads, which are made up of various job types with distinct features. Gang jobs are one kind of parallel application that these systems support. This paper examines the scheduling of workloads that comprise gangs and critical periodic jobs that can allow for partial computations when necessary to overcome gang job execution. The simulation results shed important light on how gang performance is impacted by partial computations of critical jobs. The results also reveal that, under the proposed scheduling scheme, partial computations that take into account gangs’ degree of parallelism might lower the average response time of gang jobs while still yielding an acceptable level of average result precision for the critical jobs. Additionally, it is observed that as the deviation from the average partial computation increases, the performance improvement due to partial computations increases, with the aforementioned tradeoff remaining significant.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100453"},"PeriodicalIF":3.3,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140547395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research, Pub Date: 2024-03-26, DOI: 10.1016/j.bdr.2024.100451
Anli Yan, Xiaozhang Liu, Wanman Li, Hongwei Ye, Lang Li
{"title":"Explanation-Guided Adversarial Example Attacks","authors":"Anli Yan , Xiaozhang Liu , Wanman Li , Hongwei Ye , Lang Li","doi":"10.1016/j.bdr.2024.100451","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100451","url":null,"abstract":"<div><p>Neural network-based classifiers are vulnerable to adversarial example attacks even in a black-box setting. Existing adversarial example generation techniques mainly rely on optimization-based attacks, which optimize the objective function by iterative input perturbation. While able to craft adversarial examples, these techniques require large query budgets. Recent transfer-based attacks, though they need only a limited number of queries, suffer from a low attack success rate. In this paper, we propose an adversarial example attack method called MEAttack that uses model-agnostic explanation technology and can generate adversarial examples more efficiently in the black-box setting with limited queries. The core idea is to design a novel model-agnostic explanation method for target models, and generate adversarial examples based on model explanations. We experimentally demonstrate that MEAttack outperforms the state-of-the-art attack technology, i.e., AutoZOOM. The success rate of MEAttack is 4.54%-47.42% higher than that of AutoZOOM, and its query count is reduced by 2.6-4.2 times. 
Experimental results show that MEAttack is efficient in terms of both attack success rate and query efficiency.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100451"},"PeriodicalIF":3.3,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140347942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research, Pub Date: 2024-03-21, DOI: 10.1016/j.bdr.2024.100450
Shichen Zhai, Xiaoping Lu, Chao Wang, Zhiyu Hong, Jing Shan, Zongmin Ma
{"title":"Correcting inconsistencies in knowledge graphs with correlated knowledge","authors":"Shichen Zhai , Xiaoping Lu , Chao Wang , Zhiyu Hong , Jing Shan , Zongmin Ma","doi":"10.1016/j.bdr.2024.100450","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100450","url":null,"abstract":"<div><p>Knowledge graphs (KGs) have been widely applied for semantic representation and intelligent decision-making. The usefulness and usability of KGs are often limited by their quality. One common issue is the presence of inconsistent assertions in KGs. Inconsistencies in KGs are often caused by the diverse data used to construct large-scale KGs automatically. To improve the quality of KGs, in this paper, we investigate how to detect and correct inconsistent triples in KGs. We first identify entity-related inconsistency, relation-related inconsistency and type-related inconsistency. On this basis, we propose a framework for correcting the identified inconsistencies, which combines candidate generation, link prediction and constraint validation. We evaluate the proposed correction framework on the real-world dataset FB15k (from Freebase). The promising results confirm the capability of our framework in correcting the inconsistencies of knowledge graphs.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100450"},"PeriodicalIF":3.3,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140328544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
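One way to picture the type-related inconsistency and the constraint validation step named in the abstract above is a domain and range check on triples. The sketch below is an assumption about what such a check could look like, not the authors' framework: the schema, the entity types, the triples, and the function name are all hypothetical, and candidate generation and link prediction are not modelled.

```python
def type_violations(triples, entity_type, relation_schema):
    """Return triples whose head or tail type breaks the relation's constraint.

    relation_schema maps a relation to its (domain_type, range_type) pair.
    Entities with no known type are treated as violations.
    """
    bad = []
    for head, relation, tail in triples:
        domain, range_ = relation_schema[relation]
        if entity_type.get(head) != domain or entity_type.get(tail) != range_:
            bad.append((head, relation, tail))
    return bad


# Hypothetical schema and facts.
relation_schema = {"born_in": ("Person", "City")}
entity_type = {"Ada": "Person", "London": "City", "Physics": "Field"}
triples = [
    ("Ada", "born_in", "London"),    # consistent with the schema
    ("Ada", "born_in", "Physics"),   # tail has the wrong type
]
print(type_violations(triples, entity_type, relation_schema))
```

A correction step would then propose replacement heads or tails of the required type and keep only candidates that pass the same check.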
Big Data Research, Pub Date: 2024-03-20, DOI: 10.1016/j.bdr.2024.100449
Zehua Liu, Jiuhao Li, Mahmood Ashraf, M.S. Syam, Muhammad Asif, Emad Mahrous Awwad, Muna Al-Razgan, Uzair Aslam Bhatti
{"title":"Remote sensing-enhanced transfer learning approach for agricultural damage and change detection: A deep learning perspective","authors":"Zehua Liu , Jiuhao Li , Mahmood Ashraf , M.S. Syam , Muhammad Asif , Emad Mahrous Awwad , Muna Al-Razgan , Uzair Aslam Bhatti","doi":"10.1016/j.bdr.2024.100449","DOIUrl":"10.1016/j.bdr.2024.100449","url":null,"abstract":"<div><p>With the continuous advancement of science and technology, there has been a growing awareness of safety among people worldwide. Natural disasters such as wildfires, earthquakes, and floods pose persistent threats to both lives and property on our planet, which serves as our fundamental habitat. While it is impossible to prevent or entirely avert these calamities, rapid identification of affected areas and prompt damage assessment post-disaster can significantly aid in the formulation of effective rescue strategies, ultimately saving more lives. This article delves into the application of transfer learning in satellite image damage assessment—a methodology that involves transferring previously acquired knowledge to enhance a model's adaptability to new tasks. Given the limited availability of datasets for satellite image analysis, transfer learning proves to be an effective approach. Specifically, the study proposes a transfer learning method based on YOLOv5 for satellite image damage assessment. Initially, a general convolutional neural network model is trained using a substantial dataset of natural images. Subsequently, the early layers of this model are frozen, while the later layers undergo training to adapt to satellite image data. Fine-tuning is then employed to further enhance the overall model performance. The results demonstrate that this approach yields a high accuracy rate in satellite image damage assessment. Moreover, compared to conventional deep learning methods, the proposed method effectively leverages pre-trained models' knowledge, thereby reducing data dependency. 
Additionally, it displays robust generalization capabilities across diverse tasks and datasets, underscoring its potential for facilitating transfer learning across various domains.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100449"},"PeriodicalIF":3.3,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140275813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research, Pub Date: 2024-03-20, DOI: 10.1016/j.bdr.2024.100448
Min Peng, Yunxiang Liu, Asad Khan, Bilal Ahmed, Subrata K. Sarker, Yazeed Yasin Ghadi, Uzair Aslam Bhatti, Muna Al-Razgan, Yasser A. Ali
{"title":"Crop monitoring using remote sensing land use and land change data: Comparative analysis of deep learning methods using pre-trained CNN models","authors":"Min Peng , Yunxiang Liu , Asad Khan , Bilal Ahmed , Subrata K. Sarker , Yazeed Yasin Ghadi , Uzair Aslam Bhatti , Muna Al-Razgan , Yasser A. Ali","doi":"10.1016/j.bdr.2024.100448","DOIUrl":"10.1016/j.bdr.2024.100448","url":null,"abstract":"<div><p>In the context of the rapidly evolving climate dynamics of the early twenty-first century, the interplay between climate change and biospheric integrity is becoming increasingly critical. The pervasive impact of climate change on ecosystems is manifested not only through alterations in average environmental conditions and their variability but also through ancillary shifts such as escalated oceanic acidification and heightened atmospheric CO<sub>2</sub> levels. These climatic transformations are further compounded by concurrent ecological stressors, including habitat degradation, defaunation, and fragmentation. Against this backdrop, this study delves into the efficacy of advanced deep learning methodologies for the classification of land cover from satellite imagery, with a particular emphasis on agricultural crop monitoring. The study leverages state-of-the-art pre-trained Convolutional Neural Network (CNN) architectures, namely VGG16, MobileNetV2, DenseNet121, and ResNet50, selected for their architectural sophistication and proven competence in image recognition domains. The research framework encompasses a comprehensive data preparation phase incorporating augmentation techniques, a thorough exploratory data analysis to pinpoint and address class imbalances through the computation of class weights, and the strategic fine-tuning of CNN architectures with tailored classification layers to suit the specificities of land cover classification challenges. 
The models' performance was rigorously evaluated against benchmarks of accuracy and loss, both during the training phase and on validation datasets, with preventative strategies against overfitting, such as early stopping and adaptive learning rate modifications, being integral to the methodology. The findings illuminate the considerable potential of leveraging pre-trained deep learning models for remote sensing in agriculture, demonstrating that advanced CNN architectures, particularly DenseNet121 and ResNet50, are notably effective in enhancing crop type classification accuracy from satellite imagery. This study contributes valuable insights to the field of precision agriculture, advocating for the integration of sophisticated image recognition technologies to bolster crop monitoring efficacy, thereby enabling more nuanced agricultural decision-making and resource allocation.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100448"},"PeriodicalIF":3.3,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140282143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
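The abstract above mentions addressing class imbalance through the computation of class weights. A minimal sketch of one common scheme follows: inverse-frequency ("balanced") weights, n_samples / (n_classes * class_count). Whether the study used this exact formula is an assumption, and the crop labels and function name are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical, imbalanced set of land cover labels.
labels = ["wheat"] * 6 + ["maize"] * 3 + ["rice"] * 1
for cls, weight in balanced_class_weights(labels).items():
    print(f"{cls}: {weight:.3f}")
```

Rare classes receive weights above 1 and frequent classes weights below 1, so each class contributes roughly equally to the training loss when the weights are passed to the loss function.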