Big Data Research | Pub Date: 2024-04-30 | DOI: 10.1016/j.bdr.2024.100462
Li Deng, Shihu Liu, Weihua Xu, Xianghong Lin
{"title":"Similarity Measurement for Graph Data: An Improved Centrality and Geometric Perspective-Based Approach","authors":"Li Deng , Shihu Liu , Weihua Xu , Xianghong Lin","doi":"10.1016/j.bdr.2024.100462","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100462","url":null,"abstract":"<div><p>Measuring the similarity of graph data precisely is an important research problem in many fields. Here, graph data means a set of patterns together with the edges that connect them. By taking both pattern information and edge information into consideration, this paper introduces an improved centrality and geometric perspective-based approach to measure the similarity between any two graphs. Once the two graphs are projected onto a plane, the pattern distance can be calculated with the Euclidean metric. The area-based edge distance is computed from the area determined by the length of each edge and the angle between the edge and the positive X-axis. To improve the measurement, a position-based edge distance is used to modify the edge distance. The global distance between any two graphs is then determined by combining these two distances. Finally, the <span>letter dataset</span> is used in experiments to examine the proposed similarity approach. 
The experimental results show that the proposed approach captures the similarity of graph data commendably and gets a tradeoff between time and precision.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100462"},"PeriodicalIF":3.3,"publicationDate":"2024-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140824127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-04-26 | DOI: 10.1016/j.bdr.2024.100455
Ricardo de A. Araújo, Paulo S.G. de Mattos Neto, Nadia Nedjah, Sergio C.B. Soares
{"title":"On the Sea Surface Temperature Forecasting Problem with Deep Dilation-Erosion-Linear Models","authors":"Ricardo de A. Araújo , Paulo S.G. de Mattos Neto , Nadia Nedjah , Sergio C.B. Soares","doi":"10.1016/j.bdr.2024.100455","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100455","url":null,"abstract":"<div><p>The sea surface temperature (SST) is considered an important measure for detecting changes in climate and marine ecosystems. So, its forecasting is essential for supporting governmental strategies to avoid side effects on the global population. In this paper, we analyze the SST time series and suggest that a combination between a linear component and a nonlinear component with long-term dependency can better represent it. Based on this assumption, we propose a deep neural network architecture with dilation-erosion-linear (DEL) processing units to deal with this particular kind of time series. An empirical analysis is performed in this work using three SST time series, where we explore three statistical measures. The experimental results demonstrate that the proposed model outperformed recent and classical literature forecasting techniques according to well-known performance metrics.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100455"},"PeriodicalIF":3.3,"publicationDate":"2024-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140813373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-04-25 | DOI: 10.1016/j.bdr.2024.100459
Lei Shi, Yimin Zhou, Wei Wang, Juan Wang, Yang Bai, Chengzong Peng, Ding Chen, Zuli Wang
{"title":"A Cross-Chain Mechanism for Agricultural Engineering Document Management Blockchain in the Context of Big Data","authors":"Lei Shi , Yimin Zhou , Wei Wang , Juan Wang , Yang Bai , Chengzong Peng , Ding Chen , Zuli Wang","doi":"10.1016/j.bdr.2024.100459","DOIUrl":"10.1016/j.bdr.2024.100459","url":null,"abstract":"<div><p>Cross-chain mechanisms are a typical approach to information interaction between diverse blockchains, tackling the problem of information silos in the big data era. Most of the existing cross-chain mechanisms are targeted at virtual currency blockchains in the financial sector. With more and more engineering documents produced by the development of modern smart farming, the need for engineering document management and cross-chaining between various blockchains has become increasingly urgent. This paper proposes a novel attainable cross-chain mechanism for agricultural engineering document management blockchains that accounts for the unique structure and operation principles of the specific domain. The methodology integrates the characteristics of agricultural engineering document management with the notary scheme, constructed by government supervision nodes with high credibility. Meanwhile, authentication technology and cryptographic algorithms are internally fused, solving the authentication problem of the document cross-chain and protecting the cross-chain information respectively, which ensures the integrity and security of the file attribute information, alongside file ontology data in the cross-chain process. 
Adequate security proof and experiments illustrate that the developed mechanism can guarantee the feasibility of the mechanism, authenticity of the cross-chain parties, and the integrality and reliability of the document information, thus catering to the requirements of the cross-chain performance of blockchain in the field of agricultural engineering document management.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100459"},"PeriodicalIF":3.3,"publicationDate":"2024-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140782467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-04-23 | DOI: 10.1016/j.bdr.2024.100460
Aiguo Wang, Jun Wang, Haiming Li, Jian Hu, Haiyuan Zhou, Xinyu Zhang, Xuan Liu, Wanying Wang, Wenjin Zhang, Siting Wu, Ningyang Jiao, Yihao Wang
{"title":"Tree parameter extraction method based on new remote sensing technology and terrestrial laser scanning technology","authors":"Aiguo Wang , Jun Wang , Haiming Li , Jian Hu , Haiyuan Zhou , Xinyu Zhang , Xuan Liu , Wanying Wang , Wenjin Zhang , Siting Wu , Ningyang Jiao , Yihao Wang","doi":"10.1016/j.bdr.2024.100460","DOIUrl":"10.1016/j.bdr.2024.100460","url":null,"abstract":"<div><p>Ground LiDAR is a terrestrial LiDAR system that is often used for terrain and geomorphic mapping. Ground-based LiDAR can be used to collect more local and short-range data, making it ideal for mapping smaller areas with high precision. To enable rapid extraction of tree parameters in the national public welfare forest survey, ground-based LIDAR was used to obtain point clouds of trees; the point cloud data was then registered, denoised, normalized, sliced, and processed for parameter extraction to obtain the parameters of individual trees in the forest. The Bland-Altman consistency test is used to test whether the method of extracting tree parameters from point clouds is consistent with the traditional measurement method. The experimental results show that tree parameters can be extracted quickly, conveniently and accurately from the point cloud data obtained by ground-based LIDAR, that the results are consistent with the traditional tree parameter extraction method, and that the approach has advantages over traditional tree parameter measurement, such as point cloud, image and traceability. It has a unique advantage in establishing a tree database. 
It is suggested that LIDAR should be used for forest survey in the future.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100460"},"PeriodicalIF":3.3,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140795530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-04-23 | DOI: 10.1016/j.bdr.2024.100457
Wei Zhang, Yu Dai
{"title":"A multiscale electricity theft detection model based on feature engineering","authors":"Wei Zhang, Yu Dai","doi":"10.1016/j.bdr.2024.100457","DOIUrl":"10.1016/j.bdr.2024.100457","url":null,"abstract":"<div><p>With the widespread adoption of smart meters and the growing availability of data mining and machine learning algorithms, there is a pressing demand for methods that are both accurate and explicable in identifying electricity theft patterns among end-users. To address this need, this study proposes a multi-scale anomaly detection model based on feature engineering. Specifically, tsfresh is utilized in feature engineering to extract electricity consumption features from the raw data, and XGBoost is employed to select features that are highly correlated with anomalous behavior, which have clear physical interpretations. Multi-scale convolutional neural networks are then used to analyze and process the data at different temporal and frequency scales. Attention mechanisms are applied to assign weights to different feature channels, and all of the extracted information is fused for anomaly detection. 
The combination of feature engineering and multi-scale convolutional neural networks not only enhances the interpretability of the model but also improves its performance, as demonstrated by the experimental results, which show that the proposed method outperforms traditional anomaly detection approaches across multiple evaluation metrics.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100457"},"PeriodicalIF":3.3,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140762245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-04-23 | DOI: 10.1016/j.bdr.2024.100458
Hongkun Xie, Minghua Huang, Wentao Lei, Yang Wang, Lu Ou
{"title":"Quantitative analysis of big data for land resource classification and zoning at the township level in Northern Shaanxi","authors":"Hongkun Xie , Minghua Huang , Wentao Lei , Yang Wang , Lu Ou","doi":"10.1016/j.bdr.2024.100458","DOIUrl":"10.1016/j.bdr.2024.100458","url":null,"abstract":"<div><p>To analyze and evaluate the conditions and distribution characteristics of rural land resources in northern Shaanxi, the experiment extracts two terrain feature values, namely slope and undulation, which are highly correlated with land resources. Then, the extraction results of all 302 township-level administrative regions in northern Shaanxi are processed, and the scoring results of all township-level units are sorted. Based on this, optimization and adjustment are made to form a classification result. The experimental results show that land resources in primary townships are most scarce, mainly distributed in the central and western regions of northern Shaanxi, with 53 in Yan'an and 7 in Yulin. Land resources in secondary townships are relatively scarce, mainly distributed along the Yellow River in the central and southern parts of northern Shaanxi, with 40 in Yan'an and 53 in Yulin. The land resources of third-level townships are relatively abundant, generally distributed along the Great Wall, and belong to the transitional zone between windblown sand and grassland areas and hilly and gully areas. Except for one third-level township located in Yan'an, all 22 other townships are located in Yulin. The fourth-level townships have abundant land resources and are located in the loess plateau landform area in the southern part of northern Shaanxi. They belong to Yan'an Luochuan and three surrounding counties, totaling 17 townships. The terrain of the fifth- and sixth-level townships is flat, and the land resources are the most abundant. They belong to the sandy and grassy terrain north of the Great Wall in northern Shaanxi. 
A total of 56 townships are located in 7 county-level administrative regions of Yulin City. The experimental results lay the foundation for the research on optimizing the spatial pattern of rural life in northern Shaanxi, and can also provide support for classified guidance and precise policy implementation for rural revitalization, agricultural industry policy formulation, human settlement environment construction, and ecological environment protection.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100458"},"PeriodicalIF":3.3,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140788495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-04-16 | DOI: 10.1016/j.bdr.2024.100454
Jörg Raab, Yuting Pang, Joan Baaijens, Honggeng Zhou
{"title":"Big Data in organizations: Exploring the adoption of Big Data applications and their impact on organizations in China and the Netherlands","authors":"Jörg Raab , Yuting Pang , Joan Baaijens , Honggeng Zhou","doi":"10.1016/j.bdr.2024.100454","DOIUrl":"10.1016/j.bdr.2024.100454","url":null,"abstract":"<div><p>Digital technology has rapidly been transforming how organizations operate. However, the literature in management studies has only just started to problematize the fundamental inter-relation of digital technology and organizing and we lack sound data about the actual breadth and depth of these changes. This study therefore explores the state of the implementation of Big Data applications in a wide range of organizations in China and the Netherlands and the impact on organizational structures and processes. Our findings show that most organizations are still in an experimental phase at best. We can therefore observe an evolutionary model of technology adoption</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100454"},"PeriodicalIF":3.3,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140796332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-04-16 | DOI: 10.1016/j.bdr.2024.100452
Eugenio Cesario, Salvatore Giampá, Enrico Baglione, Louise Cordrie, Jacopo Selva, Domenico Talia
{"title":"Machine Learning for Tsunami Waves Forecasting Using Regression Trees","authors":"Eugenio Cesario , Salvatore Giampá , Enrico Baglione , Louise Cordrie , Jacopo Selva , Domenico Talia","doi":"10.1016/j.bdr.2024.100452","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100452","url":null,"abstract":"<div><p>After a seismic event, tsunami early warning systems (TEWSs) try to accurately forecast the maximum height of incident waves at specific target points in front of the coast, so that early warnings can be issued for locations where the impact of tsunami waves can be destructive and aid can be delivered to these locations in the immediate post-event management. The uncertainty on the forecast can be quantified with ensembles of alternative scenarios. Similarly, in probabilistic tsunami hazard analysis (PTHA) a large number of simulations is required to cover the natural variability of the source process in each location. To improve the accuracy and computational efficiency of tsunami forecasting methods, scientists have recently started to exploit machine learning techniques to process pre-computed simulation data. However, the approaches proposed in the literature, mainly based on neural networks, suffer from high training time and limited model explainability. To overcome these issues, this paper describes a machine learning approach based on regression trees to model and forecast tsunami evolutions. The algorithm takes as input a set of simulations forming an ensemble that describes potential benefit regional impact of tsunami source scenarios in a given source area, and it provides predictive models to forecast the tsunami waves for other potential tsunami sources in the same area. The experimental evaluation, performed on the 2003 M6.8 Zemmouri-Boumerdes earthquake and tsunami simulation data, shows that regression trees achieve high forecasting accuracy. 
Moreover, they provide domain experts with fully-explainable and interpretable models, which are a valuable support for environmental scientists because they describe underlying rules and patterns behind the models and allow for an explicit inspection of their functioning. This can enable a full and trustable exploration of source uncertainty in tsunami early-warning and urgent computing scenarios, with large ensembles of computationally light tsunami simulations.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100452"},"PeriodicalIF":3.3,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2214579624000285/pdfft?md5=942e994d950c715c0c020e511bc26341&pid=1-s2.0-S2214579624000285-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140559033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-04-04 | DOI: 10.1016/j.bdr.2024.100453
Helen Karatza
{"title":"Scheduling critical periodic jobs with selective partial computations along with gang jobs","authors":"Helen Karatza","doi":"10.1016/j.bdr.2024.100453","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100453","url":null,"abstract":"<div><p>One of the main issues with distributed systems, like clouds, is scheduling complex workloads, which are made up of various job types with distinct features. Gang jobs are one kind of parallel applications that these systems support. This paper examines the scheduling of workloads that comprise gangs and critical periodic jobs that can allow for partial computations when necessary to overcome gang job execution. The simulation's results shed important light on how gang performance is impacted by partial computations of critical jobs. The results also reveal that, under the proposed scheduling scheme, partial computations which take into account gangs’ degree of parallelism, might lower the average response time of gang jobs, resulting in an acceptable level of the average results precision of the critical jobs. Additionally, it is observed that as the deviation from the average partial computation increases, the performance improvement due to partial computations increases with the aforementioned tradeoff remaining significant.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100453"},"PeriodicalIF":3.3,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140547395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-03-26 | DOI: 10.1016/j.bdr.2024.100451
Anli Yan, Xiaozhang Liu, Wanman Li, Hongwei Ye, Lang Li
{"title":"Explanation-Guided Adversarial Example Attacks","authors":"Anli Yan , Xiaozhang Liu , Wanman Li , Hongwei Ye , Lang Li","doi":"10.1016/j.bdr.2024.100451","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100451","url":null,"abstract":"<div><p>Neural network-based classifiers are vulnerable to adversarial example attacks even in a black-box setting. Existing adversarial example generation technologies mainly rely on optimization-based attacks, which optimize the objective function by iterative input perturbation. While able to craft adversarial examples, these techniques require large query budgets. Recent transfer-based attacks, though they need only limited queries, suffer from a low attack success rate. In this paper, we propose an adversarial example attack method called MEAttack using model-agnostic explanation technology, which can more efficiently generate adversarial examples in the black-box setting with limited queries. The core idea is to design a novel model-agnostic explanation method for target models, and generate adversarial examples based on model explanations. We experimentally demonstrate that MEAttack outperforms the state-of-the-art attack technology, i.e., AutoZOOM. The success rate of MEAttack is 4.54%-47.42% higher than AutoZOOM, and it needs 2.6-4.2 times fewer queries. 
Experimental results show that MEAttack is efficient in terms of both attack success rate and query efficiency.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100451"},"PeriodicalIF":3.3,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140347942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}