Big Data Research | Pub Date: 2024-03-21 | DOI: 10.1016/j.bdr.2024.100450
Shichen Zhai, Xiaoping Lu, Chao Wang, Zhiyu Hong, Jing Shan, Zongmin Ma
{"title":"Correcting inconsistencies in knowledge graphs with correlated knowledge","authors":"Shichen Zhai , Xiaoping Lu , Chao Wang , Zhiyu Hong , Jing Shan , Zongmin Ma","doi":"10.1016/j.bdr.2024.100450","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100450","url":null,"abstract":"<div><p>Knowledge graphs (KGs) have been widely applied for semantic representation and intelligent decision-making. The usefulness and usability of KGs are often limited by their quality. One common issue is the presence of inconsistent assertions in KGs. Inconsistencies in KGs are often caused by the diverse data used to automatically construct large-scale KGs. To improve the quality of KGs, in this paper, we investigate how to detect and correct inconsistent triples in KGs. We first identify entity-related inconsistency, relation-related inconsistency and type-related inconsistency. On this basis, we propose a framework for correcting the identified inconsistencies, which combines candidate generation, link prediction and constraint validation. We evaluate the proposed correction framework on the real-world dataset FB15k (from Freebase). The promising results confirm the capability of our framework in correcting the inconsistencies of knowledge graphs.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100450"},"PeriodicalIF":3.3,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140328544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
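The type-related inconsistency named in the abstract above can be illustrated with a small constraint check. This is only a sketch of the general idea, not the authors' framework: the entity types, the relation signatures, and the function name `find_type_inconsistencies` are all hypothetical and chosen for illustration.

```python
# Hypothetical toy data: entity types and relation signatures (domain, range).
ENTITY_TYPES = {
    "Barack_Obama": "Person",
    "Honolulu": "City",
    "USA": "Country",
}
RELATION_SIGNATURES = {
    "born_in": ("Person", "City"),
    "capital_of": ("City", "Country"),
}


def find_type_inconsistencies(triples):
    """Return triples whose head or tail type violates the relation signature."""
    bad = []
    for head, rel, tail in triples:
        signature = RELATION_SIGNATURES.get(rel)
        if signature is None:
            continue  # no constraint is known for this relation
        domain, rng = signature
        if ENTITY_TYPES.get(head) != domain or ENTITY_TYPES.get(tail) != rng:
            bad.append((head, rel, tail))
    return bad


triples = [
    ("Barack_Obama", "born_in", "Honolulu"),  # consistent with the signature
    ("Honolulu", "born_in", "USA"),           # head is a City, not a Person
]
inconsistent = find_type_inconsistencies(triples)
```

A correction step would then propose replacement heads, relations, or tails for each flagged triple and re-run the same check on the candidates; that part is not sketched here.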
Big Data Research | Pub Date: 2024-03-20 | DOI: 10.1016/j.bdr.2024.100449
Zehua Liu, Jiuhao Li, Mahmood Ashraf, M.S. Syam, Muhammad Asif, Emad Mahrous Awwad, Muna Al-Razgan, Uzair Aslam Bhatti
{"title":"Remote sensing-enhanced transfer learning approach for agricultural damage and change detection: A deep learning perspective","authors":"Zehua Liu , Jiuhao Li , Mahmood Ashraf , M.S. Syam , Muhammad Asif , Emad Mahrous Awwad , Muna Al-Razgan , Uzair Aslam Bhatti","doi":"10.1016/j.bdr.2024.100449","DOIUrl":"10.1016/j.bdr.2024.100449","url":null,"abstract":"<div><p>With the continuous advancement of science and technology, there has been a growing awareness of safety among people worldwide. Natural disasters such as wildfires, earthquakes, and floods pose persistent threats to both lives and property on our planet, which serves as our fundamental habitat. While it is impossible to prevent or entirely avert these calamities, rapid identification of affected areas and prompt damage assessment post-disaster can significantly aid in the formulation of effective rescue strategies, ultimately saving more lives. This article delves into the application of transfer learning in satellite image damage assessment—a methodology that involves transferring previously acquired knowledge to enhance a model's adaptability to new tasks. Given the limited availability of datasets for satellite image analysis, transfer learning proves to be an effective approach. Specifically, the study proposes a transfer learning method based on YOLOv5 for satellite image damage assessment. Initially, a general convolutional neural network model is trained using a substantial dataset of natural images. Subsequently, the early layers of this model are frozen, while the later layers undergo training to adapt to satellite image data. Fine-tuning is then employed to further enhance the overall model performance. The results demonstrate that this approach yields a high accuracy rate in satellite image damage assessment. Moreover, compared to conventional deep learning methods, the proposed method effectively leverages pre-trained models' knowledge, thereby reducing data dependency. 
Additionally, it displays robust generalization capabilities across diverse tasks and datasets, underscoring its potential for facilitating transfer learning across various domains.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100449"},"PeriodicalIF":3.3,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140275813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-03-20 | DOI: 10.1016/j.bdr.2024.100448
Min Peng, Yunxiang Liu, Asad Khan, Bilal Ahmed, Subrata K. Sarker, Yazeed Yasin Ghadi, Uzair Aslam Bhatti, Muna Al-Razgan, Yasser A. Ali
{"title":"Crop monitoring using remote sensing land use and land change data: Comparative analysis of deep learning methods using pre-trained CNN models","authors":"Min Peng , Yunxiang Liu , Asad Khan , Bilal Ahmed , Subrata K. Sarker , Yazeed Yasin Ghadi , Uzair Aslam Bhatti , Muna Al-Razgan , Yasser A. Ali","doi":"10.1016/j.bdr.2024.100448","DOIUrl":"10.1016/j.bdr.2024.100448","url":null,"abstract":"<div><p>In the context of the rapidly evolving climate dynamics of the early twenty-first century, the interplay between climate change and biospheric integrity is becoming increasingly critical. The pervasive impact of climate change on ecosystems is manifested not only through alterations in average environmental conditions and their variability but also through ancillary shifts such as escalated oceanic acidification and heightened atmospheric CO<sub>2</sub> levels. These climatic transformations are further compounded by concurrent ecological stressors, including habitat degradation, defaunation, and fragmentation. Against this backdrop, this study delves into the efficacy of advanced deep learning methodologies for the classification of land cover from satellite imagery, with a particular emphasis on agricultural crop monitoring. The study leverages state-of-the-art pre-trained Convolutional Neural Network (CNN) architectures, namely VGG16, MobileNetV2, DenseNet121, and ResNet50, selected for their architectural sophistication and proven competence in image recognition domains. The research framework encompasses a comprehensive data preparation phase incorporating augmentation techniques, a thorough exploratory data analysis to pinpoint and address class imbalances through the computation of class weights, and the strategic fine-tuning of CNN architectures with tailored classification layers to suit the specificities of land cover classification challenges. 
The models' performance was rigorously evaluated against benchmarks of accuracy and loss, both during the training phase and on validation datasets, with preventative strategies against overfitting, such as early stopping and adaptive learning rate modifications, being integral to the methodology. The findings illuminate the considerable potential of leveraging pre-trained deep learning models for remote sensing in agriculture, demonstrating that advanced CNN architectures, particularly DenseNet121 and ResNet50, are notably effective in enhancing crop type classification accuracy from satellite imagery. This study contributes valuable insights to the field of precision agriculture, advocating for the integration of sophisticated image recognition technologies to bolster crop monitoring efficacy, thereby enabling more nuanced agricultural decision-making and resource allocation.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100448"},"PeriodicalIF":3.3,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140282143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
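The abstract above mentions addressing class imbalance "through the computation of class weights" but does not give the formula. One common choice, assumed here, is the "balanced" heuristic, where each class weight is `n_samples / (n_classes * class_count)`. The crop labels and the function name `balanced_class_weights` are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (the 'balanced' heuristic)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, deliberately imbalanced label set.
labels = ["wheat"] * 6 + ["maize"] * 3 + ["rice"] * 1
weights = balanced_class_weights(labels)
```

Rare classes receive larger weights, so their errors contribute more to the training loss. Most deep learning frameworks accept such a mapping as a per-class loss weight.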
Big Data Research | Pub Date: 2024-02-27 | DOI: 10.1016/j.bdr.2024.100446
Christian Callegari, Stefano Giordano, Michele Pagano
{"title":"A Real Time Deep Learning Based Approach for Detecting Network Attacks","authors":"Christian Callegari, Stefano Giordano, Michele Pagano","doi":"10.1016/j.bdr.2024.100446","DOIUrl":"10.1016/j.bdr.2024.100446","url":null,"abstract":"<div><p>Anomaly-based Intrusion Detection is a key research topic in network security due to its ability to face unknown attacks and new security threats. For this reason, many works on the topic have been proposed in the last decade. Nonetheless, an ultimate solution, able to provide a high detection rate with an acceptable false alarm rate, has yet to be identified. In recent years, considerable research effort has focused on the application of Deep Learning techniques to the field, but no work has so far been able to propose a system that achieves good detection performance while processing raw network traffic in real time. For this reason, in this paper we propose an Intrusion Detection System that, leveraging probabilistic data structures and Deep Learning techniques, is able to process in real time the traffic collected in a backbone network, offering <em>excellent</em> detection performance and a low false alarm rate. Indeed, the extensive experimental tests, run to validate our system and compare different Deep Learning techniques, confirm that, with a proper parameter setting, we can achieve a detection rate of about 92%, with an accuracy of 0.899. 
Finally, with minimal changes, the proposed system can provide some information about the kind of anomaly, although in the multi-class scenario the detection rate is slightly lower (around 86%).</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100446"},"PeriodicalIF":3.3,"publicationDate":"2024-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2214579624000224/pdfft?md5=bbd19915547bc28f9b5784f2f0ddcb21&pid=1-s2.0-S2214579624000224-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140004622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
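The abstract above refers to "probabilistic data structures" for processing backbone traffic in real time without naming a specific one. As an illustration only, the sketch below assumes a count-min sketch, a common structure for approximate per-flow counting in fixed memory. The class, its parameters, and the IP-address keys are hypothetical and are not taken from the paper.

```python
import hashlib


class CountMinSketch:
    """Approximate counter: estimates never fall below the true count."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # Derive one hash per row by prefixing the row number to the key.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += count

    def estimate(self, key):
        # Collisions can only inflate a cell, so the minimum is the best guess.
        return min(
            self.rows[row][self._index(key, row)] for row in range(self.depth)
        )


sketch = CountMinSketch()
for _ in range(50):
    sketch.add("10.0.0.1")       # hypothetical source address seen 50 times
sketch.add("10.0.0.2", 3)
```

Memory use depends only on `width` and `depth`, not on the number of distinct keys, which is what makes this family of structures attractive for high-rate traffic. Per-key estimates like these could then be turned into fixed-size feature vectors for a learning model.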
Big Data Research | Pub Date: 2024-02-23 | DOI: 10.1016/j.bdr.2024.100447
Yandong Li, Bo Jiang, Long Zeng, Chenglong Li
{"title":"An Integration visual navigation algorithm for urban air mobility","authors":"Yandong Li, Bo Jiang, Long Zeng, Chenglong Li","doi":"10.1016/j.bdr.2024.100447","DOIUrl":"10.1016/j.bdr.2024.100447","url":null,"abstract":"<div><p>This paper presents an integration visual navigation algorithm called PnP-ORBSLAM for UAV position estimation in Urban Air Mobility (UAM). ORBSLAM is a popular benchmark algorithm for vision-based navigation applications. The proposed method improves the performance of ORBSLAM by adding a post-processing marker recognition phase to the model. Based on the features extracted from the markers, the PnP algorithm is introduced to estimate the position of the monocular camera. The position estimation accuracy of the UAV is expected to improve when the camera position information is added to the model. Experiments are carried out on the AirSim simulation platform. Results show that the PnP-ORBSLAM algorithm improves three-dimensional accuracy by 5.38 % compared with ORBSLAM. In addition, the processing speed of the proposed method reaches about 28 frames per second, which means that the PnP-ORBSLAM algorithm can work in real time.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100447"},"PeriodicalIF":3.3,"publicationDate":"2024-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139949248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-02-21 | DOI: 10.1016/j.bdr.2024.100443
Ahmad Bilal, Hamid Turab Mirza, Ibrar Hussain, Adnan Ahmad
{"title":"Investigating Influence of Google-Play Application Titles on Success","authors":"Ahmad Bilal , Hamid Turab Mirza , Ibrar Hussain , Adnan Ahmad","doi":"10.1016/j.bdr.2024.100443","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100443","url":null,"abstract":"<div><p>The title (name) is the primary information related to a mobile (smartphone) application, as it describes its functions and services. An eye-catching title can entice customers to choose a certain application over others. Application development companies are well aware of this phenomenon and invest significant efforts in crafting their application titles with compelling keywords, phrases and topics in pursuit of higher installs. However, to the best of our knowledge, traditional literature that investigates the impact of application titles on success is limited. There may be only a few instances where scientific (data-analytical) approaches have been used to examine application titles. Moreover, these investigations of titles are dominated by supervised learning and traditional literature may lack any unsupervised (cluster) data analysis techniques to measure the impact of titles on application success. Therefore, this research work proposes an unsupervised data analysis approach based on multiple layers and algorithms. The initial layer clusters the application titles, the subsequent layer extracts various textual features from these clusters and the final layer refines the extracted attributes. In general, certain textual features in the titles are proven to be positively and negatively linked with the application installs. 
Verification of the results has confirmed that this proposed approach can successfully detect the most prominent features from application titles (textual data) that correlate with success.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100443"},"PeriodicalIF":3.3,"publicationDate":"2024-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139935737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-02-19 | DOI: 10.1016/j.bdr.2024.100441
Zhou Shao, Sha Yuan, Yinyu Jin, Yongli Wang
{"title":"Scholar's Career Switch from Academia to Industry: Mining and Analysis from AMiner","authors":"Zhou Shao , Sha Yuan , Yinyu Jin , Yongli Wang","doi":"10.1016/j.bdr.2024.100441","DOIUrl":"10.1016/j.bdr.2024.100441","url":null,"abstract":"<div><p>The phenomenon of scholars switching their careers from academia to industry has become more prevalent nowadays. This paper proposes a combination approach of bibliometrics analysis and data mining to study the phenomenon from the perspective of Science of Science (SciSci). Based on the proposed methods, this paper first provides an overview of frequent companies and frequent universities as well as the exponentially increasing number of scholars under the scenario. And then, this study uncovers the excessively single patterns in South Korean scholars switches using frequent pattern mining from their papers. This paper studies the knowledge and technology transfer (KTT) and the research change of scholars by using the language model, the result of which illustrates that the research difference between industry and academia gradually decreases and reaches a steady state in recent years. In exploring the driving factors of the phenomenon, deep preliminary cooperation may be an essential reason, and the career switches will not promote the published amounts of papers but may benefit its academic influence. 
This study should, therefore, be of value to researchers wishing to study the academia-industry career switches more intensely.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100441"},"PeriodicalIF":3.3,"publicationDate":"2024-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139922786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-02-14 | DOI: 10.1016/j.bdr.2024.100445
David Auber, Nikos Bikakis, Panos K. Chrysanthis, George Papastefanatos, Mohamed Sharaf
{"title":"Interactive big data visualization and analytics","authors":"David Auber , Nikos Bikakis , Panos K. Chrysanthis , George Papastefanatos , Mohamed Sharaf","doi":"10.1016/j.bdr.2024.100445","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100445","url":null,"abstract":"","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100445"},"PeriodicalIF":3.3,"publicationDate":"2024-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139748262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data Research | Pub Date: 2024-02-13 | DOI: 10.1016/j.bdr.2024.100444
Bo Jiang, Hao Wang, Hanxu Ma
{"title":"A big data driven vegetation disease and pest region identification method based on self supervised convolutional neural networks and parallel extreme learning machines","authors":"Bo Jiang , Hao Wang , Hanxu Ma","doi":"10.1016/j.bdr.2024.100444","DOIUrl":"10.1016/j.bdr.2024.100444","url":null,"abstract":"<div><p>A self supervised convolutional neural network-parallel extreme learning machine classification model based on big data is proposed to address the subjectivity and inaccuracy of traditional methods for identifying vegetation pests and diseases that rely on manual observation and empirical judgment. This model is constructed using convolutional neural networks and parallel extreme learning machines, and integrates feature extraction networks with dual attention mechanisms to improve the accuracy of identifying pests and diseases. The model utilized a large amount of big data for training, achieving a recall rate of 98.42 % on multispectral datasets, and an overall classification accuracy of 99.04 %. After optimizing the residual network, the overall accuracy of identifying vegetation pest and disease areas has been further improved to 99.77 %, and the recall rate has also reached 98.91 %. 
These results indicate that the method proposed in this study has high accuracy and efficiency in the application of big data, can meet the needs of disease and pest identification, and provides effective technical support for the monitoring and prevention of crop diseases and pests, which has important practical significance.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100444"},"PeriodicalIF":3.3,"publicationDate":"2024-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139887525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
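The abstract above reports recall and overall classification accuracy side by side. The sketch below shows how the two metrics are computed from confusion-matrix counts and why they can differ. The counts are hypothetical and are not the paper's data.

```python
def recall(tp, fn):
    """Fraction of truly affected regions that the model flags."""
    return tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Fraction of all regions, affected or not, that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for an imbalanced test set (few affected regions).
tp, fn, tn, fp = 90, 10, 880, 20
r = recall(tp, fn)
a = accuracy(tp, tn, fp, fn)
```

With these counts recall is 0.90 and accuracy is 0.97: when healthy regions dominate, overall accuracy can stay high even if some affected regions are missed, which is why the two figures are worth reading together.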
Big Data Research | Pub Date: 2024-02-12 | DOI: 10.1016/j.bdr.2024.100438
Shuoxi Zhang, Hanpeng Liu, Kun He
{"title":"Knowledge Distillation via Token-Level Relationship Graph Based on the Big Data Technologies","authors":"Shuoxi Zhang , Hanpeng Liu , Kun He","doi":"10.1016/j.bdr.2024.100438","DOIUrl":"https://doi.org/10.1016/j.bdr.2024.100438","url":null,"abstract":"<div><p>In the big data era, characterized by vast volumes of complex data, the efficiency of machine learning models is of utmost importance, particularly in the context of intelligent agriculture. Knowledge distillation (KD), a technique aimed at both model compression and performance enhancement, serves as a pivotal solution by distilling the knowledge from an elaborate model (teacher) to a lightweight, compact counterpart (student). However, the true potential of KD has not been fully explored. Existing approaches primarily focus on transferring instance-level information by big data technologies, overlooking the valuable information embedded in token-level relationships, which may be particularly affected by the long-tail effects. To address the above limitations, we propose a novel method called Knowledge Distillation with Token-level Relationship Graph (TRG) that leverages token-wise relationships to enhance the performance of knowledge distillation. By employing TRG, the student model can effectively emulate higher-level semantic information from the teacher model, resulting in improved performance and mobile-friendly efficiency. To further enhance the learning process, we introduce a dynamic temperature adjustment strategy, which encourages the student model to capture the topology structure of the teacher model more effectively. We conduct experiments to evaluate the effectiveness of the proposed method against several state-of-the-art approaches. Empirical results demonstrate the superiority of TRG across various visual tasks, including those involving imbalanced data. 
Our method consistently outperforms the existing baselines, establishing a new state-of-the-art performance in the field of KD based on big data technologies.</p></div>","PeriodicalId":56017,"journal":{"name":"Big Data Research","volume":"36 ","pages":"Article 100438"},"PeriodicalIF":3.3,"publicationDate":"2024-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139737402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
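The abstract above builds on knowledge distillation with a temperature parameter. The sketch below shows the classic temperature-scaled distillation loss (KL divergence between softened teacher and student outputs), not the Token-level Relationship Graph method or the dynamic temperature strategy the paper proposes. The logits and function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a flatter output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
loss_same = kd_loss(teacher, teacher, temperature=4.0)
loss_diff = kd_loss(teacher, student, temperature=4.0)
```

The loss is zero when the student reproduces the teacher's logits and positive otherwise. The `temperature ** 2` factor is the usual convention for keeping gradient magnitudes comparable across temperatures.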