Neurocomputing, Pub Date: 2024-11-21, DOI: 10.1016/j.neucom.2024.128973
Xingxia Wang, Yuhang Liu, Xiang Cheng, Yutong Wang, Yonglin Tian, Fei-Yue Wang
{"title":"ParaDC: Parallel-learning-based dynamometer cards augmentation with diffusion models in sucker rod pump systems","authors":"Xingxia Wang , Yuhang Liu , Xiang Cheng , Yutong Wang , Yonglin Tian , Fei-Yue Wang","doi":"10.1016/j.neucom.2024.128973","DOIUrl":"10.1016/j.neucom.2024.128973","url":null,"abstract":"<div><div>The accurate fault diagnosis of sucker rod pump systems (SRPs) is crucial for the sustainable development of oil & gas. Currently, dynamometer cards (DCs) are widely employed to evaluate the working condition of SRPs, framing fault diagnosis as a pattern recognition problem. While significant attention has been dedicated to enhancing the performance of diagnostic algorithms, the critical role of high-quality DC datasets in improving diagnostic accuracy has been comparatively underexplored. To address issues of incomplete and imbalanced data distribution in existing DC datasets, this paper introduces ParaDC, a novel data augmentation mechanism grounded in parallel learning, to facilitate the transition of DCs from “small data” to “big data” and ultimately realize “deep intelligence”. Under this mechanism, wave equations representing the behavior of SRPs are first utilized to construct the customized “small” DC datasets. Diffusion models are then incorporated to augment the “big” DC datasets and enhance data diversity. Additionally, iterative training combined with human feedback is introduced to optimize and improve the quality of generated DCs, accelerating the pathway towards “deep intelligence”. To further validate the feasibility and effectiveness of ParaDC, extensive computational experiments are conducted to demonstrate its outstanding generative performance. 
Finally, the potential of intelligent diagnostic systems, supported by digital workers, is discussed in the context of Industry 5.0, which is believed to be indispensable in the future industrial diagnostic paradigm.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128973"},"PeriodicalIF":5.5,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2024-11-21, DOI: 10.1016/j.neucom.2024.128935
Yueneng Wang, Zhongjie Mi, Xinghao Jiang, Tanfeng Sun
{"title":"Detection of video transcoding from AVC to HEVC based on Intra Prediction Feature Maps","authors":"Yueneng Wang, Zhongjie Mi, Xinghao Jiang, Tanfeng Sun","doi":"10.1016/j.neucom.2024.128935","DOIUrl":"10.1016/j.neucom.2024.128935","url":null,"abstract":"<div><div>As the High Efficiency Video Coding (HEVC) standard gains popularity, forgers are more inclined to transcode videos into HEVC format from the previous Advanced Video Coding (AVC) format. To verify the originality and authenticity of videos, it is crucial to develop a method for detecting transcoded HEVC videos. In this paper, a novel method is proposed to detect video transcoding from AVC to HEVC (AVC-HEVC). Analysis shows that the intra prediction mode is sensitive to the spatial loss introduced by the previous AVC encoding in the transcoding process. Thus, the intra prediction modes are extracted from the luminance and chrominance components to create Intra Prediction Feature Maps (IPFMs). Subsequently, a Dual-flow Attention-based MobileNet (DAM-Net) is introduced to learn the deep representation of AVC-HEVC transcoding artifacts. Finally, video-level results are derived from the frame-level analysis provided by DAM-Net. Extensive experimental results demonstrate that the proposed method outperforms existing methods in the detection of AVC-HEVC transcoding.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128935"},"PeriodicalIF":5.5,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2024-11-21, DOI: 10.1016/j.neucom.2024.128944
Lijun Gou, Jinrong Yang, Hangcheng Yu, Pan Wang, Xiaoping Li, Tuo Shi
{"title":"A semantic consistent object detection model for domain adaptation based on mixed-class distribution metrics","authors":"Lijun Gou , Jinrong Yang , Hangcheng Yu , Pan Wang , Xiaoping Li , Tuo Shi","doi":"10.1016/j.neucom.2024.128944","DOIUrl":"10.1016/j.neucom.2024.128944","url":null,"abstract":"<div><div>Unsupervised domain adaptation is crucial for mitigating the performance degradation caused by domain bias in object detection tasks. In previous studies, the focus has been on pixel-level and instance-level shift alignment to minimize domain discrepancy. However, it is important to note that this method may inadvertently align single-class instance features with mixed-class instance features that belong to multiple categories within the same image during image-level domain adaptation. This challenge arises because each image in object detection tasks contains objects of multiple categories. To achieve the same category feature alignment between single-class and mixed-class, our method considers features with different mixed categories as a new class and proposes a mixed-classes <span><math><mi>H</mi></math></span>-divergence to reduce domain bias for object detection. To enhance both single-class and mixed-class semantic information, and to achieve semantic separation for the mixed-classes in <span><math><mi>H</mi></math></span>-divergence, we employ Semantic Prediction Models (SPM) and Semantic Bridging Components (SBC). Furthermore, we reweigh the loss of the pixel domain discriminator based on the SPM results to reduce sample imbalance. 
Our extensive experiments on widely used datasets illustrate how our method can robustly improve object detection in domain bias settings.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128944"},"PeriodicalIF":5.5,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142742875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2024-11-21, DOI: 10.1016/j.neucom.2024.128972
Bo Gao, Michael W. Spratling
{"title":"Filter competition results in more robust Convolutional Neural Networks","authors":"Bo Gao , Michael W. Spratling","doi":"10.1016/j.neucom.2024.128972","DOIUrl":"10.1016/j.neucom.2024.128972","url":null,"abstract":"<div><div>Convolutional layers, one of the basic building blocks of deep learning architectures, contain numerous trainable filters for feature extraction. These filters operate independently which can result in distinct filters learning similar weights and extracting similar features. In contrast, competition mechanisms in the brain contribute to the sharpening of the responses of activated neurons, enhancing the contrast and selectivity of individual neurons towards specific stimuli, and simultaneously increasing the diversity of responses across the population of neurons. Inspired by this observation, this paper proposes a novel convolutional layer based on the theory of predictive coding, in which each filter effectively tries to block other filters from responding to the input features which it represents. In this way, filters learn to become more distinct which increases the diversity of the extracted features. When replacing standard convolutional layers with the proposed layers the performance of classification networks is not only improved on ImageNet but also significantly boosted on eight robustness benchmarks, as well as on downstream detection and segmentation tasks. 
Most notably, ResNet50/101/152 robust accuracy increases by 15.9%/20.0%/20.9% under FGSM attack, and by 10.5%/14.7%/15.0% under PGD attack.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128972"},"PeriodicalIF":5.5,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2024-11-21, DOI: 10.1016/j.neucom.2024.128865
Omid Hajipoor, Ahmad Nickabadi, Mohammad Mehdi Homayounpour
{"title":"GPTGAN: Utilizing the GPT language model and GAN to enhance adversarial text generation","authors":"Omid Hajipoor, Ahmad Nickabadi, Mohammad Mehdi Homayounpour","doi":"10.1016/j.neucom.2024.128865","DOIUrl":"10.1016/j.neucom.2024.128865","url":null,"abstract":"<div><div>Training generative models that can generate high-quality and diverse text remains a significant challenge in the field of natural language generation (NLG). Recently, the emergence of large language models (LLMs) like GPT has enabled the generation of text with remarkable quality and diversity. However, building these models from scratch is both time-consuming and resource-intensive, making their comprehensive training practically unfeasible. Nonetheless, the utility of LLMs extends to addressing issues in other models. For instance, generative adversarial models often grapple with the well-known problem of mode collapse during training, leading to a trade-off between text quality and diversity. This means that these models tend to favor quality over diversity. In this study, we introduce a novel approach designed to enhance adversarial text generation by striking a balance between the quality and diversity of generated text, leveraging the capabilities of the GPT language model and other LLMs. To achieve this, we propose an enhanced generator that is guided by the GPT model. Essentially, the GPT model functions as a mentor to the generator, influencing its outputs. To achieve this guidance, we employ discriminators of varying scales on both real data and the texts generated by GPT. Experimental results underscore a substantial enhancement in the quality and diversity of outcomes across two benchmark datasets. The results also demonstrate the generator’s ability to assimilate the output domain of the GPT language model. 
Furthermore, the proposed model exhibits superior performance in human evaluations when compared to other existing adversarial methods.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128865"},"PeriodicalIF":5.5,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2024-11-21, DOI: 10.1016/j.neucom.2024.128970
Yan Zhang, Xudong Zhou, Nian Wang, Jun Tang, Tao Xuan
{"title":"DouN-GNN:Double nodes graph neural network for few-shot learning","authors":"Yan Zhang, Xudong Zhou, Nian Wang, Jun Tang, Tao Xuan","doi":"10.1016/j.neucom.2024.128970","DOIUrl":"10.1016/j.neucom.2024.128970","url":null,"abstract":"<div><div>In recent years, graph neural networks (GNNs) for few-shot learning have garnered significant attention due to their powerful learning capabilities. However, previous methods typically construct nodes using single-modal samples, often overlooking additional information (e.g., high-frequency detail information) that can be provided by other modalities, which may limit model performance. To fully leverage multi-dimensional information from various sample modalities, we propose a novel double-node graph neural network (DouN-GNN). In our approach, each node comprises two sub-nodes, with each sub-node representing a different modality of the sample image. To address the issue of information redundancy between modalities when constructing sub-nodes, we introduce an orthogonal transformation method to orthogonalize the sub-node features. Additionally, we develop a graph update module for double nodes, which alternately updates the nodes and edges of the graph to facilitate the aggregation of multi-dimensional information from multi-modal images. As the number of graph update layers increases, the edge features become more reliable, further enhancing performance. 
Extensive experiments on the miniImageNet, TieredImageNet, and CUB-200-2011 datasets demonstrate that our method outperforms existing state-of-the-art approaches.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128970"},"PeriodicalIF":5.5,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2024-11-21, DOI: 10.1016/j.neucom.2024.128942
Shan Chen, Yuqing Ni, Lingying Huang, Xiaoli Luan, Fei Liu
{"title":"Clustering-based detection algorithm of remote state estimation under stealthy innovation-based attacks with historical data","authors":"Shan Chen , Yuqing Ni , Lingying Huang , Xiaoli Luan , Fei Liu","doi":"10.1016/j.neucom.2024.128942","DOIUrl":"10.1016/j.neucom.2024.128942","url":null,"abstract":"<div><div>This paper investigates a security issue in cyber–physical systems (CPSs) concerning the performance of a multi-sensor remote state estimation under a novel attack called “Optimal Stealthy Innovation-Based Attacks with Historical Data”. The attacker is able to launch a linear attack to modify sensor measurements. The objective of the attacker is to maximize the deterioration of estimation performance while ensuring they remain undetected by the <span><math><msup><mrow><mi>χ</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span> detector. To counteract this new type of attack, a remote state estimator equipped with a detection mechanism that utilizes a Gaussian mixture model (GMM) is employed. We derive the error covariances for the remote state estimator with and without a GMM detection mechanism in a recursive manner under Optimal Stealthy Innovation-Based Attacks with Historical Data. The experimental results demonstrate the superiority of the GMM detection mechanism. However, it is observed that the estimation performance of the GMM-based system deteriorates as the system dimension increases. In order to address this issue, we propose two dimensionality reduction methods, namely kernel principal component analysis (KPCA) and variational autoencoder (VAE), to enhance the estimation performance. 
Finally, the results are illustrated via the simulation examples.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128942"},"PeriodicalIF":5.5,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142743501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2024-11-20, DOI: 10.1016/j.neucom.2024.128957
Shixian Shen, Yong Feng, Nianbo Liu, Ming Liu, Yingna Li
{"title":"DSAFuse: Infrared and visible image fusion via dual-branch spatial adaptive feature extraction","authors":"Shixian Shen , Yong Feng , Nianbo Liu , Ming Liu , Yingna Li","doi":"10.1016/j.neucom.2024.128957","DOIUrl":"10.1016/j.neucom.2024.128957","url":null,"abstract":"<div><div>By exploiting the thermal radiation information from infrared images and the detailed texture information from visible light images, image fusion technology enables more accurate target identification. However, most current image fusion methods primarily rely on convolutional neural networks for cross-modal local feature extraction and do not fully utilize long-range contextual information, resulting in limited performance in complex scenarios. To address this issue, this paper proposes an infrared and visible light image fusion method termed DSAFuse, which is based on dual-branch spatially adaptive feature extraction. Specifically, a unimodal feature mixing module is used for multi-scale spatially adaptive feature extraction on both modal images with shared weights. The extracted features are then inputted into a dual-branch feature extraction module comprising flatten transformer blocks and vanilla blocks, which extract low-frequency texture features and high-frequency local detail features, respectively. Subsequently, features from both modalities are concatenated, and a bimodal feature mixing module reconstructs the fused image to generate semantically rich fusion results. Additionally, to achieve end-to-end unsupervised training, a loss function consisting of decomposition loss, gradient loss, and structural similarity loss is designed. Qualitative and quantitative experimental results demonstrate that our DSAFuse outperforms the state-of-the-art IVIF methods across various benchmark datasets. 
It effectively preserves the texture details and target features of the source images, producing satisfactory fusion results even in harsh environments and enhancing downstream visual tasks.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128957"},"PeriodicalIF":5.5,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142743516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2024-11-20, DOI: 10.1016/j.neucom.2024.128913
Xueer Zhang, Jing Wang, Yunfei Bai, Lu Zhang, Youfang Lin
{"title":"TF4TF: Multi-semantic modeling within the time–frequency domain for long-term time-series forecasting","authors":"Xueer Zhang , Jing Wang , Yunfei Bai , Lu Zhang , Youfang Lin","doi":"10.1016/j.neucom.2024.128913","DOIUrl":"10.1016/j.neucom.2024.128913","url":null,"abstract":"<div><div>Long-term Time Series Forecasting (LTSF) plays a crucial role in real-world applications for early warning and decision-making. Time series inherently embody complex semantic information, including segment semantics, global–local semantics, and multi-view semantics, the thorough mining of which can significantly enhance forecasting accuracy. Previous works have not been able to address all of these kinds of semantic information simultaneously. Meanwhile, thorough mining of semantic information introduces additional computational complexity, making existing multi-semantic information mining methods inefficient. To address this, we propose a multi-semantic method within the Time–Frequency domain For long-term Time-series Forecasting (TF4TF), which balances complex semantic information mining against efficiency. For sequences with segment semantics following the patching process, mining is conducted from both time and frequency domain perspectives to extract Multi-View Semantics. Within this framework, Progressive Local Windows (PLW) blocks and Global Frequency Filtering (GFF) blocks are specifically designed, which achieve efficient mining of multi-scale information while maintaining lower complexity. Ultimately, forecasting is achieved by integrating the semantic information outlined above. 
Our proposed method, TF4TF, has achieved state-of-the-art (SOTA) results on seven real-world time series forecasting datasets.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128913"},"PeriodicalIF":5.5,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph convolutional network for fast video summarization in compressed domain","authors":"Chia-Hung Yeh , Chih-Ming Lien , Zhi-Xiang Zhan , Feng-Hsu Tsai , Mei-Juan Chen","doi":"10.1016/j.neucom.2024.128945","DOIUrl":"10.1016/j.neucom.2024.128945","url":null,"abstract":"<div><div>Video summarization is the process of generating a concise and representative summary of a video by selecting its most important frames. It plays a vital role in the video streaming industry, allowing users to quickly understand the overall content of a video without watching it in its entirety. Most existing video summarization methods require fully decoding the video stream and extracting the features with a pre-trained deep learning model in the pixel domain, which is time-consuming and computationally expensive. To address this issue, this paper proposes a novel method called Graph Convolutional Network-based Compressed-domain Video Summarization (GCNCVS), which directly exploits the compressed-domain information and leverages a graph convolutional network to learn temporal relationships between frames, thereby enhancing its ability to capture contextual and valuable information when generating summarized videos. To evaluate the performance of GCNCVS, we conduct experiments on two benchmark datasets, SumMe and TVSum. Experimental results demonstrate that our method outperforms existing methods, achieving an average F-score of 53.5% on the SumMe dataset and 72.3% on the TVSum dataset. Additionally, the proposed method achieves a Kendall's τ correlation coefficient of 0.157 and a Spearman's ρ correlation coefficient of 0.205 on the TVSum dataset. 
Our method also significantly reduces computational time, which enhances the feasibility of video summarization in video streaming environments.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128945"},"PeriodicalIF":5.5,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142759435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}