{"title":"A single image deraining algorithm guided by text generation based on depth information conditions","authors":"Xing Wei, Xiufen Ye, Xinkui Mei, Junting Wang, Heming Ma","doi":"10.1016/j.asoc.2025.113506","DOIUrl":"10.1016/j.asoc.2025.113506","url":null,"abstract":"<div><div>Currently, image denoising algorithms based on text-to-image diffusion models often encounter issues with disordered internal structure layouts and discrepancies in detail when generating high-resolution images. To address these issues, we proposed a single image deraining algorithm guided by text generation based on depth information conditions. We designed a depth information encoder aimed at leveraging the depth information in rainy images to enhance the spatial mapping between text-to-image and image-to-text, thereby improving the internal structural layout of the generated images. To make the texture details of the generated image domain more similar to those of the original image domain, we designed a Cross Attention module that uses difference information to make the images in both domains more similar, thereby enhancing the guidance of existing deraining algorithms. Experimental results on multiple benchmark datasets demonstrate that the proposed algorithm outperforms state-of-the-art image deraining methods in both visual quality and quantitative performance. On average, it achieves an improvement of 0.46 in SSIM and 0.79 dB in PSNR, effectively removing rain streaks while preserving fine image details and maintaining structural consistency. We will release our code on <span><span>Github</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"181 ","pages":"Article 113506"},"PeriodicalIF":7.2,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining variable-length anomalies in road network time series: A two-stage optimization framework","authors":"Hendri Sutrisno , Frederick Kin Hing Phoa","doi":"10.1016/j.asoc.2025.113516","DOIUrl":"10.1016/j.asoc.2025.113516","url":null,"abstract":"<div><div>Detecting variable-length anomalous subsequences in network traffic is challenging due to the absence of fixed temporal patterns. Anomalies may begin at any point, last for unpredictable durations, and exhibit diverse behaviors depending on the context. Without prior knowledge of where or how long an anomaly may occur, any motif in the time series could be considered anomalous. This uncertainty increases the search complexity, as the method must explore many possible subsequences with different lengths and timings. Since labeled anomalies are often unavailable, the problem is framed as an unsupervised discovery task. It also means the methods do the search and validate anomalies without prior training. This issue makes the problem not only computationally challenging but also conceptually difficult. Existing methods often struggle because they rely on exhaustive searches that require heavy computation. Moreover, when spatial–temporal dynamics are considered, such as in road network traffic where anomalies can propagate across different locations with variable delays, the problem becomes even more complex, as the detection method must account for both when and where anomalies occur. To address these challenges, we propose a two–stage optimization framework called <span><math><mrow><mi>M</mi><msub><mrow><mi>P</mi></mrow><mrow><mi>O</mi><mi>P</mi><mi>T</mi></mrow></msub></mrow></math></span>. In the first stage, the matrix profile is applied to signal potential anomaly locations. In the second stage, a metaheuristic optimizer refines the starting point and length of each detected signal. During refinement, Latin hypercube sampling is used to reduce the number of comparisons between candidate signals and neighboring patterns without sacrificing generalization. We validate the proposed framework using network traffic flow data from Taiwan’s freeway system. Experimental results show that <span><math><mrow><mi>M</mi><msub><mrow><mi>P</mi></mrow><mrow><mi>O</mi><mi>P</mi><mi>T</mi></mrow></msub></mrow></math></span> is at least 26 times faster than benchmarking methods while achieving up to 28.5% higher search accuracy, measured based on relative anomaly scores. These results demonstrate the practical applicability and efficiency of our work for detecting complex anomalies in network time series.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"181 ","pages":"Article 113516"},"PeriodicalIF":7.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144536248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Disentangled reflectance-ambient feature learning for day-night vehicle re-identification","authors":"Tae-Moon Seo, Dong-Joong Kang","doi":"10.1016/j.asoc.2025.113539","DOIUrl":"10.1016/j.asoc.2025.113539","url":null,"abstract":"<div><div>Vehicle re-identification across different time domains is a critical task in intelligent surveillance systems, aiming to match the same vehicle across multiple non-overlapping cameras under varying lighting conditions. Existing methods often struggle to handle the domain discrepancy between daytime and nighttime images, mainly due to lighting variation and glare. To address this, a novel framework named Reflectance-Ambient Feature Learning (RAFL) is proposed, which disentangles structural reflectance features from ambient lighting effects using offline reflectance decomposition. By integrating separated batch normalization and a domain alleviation module, the framework effectively minimizes the domain gap while preserving identity-discriminative features. Experimental results on benchmark datasets demonstrate that the proposed method achieves state-of-the-art performance, with up to 4.0 % improvement in Rank-1 accuracy and over 1.5 % gain in mean Average Precision compared to existing methods. This highlights the effectiveness of feature disentanglement for robust cross-domain vehicle re-identification in real-world surveillance.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"181 ","pages":"Article 113539"},"PeriodicalIF":7.2,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144557213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adversarial multi-source domain generalization approach for power prediction in unknown photovoltaic systems","authors":"Sizhe Liu , Yongsheng Qi , Dongze Li , Liqiang Liu , Shunli Wang , Carlos Fernandez , Xuejin Gao","doi":"10.1016/j.asoc.2025.113495","DOIUrl":"10.1016/j.asoc.2025.113495","url":null,"abstract":"<div><div>Accurate forecasting of power output for previously unseen photovoltaic installations is of critical importance to the reliability and efficiency of renewable-energy management systems. Existing data-driven PV prediction techniques rely primarily on historical measurements from familiar systems, which constrains their applicability to new sites without prior observations. To address this limitation, we introduce a Generative Adversarial Domain enhanced Prediction Network (GADPN). GADPN employs an adversarial generator to synthesize diverse pseudo domain samples that mitigate distributional discrepancies between source and target domains. Through an alternating optimization regime, the framework enforces both semantic consistency and manifold regularization constraints to align synthesized and empirical feature representations, while a Transformer-based predictor captures local and global temporal dynamics. We evaluate the proposed approach on nine geographically and capacity-diverse PV systems (ranging from 2.16 kW to 45.78 kW) under a zero-sample setting. Experimental results demonstrate that GADPN achieves coefficients of determination exceeding 0.97 in eight of the nine cases and attains a peak coefficient of determination of 0.9993, outperforming state-of-the-art baselines. These findings confirm GADPN’s effectiveness for robust, zero-sample generalization in PV power forecasting.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"181 ","pages":"Article 113495"},"PeriodicalIF":7.2,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144522342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A distributed training method with intergenerational accumulation and cross-node random drop for mechanical fault diagnosis","authors":"Zongliang Xie , Kaiyu Zhang , Jinglong Chen , Chi-Guhn Lee , Shuilong He","doi":"10.1016/j.asoc.2025.113532","DOIUrl":"10.1016/j.asoc.2025.113532","url":null,"abstract":"<div><div>With increasing complexity of deep neural networks and continuous expansion of training datasets, the computational cost of model training grows exponentially. To reduce training time, distributed training systems leveraging multiple computing devices have been developed for computational acceleration. However, compared with the rapidly increasing computing power, the communication bandwidth between devices increases slowly and becomes a bottleneck restricting the efficiency of distributed training. In this paper, an efficient distributed training method called gradient transfer compression (GTC) is proposed to reduce communication overhead and improve training efficiency. The methodology involves three key techniques: (1) Intergenerational accumulation, where gradients generated over multiple iterations are stored and accumulated, reducing the frequency of communication between computing devices; (2) Cross-node random drop, which synchronizes gradients with a specified ratio to decrease network traffic while ensuring model convergence; and (3) Mixed precision training, which reduces the bandwidth required for gradient communication. The effectiveness of GTC is demonstrated through experiments on two rolling bearing datasets. Compared with the conventional PyTorch distributed training method, the proposed method reduces the GPU memory usage by 97.10 % and 14.02 %, increases the training efficiency by 24.74 % and 8.03 % respectively in two cases, while maintaining the diagnostic performance of the model.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"181 ","pages":"Article 113532"},"PeriodicalIF":7.2,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144536163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Single image and video generation using a receptive diffusion model with convolutional spatiotemporal blocks","authors":"Yingli Hou , Wei Zhang , Zhiliang Zhu , Hai Yu","doi":"10.1016/j.asoc.2025.113509","DOIUrl":"10.1016/j.asoc.2025.113509","url":null,"abstract":"<div><div>The generation of images from a single natural image/video has garnered significant attention due to its broad applications. However, existing methods training on a single input image or video face two key limitations. First, GAN-based approaches, relying on multiple models trained at progressively increasing scales, often lead to error accumulation and artifacts in generated results. Second, while diffusion models offer superior quality and diversity, they require extensive training time for a single input and are limited to generation tasks without the ability to edit existing images or videos. To address these challenges, we propose a <strong><u>Uni</u></strong>fied Diffusi<strong><u>on</u></strong> Model for Single Image/Video Training, named Union, achieving a balanced trade-off between computational efficiency and visual quality. Specifically, we introduce: (1) a unified model trained at a single scale, avoiding the error accumulation seen in multi-scale models; and (2) a novel Receptive DDPM framework with convolutional spatiotemporal blocks (CS-Block) that learns patch distribution of a natural image rather than simple image replication. The CS-Block uses ConvNext and spatiotemporal attention mechanisms to capture local and global relationships in temporal and frequency domains, enabling efficient adaptation to the patch-level receptive field of natural images and videos. Extensive experiments across image and video tasks demonstrate that Union outperforms other methods, achieving the best LPIPS score on the public Places50 dataset and excelling in high-resolution video generation, providing an optimal balance between computational cost and performance. The training and generated images/videos are available at: <span><span>https://github.com/hylneu/union.git</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"181 ","pages":"Article 113509"},"PeriodicalIF":7.2,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144522117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mask-based Self-supervised Network Intrusion Detection System","authors":"Xiaoya Lu , Yifan Liu , Fan Feng , Yi Liu , Zhenpeng Liu","doi":"10.1016/j.asoc.2025.113498","DOIUrl":"10.1016/j.asoc.2025.113498","url":null,"abstract":"<div><div>A significant number of network intrusion detection systems utilize unsupervised anomaly detection methodologies, the majority of which fail to account for the potential for contamination in the data, resulting in suboptimal detection outcomes.This paper proposes an unsupervised method, designated MS-IDS (Mask-based Self-supervised Network Intrusion Detection System), which employs the techniques of mask shielding and Stacked Sparse Autoencoder (SSAE).MS-IDS is trained on data that has been contaminated in some way, generating a variety of masks through the process of learning. These masked inputs are subsequently reconstructed by SSAE. A composite loss function is devised, encompassing losses from both the mask unit and the SSAE. During the training phase, the combined loss function is optimized with the objective of identifying the optimal parameters and transformations for the SSAE. In the testing phase, the loss function assigns a score to each sample, which is used to classify outliers based on their scores. The performance of MS-IDS was evaluated across four intrusion datasets: The datasets used for evaluation were NSL-KDD, CIC-IDS2017, ToN-IoT, and CIC-DDOS2019. The results demonstrate that even when varying levels of contamination are introduced into the benign traffic, MS-IDS maintains robust performance with minimal decline. Notably, MS-IDS outperforms other models in terms of accuracy, AUC-ROC, and F1 scores, and its ability to detect attacks in contaminated data undergoes significant enhancement.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"181 ","pages":"Article 113498"},"PeriodicalIF":7.2,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144522118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interval-valued inclusion–exclusion integral to aggregate interval-valued data based on multiple admissible orders","authors":"Si Xu Zhu, Bo Wen Fang","doi":"10.1016/j.asoc.2025.113501","DOIUrl":"10.1016/j.asoc.2025.113501","url":null,"abstract":"<div><div>As a nonlinear fuzzy aggregation function, the interval-valued Choquet integral is widely used in decision analysis, rule-based classification, and information fusion. However, its linear order between intervals is usually provided via human intervention, and different settings significantly affect calculation results. As an extended form of the Choquet integral, the Inclusion–Exclusion (IE) integral does not require sorting input variables, a property that demonstrates remarkable potential in information fusion. Nevertheless, this concept has not been extended to the interval-valued domain. Therefore, an interesting challenge arises: how to utilize the IE integral’s feature of not requiring input sorting to address the sensitivity of interval-valued Choquet integrals to order relation selection. Aiming at this sensitivity, this study proposes an interval-valued IE integral and constructs a novel aggregation model to mitigate order dependency. The research includes: (1) defining the interval-valued IE integral based on IE integral theory and analyzing its properties; (2) constructing an aggregation model by integrating interval-valued IE and Choquet integrals; (3) deriving gradient formulas for model parameters and providing the model’s computational algorithm; (4) verifying the effectiveness of the proposed method through numerical experiments and benchmark public datasets. The results provide a new methodology for addressing the order relation sensitivity of fuzzy integrals and expand the theoretical application scope of IE integrals.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"181 ","pages":"Article 113501"},"PeriodicalIF":7.2,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144522345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VHTformer: A joint query perception method for visual-haptic-textual information based on Transformer","authors":"Liang Li , Guochu Chen , Haiyan Wang , Baojiang Li , Bin Wang , Zizhen Yi , Chunbo Zhao","doi":"10.1016/j.asoc.2025.113529","DOIUrl":"10.1016/j.asoc.2025.113529","url":null,"abstract":"<div><div>Multimodal information fusion research struggles with aligning heterogeneous modalities and addressing data imbalance, especially when integrating visual, haptic, and text—three modalities offering complementary perceptual and semantic features. Current research focuses on Transformers for unimodal and vision-haptics bimodal tasks, neglecting tri-modal integration. Leveraging text's semantic bridging capacity could address this limitation in cross-sensory learning. We propose VHTformer, a Transformer-based framework designed to unify visual, haptic, and textual modalities via joint query learning. The model leverages hierarchical attention mechanisms: self-attention refines intra-modal features (e.g., extracting texture from haptic signals or contextual semantics from text). Meanwhile, cross-attention aligns spatial-semantic patterns across modalities through learnable joint queries. This enables synergistic fusion of geometric shapes (vision), material properties (haptics), and descriptive attributes (text). Experiments were conducted on three multimodal datasets—ObjectFolder 2.0, Touch and Go, and ObjectFolder Real—covering a total of 100 + object categories with diverse material and shape properties. To mitigate class imbalance and ensure statistical reliability, we adopted stratified 5-fold cross-validation. In addition, we conducted robustness evaluations under Gaussian noise injection to verify the model's robustness. VHTformer achieves up to 99.55 % recognition accuracy and demonstrates strong robustness, highlighting the value of tri-modal integration for comprehensive object understanding.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"181 ","pages":"Article 113529"},"PeriodicalIF":7.2,"publicationDate":"2025-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144549101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A transformer-based self-supervised pre-training model for time series prediction","authors":"Zhengrong Sun, Junhai Zhai, Yang Cao, Feng Zhang","doi":"10.1016/j.asoc.2025.113491","DOIUrl":"10.1016/j.asoc.2025.113491","url":null,"abstract":"<div><div>Multivariate time series forecasting is ubiquitous in the real world. The performance of prediction model is determined by its representation ability. At present, self-supervised pre-training is the main method to improve the representation ability of prediction models. However, the periodic characteristics of time series are rarely considered in the existing pre-training models. Our experimental study shows that the periodic characteristics of time series have a great impact on the performance of self-supervised pre-training models. To address this issue, we propose a novel self-supervised prediction model, SMformer. SMformer has two distinctive features: (1) A new patch partition Module is innovatively introduced into backbone model transformer using the periodic property of time series. (2) Two pretext tasks, shuffle and mask, are design for the self-supervised pre-training of the model SMformer. We conducted extensive experiments on seven benchmark datasets, and the experimental results demonstrate that SMformer significantly outperforms prior comparison baselines.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"181 ","pages":"Article 113491"},"PeriodicalIF":7.2,"publicationDate":"2025-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144513777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}