{"title":"A robust stochastic quasi-Newton algorithm for non-convex machine learning","authors":"Hanger Liu, Yuqing Liang, Jinlan Liu, Dongpo Xu","doi":"10.1007/s10489-025-06475-5","DOIUrl":"10.1007/s10489-025-06475-5","url":null,"abstract":"<div><p>Stochastic quasi-Newton methods have garnered considerable attention within large-scale machine learning optimization. Nevertheless, a stochastic gradient equal to zero poses a significant obstacle to updating the quasi-Newton matrix, thereby impacting the stability of the quasi-Newton algorithm. To address this issue, a checkpoint mechanism is introduced, i.e., checking the value of <span>\(\textbf{s}_k\)</span> before updating the quasi-Newton matrix, which effectively prevents zero increments in the optimization variable and enhances algorithmic stability during iterations. Meanwhile, a novel gradient incremental formulation is introduced to satisfy the curvature condition, facilitating convergence for non-convex objectives. Additionally, limited-memory techniques are employed to reduce storage requirements in large-scale machine learning tasks. The last iterate of the proposed algorithm is proven to converge in a non-convex setting, a stronger guarantee than average- or minimum-iterate convergence. Finally, experiments are conducted on benchmark datasets to compare the proposed RSLBFGS algorithm with other popular first- and second-order methods, demonstrating the effectiveness and robustness of RSLBFGS.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143688587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical-enhanced graph convolutional networks leveraging causal inference for aspect-based sentiment analysis","authors":"Fengling Zhou, Zhixin Li, Canlong Zhang, Huifang Ma","doi":"10.1007/s10489-025-06465-7","DOIUrl":"10.1007/s10489-025-06465-7","url":null,"abstract":"<div><p>Aspect-based sentiment analysis (ABSA) aims to determine the sentiment polarity of a particular aspect in a sentence. Existing research focuses on shortening the distance between opinion words and aspect words, resulting in spurious correlations. At the same time, different dependency-parsing tools introduce different types of noise, undermining the effectiveness of the model. To address these issues, we propose a causal model of hierarchically augmented graph convolutional networks (CausalGCN). Specifically, we subdivide the language features into four relationships and then construct their corresponding mask matrices based on the different relationships. At the same time, we introduce an instrumental variable to eliminate the confounders generated by the parsing tool. Our model then combines the resulting mask matrices with localized attention at multiple levels. We treat the relationships between words and the adjacent tensors as nodes and edges, respectively, resulting in a multi-channel graph. Finally, we utilize graph convolutional networks to enhance relationship-aware node representations. Experimental results on three benchmark datasets demonstrate the effectiveness of the proposed model.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143688586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Harmful data enhanced anomaly detection for quasi-periodic multivariate time series","authors":"Liyuan Wang, Yong Zhou, Wuping Ke, Desheng Zheng, Fan Min, Hui Li","doi":"10.1007/s10489-025-06461-x","DOIUrl":"10.1007/s10489-025-06461-x","url":null,"abstract":"<div><p>Multivariate quasiperiodic time series (MQTS) anomaly detection has demonstrated significant potential across various practical applications, including health monitoring, intelligent maintenance, and quantitative trading. Recent research has introduced diverse methods based on autoencoders (AEs) and generative adversarial networks (GANs) that learn latent representations of normal data and subsequently detect anomalies through reconstruction errors. However, anomalous data in the training set can cause model pollution, which harms the model's ability to reconstruct normal data. The extreme imbalance of current data makes stripping out these anomalies enormously challenging. In this paper, we propose a GAN-based multivariate quasiperiodic time series anomaly detection method called IGANomaly (the I stands for isolation). This method isolates normal and harmful samples via pseudolabeling and then learns harmful data patterns to enhance the reconstruction of normal samples. First, the reconstruction error and the latent feature distribution are jointly analyzed. Bimodal dynamic alignment is achieved through multiview clustering, thus overcoming the limitation of unidimensional determination. Second, dual reconstruction constraints for the generator and a gradient penalty mechanism for the discriminator are constructed. While maintaining the reconstruction quality achieved for normal samples, the propagation path of abnormal features is actively perturbed through a gradient inversion strategy. On three public datasets, IGANomaly achieves F1 scores of 0.811, 0.846, and 0.619, demonstrating an average improvement of 18.9% over the best baseline methods.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143688466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ResU-KAN: a medical image segmentation model integrating residual convolutional attention and atrous spatial pyramid pooling","authors":"Haibin Wang, Zhenfeng Zhao, Qi Liu, Shenwen Wang","doi":"10.1007/s10489-025-06467-5","DOIUrl":"10.1007/s10489-025-06467-5","url":null,"abstract":"<div><p>With the rapid growth of medical imaging data, precise segmentation and analysis of medical images face unprecedented challenges. Addressing small sample sizes, significant variations, and structurally complex medical imaging data to improve the accuracy of early diagnosis has become a key issue in the medical field. This study proposes a Residual U-KAN model (ResU-KAN) to tackle this challenge and improve medical image segmentation accuracy. First, to address the model’s shortcomings in capturing long-distance dependencies and issues like potential gradient vanishing (or explosion) and overfitting, we introduce a Residual Convolution Attention (RCA) module. Second, to expand the model’s receptive field while performing multi-scale feature extraction, we introduce an Atrous Spatial Pyramid Pooling module (ASPP). Finally, experiments were conducted on three publicly available medical imaging datasets, and comparative analysis with existing state-of-the-art methods demonstrated the effectiveness of the proposed approach. Project page: https://github.com/Alfreda12/ResU-KAN</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143688539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vision-based attention deep q-network with prior-based knowledge","authors":"Jialin Ma, Ce Li, Liang Hong, Kailun Wei, Shutian Zhao, Hangfei Jiang, Yanyun Qu","doi":"10.1007/s10489-024-05850-y","DOIUrl":"10.1007/s10489-024-05850-y","url":null,"abstract":"<div><p>Vision-based reinforcement learning (RL) is a potent algorithm for addressing tasks related to visual behavioural decision-making; nevertheless, it operates as a black box, directly training models with images as input in the end-to-end fashion. Therefore, to elucidate the underlying mechanisms of the model and the agent's focus on different features during the decision-making process, a vision-based attention (VA) mechanism is introduced into vision-based RL in this paper. A prior-based mechanism is introduced to address the instability of the attention maps observed by the agent when attention mechanisms are directly integrated into network updates, which increases single-step errors and enlarges cumulative errors. Thus, a vision-based attention deep Q-network (VADQN) method with a prior-based mechanism is proposed. Specifically, prior attention maps are obtained using learnable Gaussian filtering and a spectral residual method. Next, the attention maps are fine-tuned using a self-attention (SA) mechanism to enhance their performance. During training, both the attention maps and the parameters of the policy network are trained concurrently to ensure explanations of the regions of interest during online training. Finally, a series of ablation experiments are conducted on Atari games to compare the proposed method with humans, convolutional neural networks, and other approaches. The results demonstrate that the proposed method not only reveals the regions of interest attended to by DRL during the decision-making process but also enhances DRL performance in certain scenarios. This approach provides valuable insights for understanding and improving the performance of DRL in visual decision-making tasks.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143688572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse pinball Universum nonparallel support vector machine and its safe screening rule","authors":"Hongmei Wang, Ping Li, Yuyan Zheng, Kun Jiang, Yitian Xu","doi":"10.1007/s10489-025-06356-x","DOIUrl":"10.1007/s10489-025-06356-x","url":null,"abstract":"<div><p>The nonparallel support vector machine (NPSVM) is an effective and popular classification technique, which introduces the <span>\(\epsilon\)</span>-insensitive loss function in place of the quadratic loss function in the twin support vector machine (TSVM), giving the model the same sparsity and kernel strategy as the support vector machine (SVM). However, NPSVM is sensitive to noise points and does not utilize the prior knowledge embedded in unlabeled samples. Therefore, to improve its generalization ability and robustness, a sparse pinball Universum nonparallel support vector machine (SPUNPSVM) is first proposed in this paper. On the one hand, the sparse pinball loss is employed to enhance robustness. On the other hand, it exploits Universum data, which do not belong to any class, to embed prior knowledge into the model. Numerical experiments have verified its effectiveness. Furthermore, to further speed up SPUNPSVM, we propose a safe screening rule (SSR-SPUNPSVM) based on its sparsity, which achieves acceleration without sacrificing accuracy. Numerical experiments and statistical tests demonstrate the superiority of our SSR-SPUNPSVM.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dark-ControlNet: an enhanced dehazing universal plug-in based on the dark channel prior","authors":"Yu Yang, Xuesong Yin, Yigang Wang","doi":"10.1007/s10489-025-06439-9","DOIUrl":"10.1007/s10489-025-06439-9","url":null,"abstract":"<div><p>Existing dehazing models have excellent performance in synthetic scenes but still face the challenge of low robustness in real scenes. In this paper, we propose Dark-ControlNet, a generalized and enhanced dehazing plug-in that uses the dark channel prior as a control condition, which can be deployed on existing dehazing models and can be simply fine-tuned to enhance their robustness in real scenes while improving their dehazing performance. We first freeze the backbone network to preserve its encoding and decoding capabilities and input the dark channel prior, which has high robustness, as conditional information to the plug-in network to obtain prior knowledge. Then, we fuse the dark channel prior features into the backbone network in the form of mean-variance alignment via the Haze & Dark (HD) module and guide the backbone network to decode clear images by fine-tuning the plug-in network. The experimental results show that existing dehazing models enhanced by Dark-ControlNet perform well on both synthetic and real datasets.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mastering table tennis with hierarchy: a reinforcement learning approach with progressive self-play training","authors":"Hongxu Ma, Jianyin Fan, Haoran Xu, Qiang Wang","doi":"10.1007/s10489-025-06450-0","DOIUrl":"10.1007/s10489-025-06450-0","url":null,"abstract":"<div><p>Hierarchical Reinforcement Learning (HRL) is widely applied in various complex task scenarios. In complex tasks where simple model-free reinforcement learning struggles, hierarchical design allows for more efficient utilization of interactive data, significantly reducing training costs and improving training success rates. This study delves into the use of HRL based on the model-free policy layer to learn complex strategies for a robotic arm playing table tennis. Through processes such as pre-training, self-play training, and self-play training with top-level winning strategies, the robustness of the lower-level hitting strategies has been enhanced. Furthermore, a novel decay reward mechanism has been employed in the training of the higher-level agent to improve the win rate in adversarial matches against other methods. After pre-training and adversarial training, we achieved an average of 52 rally cycles for the forehand strategy and 48 rally cycles for the backhand strategy in testing. The high-level strategy training based on the decay reward mechanism resulted in an advantageous score when competing against other strategies.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An approach to software defect prediction for small-sized datasets","authors":"Pravas Ranjan Bal, Suyash Shukla, Sandeep Kumar","doi":"10.1007/s10489-025-06458-6","DOIUrl":"10.1007/s10489-025-06458-6","url":null,"abstract":"<div><p>Software defect prediction (SDP) is an active research subject in the software engineering domain. The earlier works on SDP use the same project's data for prediction in future releases, called within-project defect prediction (WPDP). WPDP may not perform well when the data available for training is small in size. In this work, to address the issue of small-sized data, we suggest enhancing the data by borrowing data from other software projects. For better prediction accuracy of learning models, both train and test data must follow the same distribution. However, this may not hold for data transferred from another project, as data from different projects may follow different distributions. To handle this issue, we have proposed a data preprocessing method, namely data transfer-based WPDP (DT-WPDP). Next, we have shown the use of the deep neural network (DNN) for WPDP and compared it with classical machine learning (ML) models such as k-nearest neighbor, decision tree, logistic regression, and Naive Bayes classifiers. Further, we have performed experimental analysis to assess the effect of the proposed DT-WPDP data preprocessing method with DNN and the other ML models. Experimental results show that the proposed approach significantly improves the accuracies of the different models. Among them, the DNN model performed best on all datasets. In the case of very small-sized datasets, which are our main concern in this work, the accuracy of the DNN model is improved by 7% after applying the proposed approach.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-supervised text classification method based on three-way decision with evidence theory","authors":"Ziping Yang, Chunmao Jiang, Chunmei Huang","doi":"10.1007/s10489-024-06129-y","DOIUrl":"10.1007/s10489-024-06129-y","url":null,"abstract":"<div><p>Semi-supervised learning methods play a crucial role in text classification tasks. However, due to the scarcity of labeled training data, the uncertainty of pseudo labels remains an unavoidable problem in semi-supervised text classification. To address this issue, this paper introduces three-way decision theory into a semi-supervised text classification model, which divides the pseudo-labeled samples output by the model into different regions and applies different processing strategies to each. Accurate and effective pseudo-labeled samples are selected as far as possible to expand the original training set. For the model's pseudo-labeled outputs, we use evidence theory to fuse the samples' probability outputs to improve the stability and credibility of the pseudo labels. Experimental results demonstrate that the method introduced in this paper effectively enhances the accuracy of semi-supervised text classification while exhibiting high stability.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-024-06129-y.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}