{"title":"A non-parameter oversampling approach for imbalanced data classification based on hybrid natural neighbors","authors":"Junyue Lin, Lu Liang","doi":"10.1007/s10489-025-06236-4","DOIUrl":"10.1007/s10489-025-06236-4","url":null,"abstract":"<div><p>In recent years, researchers have developed numerous interpolation-based oversampling techniques to tackle class imbalance in classification tasks. However, most existing techniques face the challenge of choosing the parameter k because they rely on k nearest neighbors (kNN). Furthermore, they adopt only a single neighborhood rule, disregarding the positional characteristics of minority samples. This often leads to the generation of synthetic noise or overlapping samples. This paper proposes a non-parameter oversampling framework called the hybrid natural neighbor synthetic minority oversampling technique (HNaNSMOTE). HNaNSMOTE determines an appropriate k value through iterative search and adopts a hybrid neighborhood rule for each minority sample to generate more representative and diverse synthetic samples. Specifically, 1) a hybrid natural neighbor search procedure is conducted on the entire dataset to obtain a data-related k value, which eliminates the need for manually preset parameters. Different natural neighbors are formed for each sample to better identify the positional characteristics of minority samples during the procedure. 2) To improve the quality of the generated samples, the hybrid natural neighbor (HNaN) concept is proposed. HNaN utilizes kNN and reverse kNN to find neighbors adaptively based on the distribution of minority samples. This helps mitigate the generation of synthetic noise or overlapping samples, since it takes into account the existence of majority samples. 
Experimental results on 32 benchmark binary datasets with three classifiers demonstrate that HNaNSMOTE outperforms numerous state-of-the-art oversampling techniques for imbalanced classification in terms of Sensitivity and G-mean.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 5","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142995666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knowledge guided relation enhancement for human-object interaction detection","authors":"Rui Su, Yongbin Gao, Wenjun Yu, Chenmou Wu, Xiaoyan Jiang, Shubo Zhou","doi":"10.1007/s10489-025-06279-7","DOIUrl":"10.1007/s10489-025-06279-7","url":null,"abstract":"<div><p>The Human-Object Interaction (HOI) detection task aims to locate humans and objects, find their matching relationships, and infer their interactions. While existing HOI methods have leveraged the CLIP model, a pre-trained visual-language model capable of understanding both images and text, to improve performance, they still fall short in fully capturing the complexity and fine-grained details of human-object interactions. As a result, their ability to reason about interactions accurately and in-depth remains limited. Therefore, we propose a knowledge-guided interaction perception module that combines multiple relationship information with CLIP’s visual feature information. Then, we utilize prior interaction knowledge from intersection regions to guide the process, resulting in more accurate human-object interaction detection. Moreover, we find that the potential interaction of images relies on subtle visual cues but is masked by other irrelevant information, making it difficult for algorithms to capture the basic features of interaction accurately. To address this, we have designed a human-object salient region enhancement module to enhance the feature information of humans and objects and enable better interaction pairing. 
Experimental results demonstrate that our knowledge-guided relation enhancement method (KGRE) achieves state-of-the-art performance on both the HICO-DET and V-COCO benchmark datasets.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142995636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Insulator defect detection from aerial images in adverse weather conditions","authors":"Song Deng, Lin Chen, Yi He","doi":"10.1007/s10489-025-06280-0","DOIUrl":"10.1007/s10489-025-06280-0","url":null,"abstract":"<div><p>Insulators are key equipment in power systems. Regular detection of defects on the insulator surface and timely replacement of defective insulators are essential for the safe operation of the system. Whereas manual inspection remains a common practice, the recent maturity of unmanned aerial vehicle (UAV) and artificial intelligence (AI) techniques leads the electrical industry to envision an automated, real-time insulator defect detector. However, the existing detection models mainly operate under very limited weather conditions, faltering in generalization and practicality in the wild. To improve on the status quo, this paper proposes a new framework that enables accurate detection of insulator defects in adverse weather conditions, where atmospheric particulates can substantially degrade the quality of aerial images of insulator surfaces. Our proposed framework is embarrassingly simple, yet effective. Specifically, it integrates progressive recurrent network (PReNet) and DehazeFormer to derain and dehaze the noisy aerial images, respectively, and tailors you only look once version 7 (YOLOv7) with a new structured intersection over union (SIoU) loss function and similarity-based attention module (SimAM) to expedite convergence with better deep feature extraction. Two new benchmark datasets, Chinese power line insulator dataset (CPLID)_Rainy and CPLID_Hazy, are developed for empirical evaluation, and the comparative study substantiates the viability and effectiveness of our proposed framework. 
We share our code and dataset at https://github.com/CHLNK/Insulator-defect-detection.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142995579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A modified dueling DQN algorithm for robot path planning incorporating priority experience replay and artificial potential fields","authors":"Chang Li, Xiaofeng Yue, Zeyuan Liu, Guoyuan Ma, Hongbo Zhang, Yuan Zhou, Juan Zhu","doi":"10.1007/s10489-024-06149-8","DOIUrl":"10.1007/s10489-024-06149-8","url":null,"abstract":"<div><p>To address the challenges of low learning efficiency, slow convergence speed and slow inference speed in robot path planning, this paper proposes an improved deep reinforcement learning algorithm. Firstly, the Dueling DQN network architecture is employed, combined with a priority experience replay strategy, to more effectively learn from and utilize experience data. Secondly, the mobility space of the robot is expanded, enhancing the diversity and flexibility of the action space. Additionally, in the action selection process, the Artificial Potential Field (APF) algorithm is introduced to intervene in the action selection with a certain probability, thereby accelerating the convergence process of the network. Simultaneously, the ε-greedy strategy is employed to balance exploration and exploitation, facilitating better exploration of the environment and utilization of existing knowledge. Furthermore, this paper devises novel composite reward functions that comprehensively integrate multiple reward mechanisms to enhance the convergence performance of the algorithm and the quality of path planning. Finally, the effectiveness and superiority of the proposed method are validated through detailed comparative simulations. 
Compared to traditional DQN algorithms, Double DQN, and Double DQN with the APF strategy, the method proposed in this paper demonstrates higher learning efficiency and faster convergence speed, enabling more effective planning of shorter paths.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142995637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A review of the emotion recognition model of robots","authors":"Mingyi Zhao, Linrui Gong, Abdul Sattar Din","doi":"10.1007/s10489-025-06245-3","DOIUrl":"10.1007/s10489-025-06245-3","url":null,"abstract":"<div><p>Being able to experience emotions is a defining characteristic of machine intelligence, and the first step in giving robots emotions is to enable them to accurately recognize and understand human emotions. The initial task to achieve this is to quantify abstract human emotions into concrete data. Combining this with deep learning techniques, a variety of machine models for recognizing human emotions can be constructed to achieve efficient human-robot interaction. Along this line of thought, this paper comprehensively combs through the development paths of emotion quantification, emotion modeling, and machine emotion recognition models based on various signals with practical examples. We focus on summarizing the machine emotion recognition models in recent years, classifying them into four broad categories according to the input signals: vision-based, language-based, physiological signal-based emotion recognition models and multimodal emotion recognition models for in-depth discussion, revealing the strengths and weaknesses of these models and potential application scenarios. In particular, this study identifies multimodal emotion recognition models as a key direction for future research, which significantly improve recognition accuracy and robustness by integrating multiple data sources. Finally, the article discusses the challenges and improvement directions for emotion recognition models, providing an important reference for promoting intelligent and emotional human-computer interaction. Figure 1 
shows the framework of this paper.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142995580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extended topic classification utilizing LDA and BERTopic: A call center case study on robot agents and human agents","authors":"Nevra Kazanci","doi":"10.1007/s10489-024-06106-5","DOIUrl":"10.1007/s10489-024-06106-5","url":null,"abstract":"<div><p>There are two ways to know why customers call the center: from the predetermined calling reason said by the customer to a Robot Agent (RA) before service with a Human Agent (HA) or directly from the customer’s conversation with an HA during the service. Obtaining tags by telling the call reason is easy, but customers can choose the wrong service operation at a non-negligible rate. So, this study used the data from 20,000 Turkish phone conversations with an HA at an inbound call center in the electronic products sector, which are handled for topic extraction with Latent Dirichlet Allocation (LDA) and Bidirectional Encoder Representations from Transformers Topic (BERTopic) topic modeling. First, the customer speech received from the system as text was cleaned and corrected for typos. Then, the models were created, and the topic extraction process was performed. LDA and BERTopic algorithms were evaluated by comparing the machine learning technology results of the call center with HA and RA. The topics covered were used for classification with Light Gradient Boosting Machine (LGBM), linear Support Vector Machines (SVM), Long Short Term Memory (LSTM), and Logistic Regression (LR). The classification and statistical test results showed that LDA is more successful than the guided BERTopic algorithm. In addition, LDA-based classification was also more successful than RA-based classification. 
Although LDA-based LSTM and LR algorithms were superior to others, the best performance according to accuracy score belongs to LDA-based LSTM.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 5","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142995661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural network adaptive terminal sliding mode trajectory tracking control for mechanical leg systems with uncertainty","authors":"Minbo Chen, Likun Hu, Zifeng Liao","doi":"10.1007/s10489-025-06228-4","DOIUrl":"10.1007/s10489-025-06228-4","url":null,"abstract":"<div><p>This paper proposes an adaptive terminal sliding mode control based on neural block approximation for mechanical leg systems characterized by uncertainty and external disturbances. This control is based on a dynamic model of the mechanical leg and introduces an ideal system trajectory as a constraint. The structure of the paper is as follows. First, the RBF neural network is used to approximate the parameters of the dynamic model in blocks. This process is supplemented with a nonsingular terminal sliding mode surface to accelerate the convergence of tracking errors, and an adaptive law is used to adjust weights online to reconstruct the mechanical leg model. Next, an integral sliding mode control robust component is provided to mitigate external disturbances and correct model inaccuracies. Within this step, the Lyapunov method is used to prove the finite-time stability and uniform boundedness of the control system. Finally, the algorithm is validated and tested using the CAPACE rapid control system on a three-degree-of-freedom mechanical leg platform. The experimental results show that the proposed RBFTSM algorithm performs well in the performance evaluation of the MASE and RMSE values, with high trajectory tracking accuracy, anti-interference ability and strong robustness. 
Further evidence is presented to demonstrate the effectiveness and practicality of the proposed method.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 5","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142995551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CrowdFPN: crowd counting via scale-enhanced and location-aware feature pyramid network","authors":"Ying Yu, Feng Zhu, Jin Qian, Hamido Fujita, Jiamao Yu, Kangli Zeng, Enhong Chen","doi":"10.1007/s10489-025-06263-1","DOIUrl":"10.1007/s10489-025-06263-1","url":null,"abstract":"<div><p>Crowd counting has emerged as a prevalent research direction within computer vision, focusing on estimating the number of pedestrians in images or videos. However, existing methods tend to ignore crowd location information and model efficiency, leading to reduced accuracy due to challenges such as multi-scale variations and intricate background interferences. To address these issues, we propose the scale-enhanced and location-aware feature pyramid network for crowd counting (CrowdFPN). First, it can fine-tune each feature layer to focus more on crowd objects within a specific scale through the Scale Enhancement Module. Then, feature information from different layers is effectively fused using the lightweight Adaptive Bi-directional Feature Pyramid Network. Recognizing the importance of crowd location information for accurate counting, we introduce the Location Awareness Module, which embeds crowd location data into the channel attention mechanism while mitigating the effects of complex background interference. Finally, extensive experiments on four popular crowd counting datasets demonstrate the effectiveness of the proposed model. 
The code is available at https://github.com/zf990312/CrowdFPN.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 5","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142995600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contrastive pre-training and instruction tuning for cross-lingual aspect-based sentiment analysis","authors":"Wenwen Zhao, Zhisheng Yang, Song Yu, Shiyu Zhu, Li Li","doi":"10.1007/s10489-025-06251-5","DOIUrl":"10.1007/s10489-025-06251-5","url":null,"abstract":"<div><p>In Natural Language Processing (NLP), aspect-based sentiment analysis (ABSA) has always been one of the critical research areas. However, due to the lack of sufficient sentiment corpora in most languages, existing research mainly focuses on English texts, resulting in limited studies on multilingual ABSA tasks. In this paper, we propose a new pre-training strategy using contrastive learning to improve the performance of cross-lingual ABSA tasks, and we construct a semantic contrastive loss to align parallel sentence representations with the same semantics in different languages. Secondly, we introduce instruction prompt template tuning, which enables the language model to fully understand the task content and learn to generate the required targets through manually constructed instruction prompt templates. During the generation process, we create a more generic placeholder template-based structured output target to capture the relationship between aspect term and sentiment polarity, facilitating cross-lingual transfer. In addition, we have introduced a copy mechanism to improve task performance further. 
We conduct detailed experiments and ablation analyses on eight languages to demonstrate the importance of each of our proposed components.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 5","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142995549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph reconstruction and attraction method for community detection","authors":"Xunlian Wu, Da Teng, Han Zhang, Jingqi Hu, Yining Quan, Qiguang Miao, Peng Gang Sun","doi":"10.1007/s10489-024-05858-4","DOIUrl":"10.1007/s10489-024-05858-4","url":null,"abstract":"<div><p>Community detection, one of the hot issues in complex networks, has attracted a large amount of attention in the past several decades. Although many methods perform well on this problem, they struggle when the networks exhibit more complicated characteristics, e.g., strongly overlapping communities. This paper explores a graph reconstruction and attraction method (GRAM) for community detection. In GRAM, we extract network structure information of a graph by introducing a new passing probability matrix based on Markov Chains, by which a new graph is further reconstructed, and modularity optimization is adopted on the reconstructed graph instead of the original one for non-overlapping community detection. For identifying overlapping communities, we first initialize a cluster with a vital node as an origin of attraction, then the cluster is extended by graph attraction based on the passing probability. This procedure is repeated for the remaining nodes, and each isolated node, if any, is finally assigned to its most attractable cluster. 
Experiments on artificial and real-world datasets have shown the superiority of the proposed method for community detection particularly on the datasets with even more complex, sparse and ambiguous network structures.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 5","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142995599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}