Applied Soft Computing: Latest Articles

Inter-block ladder-style transformer model with multi-subspace feature adjustment for object re-identification
IF 7.2 | Zone 1 | Computer Science
Applied Soft Computing Pub Date: 2025-03-08 DOI: 10.1016/j.asoc.2025.112961
Zhi Yu, Zhiyong Huang, Mingyang Hou, Jiaming Pei, Daming Sun
Abstract: Object re-identification aims to retrieve specific objects across multiple cameras and has garnered significant attention. Currently, transformer-based methods have taken a dominant position. However, most approaches directly embed standard transformer encoders for feature extraction. These methods handle all patch tokens uniformly, failing to distinguish salient from non-salient patch tokens for discriminative feature expression. To this end, this work proposes a novel inter-block ladder-style transformer (IBLSFormer) for object re-identification. First, a multi-subspace feature adjustment (MSFA) module is designed to adjust the patch features via class-patch interaction in multiple subspaces, including a Euclidean distance subspace, a cosine distance subspace, and a KL divergence subspace. The MSFA module can enhance the salient patch tokens and weaken the non-salient patch tokens simultaneously to focus on discriminative patches. Afterwards, an IBLSFormer is designed by inserting MSFA modules with distinct configurations into the vision transformer. Narrow-to-wide ladder-style constraints are embedded in the MSFA modules based on embedding depth to highlight the feature differences across levels and improve feature learning. Our method achieves mAP/Rank-1 of 88.7%/95.3%, 81.4%/90.4%, 80.0%/97.1%, and 89.4%/83.6% on four object re-identification datasets. Extensive experiments show IBLSFormer is superior to other methods in learning discriminative and robust representations for object re-identification.
Volume 174, Article 112961.
Citations: 0
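The abstract above names three subspaces for class-patch interaction (Euclidean distance, cosine distance, KL divergence) but does not give the adjustment rule itself. The following is only a minimal sketch of how the three measures between a class token and each patch token could be computed; the function names, the softmax normalization before the KL term, and the toy vectors are all assumptions for illustration, not the authors' MSFA module.

```python
import math


def softmax(v):
    """Turn a feature vector into a probability distribution (assumed step for KL)."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def kl_divergence(a, b):
    """KL(softmax(a) || softmax(b)); softmax keeps every probability positive."""
    p, q = softmax(a), softmax(b)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def class_patch_scores(cls_token, patch_tokens):
    """Hypothetical helper: one dict of the three measures per patch token."""
    return [
        {
            "euclidean": euclidean_distance(cls_token, patch),
            "cosine": cosine_distance(cls_token, patch),
            "kl": kl_divergence(cls_token, patch),
        }
        for patch in patch_tokens
    ]


# Toy example: the first patch equals the class token, the second does not.
scores = class_patch_scores([1.0, 2.0, 3.0], [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
```

A patch whose three scores are all small is close to the class token in every subspace; how such scores are turned into enhancement or suppression weights is specific to the paper and is not reproduced here.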
A method based on Generative Adversarial Networks for disentangling physical and chemical properties of stars in astronomical spectra
IF 7.2 | Zone 1 | Computer Science
Applied Soft Computing Pub Date: 2025-03-08 DOI: 10.1016/j.asoc.2025.112954
Raúl Santoveña, Carlos Dafonte, Minia Manteiga
Abstract: This work presents the design of an autoencoder architecture that uses adversarial training in the context of astrophysical spectral analysis. We aim to develop an intermediate representation of stellar spectra in which the influence of the most prominent physical properties, such as surface temperature and gravity, is effectively removed. This allows the variance within the representation to primarily reflect the effects of the star’s chemical composition on the spectrum. We apply a deep learning scheme to disentangle, in the latent space, the desired parameters from the rest of the information contained in the data. This work proposes a version of adversarial training that uses one discriminator per parameter to be disentangled, avoiding the exponential combination that occurs when using a single discriminator. Synthetic astronomical data from the APOGEE and Gaia surveys were used to test the method’s effectiveness. Our approach demonstrates a marked improvement in disentangling, reflected in an improvement in the R² score of up to 0.7. Additionally, we introduce an ad-hoc framework, GANDALF, designed to facilitate visualization and adaptation of the methodology to other domains in astronomical spectroscopy.
Volume 174, Article 112954.
Citations: 0
Graph convolutional networks with multi-scale dynamics for traffic speed forecasting
IF 7.2 | Zone 1 | Computer Science
Applied Soft Computing Pub Date: 2025-03-08 DOI: 10.1016/j.asoc.2025.112966
Dongping Zhang, Hao Lan, Mengting Wang, Jiabin Yu, Xinghao Jiang, Shifeng Zhang
Abstract: Accurate traffic speed forecasting remains challenging due to complex and variable road conditions. Prior research often overlooks both coarse-grained and fine-grained features in traffic data, hindering a comprehensive capture of the data's temporal dependencies. While graph convolutional networks (GCNs) are commonly employed to extract spatial dependencies in traffic networks, they typically view these networks from a static standpoint, failing to consider the dynamic nature of traffic network structures. This limitation restricts their effectiveness in modeling traffic networks. To address these issues, this study introduces a novel deep learning-based spatial-temporal model for precise traffic speed forecasting. This model incorporates a newly developed multi-scale transformation method, which enhances the coarse-grained and fine-grained features in traffic speed data by transforming and fusing the data, enabling a more thorough modeling of its temporal dependencies. Additionally, we propose an innovative graph interaction strategy that combines the generated graphs with a dynamic graph convolutional network to effectively mine the dynamic characteristics of traffic network structures, thereby enhancing the model's accuracy. Extensive experiments on two real-world datasets have demonstrated the robustness and superior performance of the proposed model, with improvements ranging from 2.2% to 6.1% over state-of-the-art baselines.
Volume 174, Article 112966.
Citations: 0
Art style classification via self-supervised dual-teacher knowledge distillation
IF 7.2 | Zone 1 | Computer Science
Applied Soft Computing Pub Date: 2025-03-08 DOI: 10.1016/j.asoc.2025.112964
Mei Luo, Li Liu, Yue Lu, Ching Y. Suen
Abstract: Art style classification plays a crucial role in computational aesthetics. Traditional deep learning-based methods for art style classification typically require a large number of labeled images, which are scarce in the art domain. To address this challenge, we propose a self-supervised learning method specifically tailored for art style classification. Our method effectively learns image style features using unlabeled images. Specifically, we introduce a novel self-supervised learning approach based on the popular contrastive learning framework, incorporating a unique dual-teacher knowledge distillation technique. The two teacher networks provide complementary guidance to the student network. Each teacher network focuses on extracting distinct features, offering diverse perspectives. This collaborative guidance enables the student network to learn detailed and robust representations of art style attributes. Furthermore, recognizing the Gram matrix’s capability to capture image style through feature correlations, we explicitly integrate it into our self-supervised learning framework. We propose a relation alignment loss to train the network, leveraging image relationships. This loss function has shown promising results compared to the commonly used InfoNCE loss. To validate our proposed method, we conducted extensive experiments on three publicly available datasets: WikiArt, Pandora18k, and Flickr. The experimental results demonstrate the superiority of our method, significantly outperforming state-of-the-art self-supervised learning methods. Additionally, when compared with supervised methods, our approach shows competitive results, notably surpassing supervised learning methods on the Flickr dataset. Ablation experiments further verify the efficacy of each component of our proposed network. The code is publicly available at: https://github.com/lm-oc/dual_signal_gram_matrix.
Volume 174, Article 112964.
Citations: 0
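The art style abstract above relies on the Gram matrix capturing style through feature correlations. As a rough illustration of that idea only, the sketch below computes a Gram matrix from a small feature map stored as plain Python lists; the function name, the list-of-channels layout, and the division by the number of spatial positions are assumptions, and nothing here is taken from the authors' repository.

```python
def gram_matrix(features):
    """Channel-by-channel correlation of a feature map.

    features: list of C channels, each a list of N flattened activations.
    Returns a C x C matrix whose (i, j) entry is the mean product of
    channel i and channel j over the N positions (normalization assumed).
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


# Toy feature map with 2 channels and 2 spatial positions.
style = gram_matrix([[1.0, 2.0], [3.0, 4.0]])
```

Because the spatial positions are summed out, two images with similar textures but different layouts yield similar Gram matrices, which is why the matrix is commonly used as a style descriptor.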
An adaptive heuristic algorithm with a collaborative search framework for multi-UAV inspection planning
IF 7.2 | Zone 1 | Computer Science
Applied Soft Computing Pub Date: 2025-03-08 DOI: 10.1016/j.asoc.2025.112969
Chang He, Haibin Ouyang, Weiqing Huang, Steven Li, Chunliang Zhang, Weiping Ding, Zhi-Hui Zhan
Abstract: Multi-UAV inspection path planning has become an important research topic for completing inspection tasks before the data acquisition deadline. In this study, we propose an adaptive heuristic algorithm with a collaborative search framework, named Sa-VCO, to solve the multi-UAV inspection path planning problem. Our study includes three main novelties. First, we design a region-gridding disperse approach that transforms the primitive target regions into a set of standard target subregions, allowing the target regions with greater costs to be inspected by multiple UAVs. Second, we propose an adaptive initial solution generation strategy that uses the graph structure constructed from all targets to reduce redundant computation. Third, we establish a collaborative search framework to enhance search efficiency and increase population diversity. A large number of multiple-perspective comparative experiments are provided to test Sa-VCO's performance, and the comparison results demonstrate that Sa-VCO achieves better results than other advanced algorithms, especially on large-scale data sets.
Volume 174, Article 112969.
Citations: 0
Evaluation of international market entry strategies for mineral oil companies using a neutrosophic SWARA-CRADIS methodology
IF 7.2 | Zone 1 | Computer Science
Applied Soft Computing Pub Date: 2025-03-08 DOI: 10.1016/j.asoc.2025.112976
Ahmet Aytekin, Hilal Öztürk Küçük, Makbule Aytekin, Vladimir Simic, Dragan Pamucar
Abstract: Businesses encounter risks when entering new countries, but there are also opportunities. The strategy a business employs when opting to enter a new market is directly tied to its success. In this regard, the study provides an approach for evaluating new market entry strategies for businesses operating in the mineral oil sector and manufacturing industries. The approach includes the development of the type 2 neutrosophic step-wise weight assessment ratio analysis (SWARA)-compromise ranking of alternatives from distance to ideal solution (CRADIS) methodology, which aims to solve the problem by determining the candidate strategies and the criteria to be utilized in their evaluation. The findings revealed that market conditions are the most crucial criterion in selecting strategies for mineral oil companies intending to enter new markets. The magnitude and development potential of the new market to be entered, as well as the status of the actors, all have an impact on market conditions, which are critical for businesses. Moreover, foreign direct investment is found to be the best market entry strategy. This strategy arises because businesses aim to maximize the potential in the market they have recently entered, as well as other factors such as government incentives. The study is expected to benefit production enterprises, the mineral oil sector, the marketing field, and the literature by identifying criteria and option sets, finding the importance degrees of the criteria, selecting the ideal entry strategy, and proposing a methodology for handling uncertain data.
Volume 174, Article 112976.
Citations: 0
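The abstract above weights criteria with a type 2 neutrosophic extension of SWARA. The neutrosophic arithmetic is not described in the abstract, so the sketch below shows only the classical crisp SWARA weighting step for orientation; the function name and the sample inputs are hypothetical, and this is not the authors' methodology.

```python
def swara_weights(comparative_importance):
    """Classical (crisp) SWARA weights.

    comparative_importance[j] is s_j, the expert-assessed importance of
    criterion j-1 relative to criterion j, with criteria already sorted
    from most to least important. The first entry is ignored because the
    top criterion has nothing above it.
    """
    q = []
    for j, s in enumerate(comparative_importance):
        if j == 0:
            q.append(1.0)             # q_1 = 1 (k_1 = 1)
        else:
            k = s + 1.0               # k_j = s_j + 1
            q.append(q[-1] / k)       # q_j = q_{j-1} / k_j
    total = sum(q)
    return [value / total for value in q]  # w_j = q_j / sum(q)


# Three criteria, each judged 25% less important than the one before it.
weights = swara_weights([0.0, 0.25, 0.25])
```

The resulting weights are non-increasing and sum to one; a ranking method such as CRADIS would then use them to score the alternatives.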
Efficiency analysis of binary metaheuristic optimization algorithms for uncapacitated facility location problems
IF 7.2 | Zone 1 | Computer Science
Applied Soft Computing Pub Date: 2025-03-08 DOI: 10.1016/j.asoc.2025.112968
Tahir Sag, Aysegul Ihsan
Abstract: This paper introduces binary adaptations of four metaheuristic optimization algorithms: the Binary Coati Optimization Algorithm (BCOA), Binary Mexican Axolotl Optimization Algorithm (BMAO), Binary Dynamic Hunting Leadership Optimization (BDHL), and Binary Aquila Optimizer (BAO). These algorithms were evaluated for their effectiveness in solving Uncapacitated Facility Location (UFL) problems, which aim to minimize total costs associated with customer-facility allocations and facility opening expenses by determining the optimal number of open facilities. Using 15 UFL problem instances from the OR-Lib dataset, the study assessed algorithm performance across 17 transfer functions (TFs), including S-shaped, V-shaped, and other variants, to address the binary nature of these problems. Performance metrics such as the best, worst, average, standard deviation, and GAP values were analyzed for each binary algorithm. Additionally, statistical analyses were conducted to further assess algorithmic performance. The Kolmogorov-Smirnov (KS) normality test was applied to determine the distribution characteristics of the results, followed by either ANOVA or Kruskal-Wallis tests, depending on the normality of the distributions. These statistical tests revealed significant differences in algorithm performance across different problem instances. Rank values were calculated based on GAP values and CPU times to facilitate comparisons across algorithm versions for the 15 UFL problems. Results underscored the critical role of TF selection in optimizing algorithm efficiency: BCOA performed best with TF11, BMAO with TF16 and TF17, BAO with TF10, and BDHL with TF15. Finally, a performance comparison on GAP values was conducted with two state-of-the-art PSO variants adapted for binary optimization. The proposed algorithms demonstrated either superior or competitive performance in solving UFL problems, validating their efficacy in complex optimization tasks and highlighting the influence of TFs on their performance.
Volume 174, Article 112968.
Citations: 0
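To make the UFL abstract above more concrete, here is a minimal sketch of two standard transfer function shapes and of the UFL objective that a binary solution is scored on. The sigmoid and |tanh| forms are common textbook representatives and are assumed here; they are not claimed to match any of the paper's 17 numbered TFs, and all function names and the toy instance are hypothetical.

```python
import math
import random


def s_shaped(x):
    """Sigmoid transfer function: maps a real value to a probability of bit = 1."""
    return 1.0 / (1.0 + math.exp(-x))


def v_shaped(x):
    """|tanh| transfer function: maps a real value to a probability of flipping."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: each bit is set to 1 with probability s_shaped(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule: each current bit is flipped with probability v_shaped(x)."""
    return [
        1 - bit if rng.random() < v_shaped(x) else bit
        for x, bit in zip(position, current_bits)
    ]


def ufl_cost(open_bits, opening_costs, service_costs):
    """Total UFL cost: opening costs plus each customer's cheapest open facility.

    service_costs[i][j] is the cost of serving customer i from facility j.
    A solution with no open facility is infeasible and scored as infinity.
    """
    open_ids = [j for j, bit in enumerate(open_bits) if bit]
    if not open_ids:
        return math.inf
    fixed = sum(opening_costs[j] for j in open_ids)
    service = sum(min(row[j] for j in open_ids) for row in service_costs)
    return fixed + service


# Toy instance: 2 facilities, 2 customers.
opening = [10.0, 20.0]
service = [[1.0, 5.0], [4.0, 2.0]]
rng = random.Random(0)
candidate = binarize_s([0.3, -1.2], rng)
candidate_cost = ufl_cost(candidate, opening, service)
```

In a binary metaheuristic, the continuous update of the original algorithm produces `position`, a transfer function converts it to bits, and `ufl_cost` serves as the fitness to be minimized.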
Grey prediction evolution algorithm with a dominator guidance strategy for solving multi-level image thresholding
IF 7.2 | Zone 1 | Computer Science
Applied Soft Computing Pub Date: 2025-03-08 DOI: 10.1016/j.asoc.2025.112947
Peixin Yang, Zhongbo Hu, Yang Zhou, Qinghua Su, Wentao Xiong
Abstract: Multi-level thresholding (MLT) stands as a pivotal method for extracting target information from images. Meta-heuristic algorithms provide an efficient way to implement MLT and leave more room for accuracy optimization in high-dimensional multi-level thresholding (HDMLT) of images than in low-dimensional multi-level thresholding (LDMLT). In order to improve algorithmic accuracy on high-dimensional problems, a grey prediction evolution algorithm with a dominator guidance strategy (GPEdg) is proposed in this paper. GPEdg employs Otsu’s method as its objective function to find the best threshold configuration. The novel operator in the algorithm, a dominator guidance (dg) strategy, uses a linear combination of three difference vectors to guide the top 50% of individuals in the population to learn from the top 20% of them. An efficient balance of search abilities suitable for solving HDMLT problems is expected to be achieved by injecting the local search capability of the dg strategy into GPE’s powerful global search capability. Furthermore, a thresholding morphological profile based method (TMP) leverages the thresholding results generated by GPEdg to train a support vector machine (SVM) for hyperspectral image classification. Numerical experiments are conducted for the newly proposed algorithm and five state-of-the-art algorithms on three image datasets to compare performance on six metrics: peak signal-to-noise ratio, structural similarity index, feature similarity index, objective function value, stability, and time consumption. Overall accuracy and average accuracy are tested on two commonly used hyperspectral image datasets. The results show that GPEdg exhibits outstanding thresholding performance while TMP enhances the classification accuracy of these images. If this paper is accepted, Matlab_codes associated with this paper will be uploaded to https://github.com/Zhongbo-Hu/Prediction-Evolutionary-Algorithm-HOMEPAGE
Volume 174, Article 112947.
Citations: 0
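The GPEdg abstract above uses Otsu's method as the objective function for multi-level thresholding. A minimal sketch of that objective, the between-class variance of a gray-level histogram under a given set of thresholds, is shown below; the function name, the convention that a level equal to a threshold belongs to the lower class, and the toy histogram are assumptions, and the evolutionary search itself is not reproduced.

```python
def otsu_between_class_variance(hist, thresholds):
    """Between-class variance for a histogram split at the given thresholds.

    hist[i] is the pixel count at gray level i. Thresholds t_1 < ... < t_m
    split the levels into m + 1 classes; a level equal to a threshold is
    placed in the lower class (assumed convention). Larger values mean
    better-separated classes, so a search algorithm maximizes this value.
    """
    total = sum(hist)
    probs = [count / total for count in hist]
    mean_total = sum(level * p for level, p in enumerate(probs))

    # Class k covers gray levels bounds[k] .. bounds[k + 1] - 1.
    bounds = [0] + [t + 1 for t in sorted(thresholds)] + [len(hist)]
    variance = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        weight = sum(probs[lo:hi])
        if weight == 0.0:
            continue  # an empty class contributes nothing
        mean = sum(level * probs[level] for level in range(lo, hi)) / weight
        variance += weight * (mean - mean_total) ** 2
    return variance


# Toy bimodal histogram over gray levels 0..5.
histogram = [5, 5, 0, 0, 5, 5]
good_split = otsu_between_class_variance(histogram, [2])
poor_split = otsu_between_class_variance(histogram, [0])
```

With more thresholds the search space grows combinatorially, which is the high-dimensional setting where the abstract argues that meta-heuristics are useful.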
Multiplex network influence maximization based on representation learning method
IF 7.2 | Zone 1 | Computer Science
Applied Soft Computing Pub Date: 2025-03-08 DOI: 10.1016/j.asoc.2025.112956
Hegui Zhang, Dapeng Zhang, Yun Wan, Renbin Pan, Gang Kou
Abstract: Influence maximization based on representation learning has garnered significant attention in recent years, with numerous studies focusing on monolayer networks. However, given the inherent complexity and multiplicity of social networks, addressing the Multiplex network Influence Maximization (MIM) problem is more practical. The MIM problem aims to find a set of seed nodes to maximize the spread of influence throughout the multiplex network. To tackle this issue, this paper introduces a reverse random walk centrality method based on multiplex network representation learning. This method leverages multiplex network representation learning to derive node embeddings across different layers of the network. By calculating similarity weights between nodes within each layer, a reverse random walk is performed to quantify node importance based on the frequency of visits. The top-k nodes with the highest visit counts are then selected as seed nodes. Both single influence propagation and a coupled spread model that integrates competitive and cooperative influence dynamics are considered. Extensive experiments on several real-world datasets demonstrate that the proposed method outperforms existing techniques in terms of effectiveness, providing robust seed node selection for influence maximization. These findings highlight the efficiency and applicability of the proposed method for practical multiplex network scenarios.
Volume 174, Article 112956.
Citations: 0
An adaptation framework with unified embedding reconstruction for cross-corpus speech emotion recognition
IF 7.2 | Zone 1 | Computer Science
Applied Soft Computing Pub Date: 2025-03-07 DOI: 10.1016/j.asoc.2025.112948
Ruiteng Zhang, Jianguo Wei, Xugang Lu, Yongwei Li, Wenhuan Lu, Lin Zhang, Junhai Xu
Abstract: It is challenging for speech emotion recognition (SER) to maintain robustness under cross-domain scenarios. Unsupervised domain adaptation (UDA) algorithms have been explored to address the domain shift in SER without relying on emotion labels in the target domain. As a promising framework in UDA, self-supervised learning (SSL)-based domain exploration (SDE) investigates the domain and structural information within the target domain, aligning domain discrepancies while preserving the model’s emotion discrimination capability. However, SSL often inadvertently introduces emotion-irrelevant information, adversely affecting UDA performance. To resolve this, we introduce a novel UDA framework called unified SDE (U-SDE), where both source and target domains conduct a unified SSL task. In the source domain, U-SDE guides the source SSL to focus on emotion-related information due to supervised emotion classification constraints. Simultaneously, in the target domain, shared network weights enable the target SSL branch to concentrate on intrinsic emotional and domain features. However, simply using existing SSL algorithms to implement this framework might disrupt the training of the supervised SER branch. To overcome this, we propose the embedding reconstruction of masked speech (ERMS) algorithm. In ERMS, the emotion encoder transforms the embedding of the masked speech to match the embedding of its corresponding unmasked speech, thereby capturing the emotion-discriminative features within the sample. Finally, we employ ERMS to realize the proposed U-SDE paradigm, termed unified ERMS (U-ERMS). We conducted systematic cross-domain SER experiments by designing 52 scenarios using seven well-known datasets. Experimental results showed that the proposed U-ERMS achieved state-of-the-art performance in cross-domain SER.
Volume 174, Article 112948.
Citations: 0