Expert Systems with Applications: Latest Articles

Application of Soft Actor-Critic algorithms in optimizing wastewater treatment with time delays integration
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, vol. 277, Article 127180. Pub Date: 2025-03-14. DOI: 10.1016/j.eswa.2025.127180
Esmaeel Mohammadi, Daniel Ortiz-Arroyo, Aviaja Anna Hansen, Mikkel Stokholm-Bjerregaard, Sébastien Gros, Akhil S. Anand, Petar Durdevic
Abstract: Wastewater treatment plants face unique challenges for process control due to their complex dynamics, slow time constants, and stochastic delays in observations and actions. These characteristics make conventional control methods, such as Proportional-Integral-Derivative controllers, suboptimal for achieving efficient phosphorus removal, a critical component of wastewater treatment to ensure environmental sustainability. This study addresses these challenges using a novel deep reinforcement learning approach based on the Soft Actor-Critic algorithm, integrated with a custom simulator designed to model the delayed feedback inherent in wastewater treatment plants. The simulator incorporates Long Short-Term Memory networks for accurate multi-step state predictions, enabling realistic training scenarios. To account for the stochastic nature of delays, agents were trained under three delay scenarios: no delay, constant delay, and random delay. The results demonstrate that incorporating random delays into the reinforcement learning framework significantly improves phosphorus removal efficiency while reducing operational costs. Specifically, the delay-aware agent achieved a 36% reduction in phosphorus emissions, 55% higher reward, 77% lower target deviation from the regulatory limit, and 9% lower total costs than traditional control methods in the simulated environment. These findings underscore the potential of reinforcement learning to overcome the limitations of conventional control strategies in wastewater treatment, providing an adaptive and cost-effective solution for phosphorus removal.
Citations: 0
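The "random delay" training scenario in the Soft Actor-Critic wastewater abstract can be illustrated with a small buffer that hands the agent a stale observation. This is only a sketch under stated assumptions: the class name `DelayedObservations`, the uniform delay distribution, and the `max_delay` parameter are hypothetical and are not taken from the paper, whose simulator uses LSTM-based state prediction that is not reproduced here.

```python
import random
from collections import deque


class DelayedObservations:
    """Return observations that lag the true state by a random number of steps.

    Hypothetical helper: the delay is drawn uniformly from 0..max_delay,
    which is an assumption, not the distribution used in the paper.
    """

    def __init__(self, max_delay, seed=0):
        self.max_delay = max_delay
        self.rng = random.Random(seed)
        # Keep the newest reading plus up to max_delay older ones.
        self.history = deque(maxlen=max_delay + 1)

    def observe(self, true_state):
        self.history.append(true_state)
        # Cap the delay by how much history has been collected so far.
        delay = self.rng.randint(0, min(self.max_delay, len(self.history) - 1))
        return self.history[-1 - delay], delay


sensor = DelayedObservations(max_delay=3)
for step in range(6):
    reading, delay = sensor.observe(true_state=step)
    print(f"step={step} observed={reading} delay={delay}")
```

Setting `max_delay=0` reproduces the no-delay scenario, and replacing the random draw with a fixed value would give the constant-delay scenario.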
A community-aware graph neural network applied to geographical location-based representation learning and clustering within GIS
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, vol. 277, Article 127252. Pub Date: 2025-03-14. DOI: 10.1016/j.eswa.2025.127252
Phu Pham, Loan T.T. Nguyen, Hoai Thuong Sarah, Anh Nguyen, Trang T.D. Nguyen, Bay Vo
Abstract: Geographical location-based clustering has long been an active research topic in both the geographic information systems (GIS) and machine learning (ML)/deep learning (DL) domains because of its applications to various real-world problems. Spatial density-based clustering methods are generally used to detect clusters in geographical data. Traditional spatial clustering methods encounter considerable limitations, especially when addressing sparse coordinate data and substantial fluctuations in geographic cluster density. To overcome these limitations, this paper introduces a novel community-aware graph learning model for geographical point embedding, named CGL4PE. By extensively analyzing nearby graph-modelled geospatial locations, the proposed CGL4PE model can efficiently capture the complex, multi-view structural latent representations of geographical locations. To do this, the model integrates a graph attention network (GAT) with a graph isomorphism network (GIN) in a multi-view graph representation learning approach. Combined with a location-based graph construction strategy, CGL4PE can directly improve the clustering performance of conventional spatial and density-based clustering methods. Unlike previous location-based clustering techniques, CGL4PE is designed to better preserve the community-aware structural representations of the constructed location-based graph. These community-aware structural representations are then used to extract more meaningful cluster information from geographical datasets under the integrated graph representation learning and spatial density-based clustering approaches. Extensive empirical studies on real-world geographical datasets demonstrate the effectiveness of the proposed CGL4PE model.
Citations: 0
CLG: Automated checklist generation for improved pull request quality
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, vol. 277, Article 127178. Pub Date: 2025-03-14. DOI: 10.1016/j.eswa.2025.127178
Shuotong Bai, Chenkun Meng, Guodong Li, Huaxiao Liu, Lei Liu
Abstract: The pull-based development model, widely embraced in open-source software (OSS), enables closer global collaboration. However, the increasing number of contributors introduces challenges, particularly in pull request (PR) quality. Mature repositories address this by using checklists in Pull Request Templates (PRTs). Despite their benefits, a mere 14.15% of popular GitHub repositories implement such checklists. To address this gap, we propose CLG, a Check-List Generation approach that combines a multi-label classifier with summary generation to automatically generate checklists from contributing guidelines. Evaluation results demonstrate CLG's superiority in each sub-task. In categorizing paragraphs of contributing guidelines, CLG improves Accuracy, Precision, Recall, and F1-score by 9.81%, 8.44%, 11.84%, and 10.17%, respectively. In the task of note generation, CLG improves the ROUGE metrics by 16.33%, 15.80%, and 6.43%. To assess the generated checklists in practice, we submitted our results to the open-source community. The results show that our approach can help the majority of repository maintainers and the open-source community in submitting and managing PRs.
Citations: 0
Research on speech synthesis technology based on Tibetan rhythmic features
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, vol. 277, Article 127181. Pub Date: 2025-03-14. DOI: 10.1016/j.eswa.2025.127181
Kuntharrgyal Khysru, Yangzom, Wenjie Tang, Jianguo Wei
Abstract: Text-to-speech (TTS) synthesis is one of the core technologies in the field of human-computer interaction. Starting from the theory of Tibetan grammar and the phonetic characteristics of the Tibetan language, this article designs an automatic rhythm boundary annotation method for Tibetan text and acoustic features. A rhythm prediction model predicts the rhythm structure hierarchy, Tibetan rhythm is modeled in the acoustic model, and a Tibetan synthesis model based on an improved Tacotron2 is then constructed to obtain the final synthetic Tibetan speech. Experimental results show that the deep model for Tibetan speech synthesis, which uses Tibetan rhythm features, can further improve the naturalness and comprehensibility of the synthesized speech.
Citations: 0
ODE-based generative modeling: Learning from a single natural image
IF 7.5 1区 计算机科学
Expert Systems with Applications Pub Date : 2025-03-14 DOI: 10.1016/j.eswa.2025.127185
Jian Yue , Yan Gan , Lihua Zhou , Yu Zhao , Shuaifeng Li , Mao Ye
{"title":"ODE-based generative modeling: Learning from a single natural image","authors":"Jian Yue ,&nbsp;Yan Gan ,&nbsp;Lihua Zhou ,&nbsp;Yu Zhao ,&nbsp;Shuaifeng Li ,&nbsp;Mao Ye","doi":"10.1016/j.eswa.2025.127185","DOIUrl":"10.1016/j.eswa.2025.127185","url":null,"abstract":"<div><div>Single image generation aims to learn the internal statistical distribution from a single natural image to generate diverse samples of arbitrary scales, serving as a tool for image manipulation tasks. Existing methods adopt the same pyramid structure for both training and multi-stage sampling to ensure the stability of the generation model. However, these methods result in a large number of sampling time steps and extra noise at each level of the pyramid to sample a single image. In this work, we propose a <strong>Sin</strong>gle image generative model based on Ordinary Differential Equation (<strong>ODE</strong>), dubbed as <strong>SinODE</strong>. Instead of relying on a repetitive multi-stage sampling process, SinODE reformulates single image sampling as a unified integration framework, reducing sampling times while eliminating unnecessary noise injection. To that end, we build straight paths connecting Gaussian noise to scaled images and generating samples with a multiple piece-wise integration mechanism. Furthermore, our method can employ external text to control the direction of generation, producing personalized new content or style without requiring model fine-tuning. SinODE can also be effortlessly applied to other image manipulation tasks, such as image style transfer and harmonization. 
Extensive experiments demonstrate that SinODE surpasses current state-of-the-art methods, producing high-quality samples with exceptional diversity.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"276 ","pages":"Article 127185"},"PeriodicalIF":7.5,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143636349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
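The SinODE abstract describes straight paths from Gaussian noise to images and sampling by integration. A toy version of that idea, assuming the common linear interpolation x_t = (1 - t) * noise + t * image (an assumption; the abstract does not give the formula), is sketched below. The three-value "image", the function names, and the use of the exact velocity in place of a trained network are all hypothetical simplifications, and the paper's multi-scale piece-wise integration is not reproduced.

```python
import random


def euler_sample(noise, velocity_fn, steps=4):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(noise)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy "image": three pixel intensities (hypothetical data).
target = [0.2, 0.5, 0.9]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in target]


def ideal_velocity(x, t):
    # On the straight path x_t = (1 - t) * noise + t * target the velocity
    # is constant and equal to target - noise. A trained network would
    # approximate this; the exact value is used here for illustration.
    return [ti - ni for ti, ni in zip(target, noise)]


sample = euler_sample(noise, ideal_velocity, steps=4)
print([round(value, 6) for value in sample])
```

Because the velocity along a straight path is constant, even a handful of Euler steps lands on the target, which is the intuition behind needing fewer sampling steps than a multi-stage pyramid.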
HiViT: Hierarchical attention-based Transformer for multi-scale whole slide histopathological image classification
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, vol. 277, Article 127164. Pub Date: 2025-03-14. DOI: 10.1016/j.eswa.2025.127164
Jinze Yu, Shuo Li, Luxin Tan, Haoyi Zhou, Zhongwu Li, Jianxin Li
Abstract: The classification of Whole Slide Images (WSIs) remains a challenging task in computer-aided diagnostics. Though Multi-Instance Learning (MIL) and patch-wise modeling have become the mainstream methods in current research, they rely on a strong assumption that different patches are independent and identically distributed. As a result, the contextual correlations within the images and across all the patches are neglected, resulting in inferior performance, particularly in global-level tasks such as mutation prediction. In this paper, we propose HiViT, a multi-scale WSI classification Transformer model that uses the proposed Cross-scale Hierarchical Self-Attention (CHSA) mechanism to aggregate patch features across scales and capture context information from the entire image at a reasonable memory cost. The CHSA mechanism builds on the multiple magnification scans of WSIs and restricts cross-scale self-attention computation to locally constrained windows, which avoids the infeasible memory cost of vanilla Transformers and achieves the first fully trainable Transformer model for the WSI classification problem. Additionally, the Multi-Scale Feature Augmentation method is proposed to enhance permutation invariance on a large scale, which is important under the MIL assumption, while preserving local fine-grained contextual correlations. We extensively evaluated HiViT on several real-world datasets covering diverse tasks, and our model outperformed the current state-of-the-art Transformer and MIL-based methods. Code will be available at https://github.com/BUAA-SMART-Med-CV/HiViT.
Citations: 0
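The memory argument in the HiViT abstract (restricting self-attention to local windows rather than all patch pairs) can be made concrete with a simple count. The sketch below assumes a flat 1-D sequence of patches split into non-overlapping windows; the function `attention_pairs` and the window size are hypothetical, and the cross-scale part of CHSA is not modeled because the abstract does not specify it.

```python
def attention_pairs(num_patches, window=None):
    """Count query-key pairs scored by self-attention over a patch sequence.

    window=None means full (vanilla) attention; otherwise patches attend
    only within non-overlapping windows of the given size.
    """
    if window is None:
        return num_patches * num_patches
    pairs = 0
    for start in range(0, num_patches, window):
        size = min(window, num_patches - start)  # last window may be short
        pairs += size * size
    return pairs


print(attention_pairs(10_000))              # 100000000
print(attention_pairs(10_000, window=100))  # 1000000
```

In this toy setting the windowed variant scores 100 times fewer pairs, and the count grows linearly rather than quadratically with the number of patches.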
Sentiment analysis using osprey-optimized hybrid GoogleNet-ResNet with text and emoji data
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, vol. 277, Article 127165. Pub Date: 2025-03-14. DOI: 10.1016/j.eswa.2025.127165
Padigapati Anitha, A.V. Praveen Krishna
Abstract: Sentiment analysis (SA) plays a crucial role in understanding consumer opinions from mobile product reviews. While various deep learning (DL) approaches, such as deep neural networks (DNN) and convolutional neural networks (CNN), have been developed for SA, they often suffer from limitations such as low accuracy, high processing time, and reliance on a single data modality (text). To address these issues, this research introduces a novel hybrid DL model that integrates both text and emoji data for enhanced sentiment classification. Unlike existing models that rely primarily on text-based features, the proposed approach uses both text and emoji representations to capture richer sentiment context. Initially, data pre-processing and sentiment annotation are performed. Text data is transformed into numerical feature vectors using advanced feature extraction techniques, including Word2Vec and Emoji2Vec models, as well as Continuous Bag of Words (CBOW) and Skip-gram (SG). To improve classification accuracy, a depthwise separable hybrid GoogleNet with ResNet convolutional model is employed, leveraging the strengths of both architectures in feature extraction and representation learning. Furthermore, the classifier's hyperparameters are fine-tuned using the osprey optimization mechanism, an advanced metaheuristic approach that enhances convergence speed and model performance. The proposed model is evaluated on the Flipkart product reviews sentiment dataset, and its efficacy is demonstrated through performance analysis. The findings confirm that incorporating multi-modal data (text and emoji) and leveraging optimized hybrid architectures significantly enhances sentiment analysis accuracy. In this analysis, the proposed model outperforms other classifiers. The experimental results show that the proposed model achieves accuracy, precision, recall, and F1-score of 98.3%, 96%, 97.6%, and 96.82%, respectively.
Citations: 0
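The sentiment analysis abstract names Continuous Bag of Words (CBOW) and Skip-gram as feature extraction techniques. The difference between the two lies in how training examples are formed from a token sequence, which the following sketch illustrates. The helper names, the window size of 1, and the three-token review (with the emoji written as the placeholder token ":+1:") are hypothetical; the paper's actual preprocessing and embedding training are not reproduced.

```python
def context_window(tokens, i, window):
    """Tokens within `window` positions of index i, excluding tokens[i]."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_examples(tokens, window=1):
    # CBOW: predict the centre token from its surrounding context tokens.
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]


def skipgram_examples(tokens, window=1):
    # Skip-gram: predict each context token from the centre token.
    return [(tokens[i], context)
            for i in range(len(tokens))
            for context in context_window(tokens, i, window)]


review = ["great", "phone", ":+1:"]  # emoji kept as an ordinary token
print(cbow_examples(review))
print(skipgram_examples(review))
```

Treating the emoji as one more token in the sequence is one simple way to let text and emoji share a context window; whether the paper does this or embeds emoji separately via Emoji2Vec is not stated in the abstract.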
Prediction of ailments using federated transfer learning and weight penalty-rational Tanh-RNN
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, vol. 276, Article 127253. Pub Date: 2025-03-14. DOI: 10.1016/j.eswa.2025.127253
C.K. Shahnazeer, G. Sureshkumar
Abstract: The widespread adoption of technology has simplified the process of predicting ailments. Nevertheless, current approaches do not predict disease for critical multi-organ systems; they are limited to one or two organs. Therefore, a Federated Transfer Learning (FTL)-based disease prediction methodology using WP-RT-RNN is developed. First, the heart, lung, liver, and kidney image data are pre-processed, and the Gaussian Mixture Model Hole Filling (GMMHF) technique is then used for background subtraction. The image with the background removed is passed to the Hurst operator, which extracts features. In parallel, the EMR data are pre-processed and passed to the feature learning module based on Marginal Fisher Analysis-Convolution (MFA-CN), which effectively represents local and context features. After that, the features gathered from the EMR data and images are combined and sent to the FSSTS-BCMO to choose the most important features. The Weight Penalty-Rational Tanh-Recurrent Neural Network (WP-RT-RNN) is then trained on the chosen features to predict normal and abnormal cases effectively. Finally, a performance comparison is carried out to confirm the proposed system's efficacy.
Citations: 0
FewRelEx: Exploring multimodal few-shot relation extraction with enhanced visual–textual mapping
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, vol. 277, Article 127104. Pub Date: 2025-03-14. DOI: 10.1016/j.eswa.2025.127104
Keke Tian, Zhong Peng
Abstract: Few-shot relation extraction is a fundamental process for building knowledge graphs. However, existing methods often suffer a notable degradation in performance when dealing with concise and noisy textual data, mainly because these texts contain little contextual information. To overcome this limitation, we have designed a new multimodal few-shot relation extraction method that uses visual information to assist in identifying entity relationships within the text. The FewRelEx model consists of two key components: a semantic feature extraction module and a graph structure alignment module. The semantic feature extraction module extracts semantic information from both text and images, including the global features of the image and the local features of the objects within the image. The graph structure alignment module maps the visual relationships between local objects to the textual relationships between entities in the sentence. We conducted in-depth experiments on two public datasets, and the results show that by introducing visual information, the FewRelEx model significantly improves the accuracy of relationship prediction in few-shot scenarios, effectively compensating for the deficiencies in textual information.
Citations: 0
Blended-emotional speech for Speaker Recognition by using the fusion of Mel-CQT spectrograms feature extraction
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, vol. 276, Article 127184. Pub Date: 2025-03-14. DOI: 10.1016/j.eswa.2025.127184
Shalini Tomar, Shashidhar G. Koolagudi
Abstract: Emotions are integral to human speech, adding depth and influencing the effectiveness of interactions. Single-emotion speech is speech in which the emotional state stays the same throughout the utterance. Blended emotion, by contrast, involves a mix of emotions, such as happiness tinged with sadness or a shift from neutral to sadness within the same utterance. In real-life scenarios, people often experience and express mixed emotions. Most existing work on Speaker Recognition (SR), which identifies a person from their voice, has focused on either neutral speech or a few primary emotions. This study aims to develop Blended-Emotional Speaker Recognition (BESR). The proposed work looks for emotional information in speech signals by simulating a blended emotional speech dataset for Speaker Recognition. A fusion of Mel-Spectrograms and Constant-Q Transform Spectrograms (Mel-CQT Spectrograms) has been developed to extract features. Three datasets are considered: the National Institute of Technology Karnataka Kannada Language Emotional Speech Corpus (NITK-KLESC), the Crowd-sourced emotional multimodal actors dataset (CREMA-D), and the Indian Institute of Technology Kharagpur Simulated Emotion Hindi Speech Corpus (IITKGP-SEHSC). The experimental outcomes demonstrate that the BESR system using blended emotional speech improves the fairness of Speaker Recognition.
Citations: 0