{"title":"Cross-lingual speaker transfer for Cambodian based on feature disentangler and time-frequency attention adaptive normalization","authors":"Yuanzhang Yang, Linqin Wang, Shengxiang Gao, Zhengtao Yu, Ling Dong","doi":"10.1108/ijwis-09-2023-0162","DOIUrl":null,"url":null,"abstract":"\nPurpose\nThis paper aims to disentangle Chinese-English-rich resources linguistic and speaker timbre features, achieving cross-lingual speaker transfer for Cambodian.\n\n\nDesign/methodology/approach\nThis study introduces a novel approach: the construction of a cross-lingual feature disentangler coupled with the integration of time-frequency attention adaptive normalization to proficiently convert Cambodian speaker timbre into Chinese-English without altering the underlying Cambodian speech content.\n\n\nFindings\nConsidering the limited availability of multi-speaker corpora in Cambodia, conventional methods have demonstrated subpar performance in Cambodian speaker voice transfer.\n\n\nOriginality/value\nThe originality of this study lies in the effectiveness of the disentanglement process and precise control over speaker timbre feature transfer.\n","PeriodicalId":44153,"journal":{"name":"International Journal of Web Information Systems","volume":null,"pages":null},"PeriodicalIF":2.5000,"publicationDate":"2024-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Web Information Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1108/ijwis-09-2023-0162","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Purpose
This paper aims to disentangle Chinese-English-rich resources linguistic and speaker timbre features, achieving cross-lingual speaker transfer for Cambodian.
Design/methodology/approach
This study introduces a novel approach: the construction of a cross-lingual feature disentangler coupled with the integration of time-frequency attention adaptive normalization to proficiently convert Cambodian speaker timbre into Chinese-English without altering the underlying Cambodian speech content.
Findings
Considering the limited availability of multi-speaker corpora in Cambodia, conventional methods have demonstrated subpar performance in Cambodian speaker voice transfer.
Originality/value
The originality of this study lies in the effectiveness of the disentanglement process and precise control over speaker timbre feature transfer.
期刊介绍:
The Global Information Infrastructure is a daily reality. In spite of the many applications in all domains of our societies: e-business, e-commerce, e-learning, e-science, and e-government, for instance, and in spite of the tremendous advances by engineers and scientists, the seamless development of Web information systems and services remains a major challenge. The journal examines how current shared vision for the future is one of semantically-rich information and service oriented architecture for global information systems. This vision is at the convergence of progress in technologies such as XML, Web services, RDF, OWL, of multimedia, multimodal, and multilingual information retrieval, and of distributed, mobile and ubiquitous computing. Topicality While the International Journal of Web Information Systems covers a broad range of topics, the journal welcomes papers that provide a perspective on all aspects of Web information systems: Web semantics and Web dynamics, Web mining and searching, Web databases and Web data integration, Web-based commerce and e-business, Web collaboration and distributed computing, Internet computing and networks, performance of Web applications, and Web multimedia services and Web-based education.