IEEE/ACM Transactions on Audio, Speech, and Language Processing最新文献

筛选
英文 中文
Enhancing Robustness of Speech Watermarking Using a Transformer-Based Framework Exploiting Acoustic Features 利用基于变压器的声学特征框架增强语音水印的鲁棒性
IF 4.1 2区 计算机科学
IEEE/ACM Transactions on Audio, Speech, and Language Processing Pub Date : 2024-11-08 DOI: 10.1109/TASLP.2024.3486206
Chuxuan Tong;Iynkaran Natgunanathan;Yong Xiang;Jianhua Li;Tianrui Zong;Xi Zheng;Longxiang Gao
{"title":"Enhancing Robustness of Speech Watermarking Using a Transformer-Based Framework Exploiting Acoustic Features","authors":"Chuxuan Tong;Iynkaran Natgunanathan;Yong Xiang;Jianhua Li;Tianrui Zong;Xi Zheng;Longxiang Gao","doi":"10.1109/TASLP.2024.3486206","DOIUrl":"https://doi.org/10.1109/TASLP.2024.3486206","url":null,"abstract":"Digital watermarking serves as an effective approach for safeguarding speech signal copyrights, achieved by the incorporation of ownership information into the original signal and its subsequent extraction from the watermarked signal. While traditional watermarking methods can embed and extract watermarks successfully when the watermarked signals are not exposed to severe alterations, these methods cannot withstand attacks such as de-synchronization. In this work, we introduce a novel transformer-based framework designed to enhance the imperceptibility and robustness of speech watermarking. This framework incorporates encoders and decoders built on multi-scale transformer blocks to effectively capture local and long-range features from inputs, such as acoustic features extracted by Short-Time Fourier Transformation (STFT). Further, a deep neural networks (DNNs) based generator, notably the Transformer architecture, is employed to adaptively embed imperceptible watermarks. These perturbations serve as a step for simulating noise, thereby bolstering the watermark robustness during the training phase. Experimental results show the superiority of our proposed framework in terms of watermark imperceptibility and robustness against various watermark attacks. When compared to the currently available related techniques, the framework exhibits an eightfold increase in embedding rate. Further, it also presents superior practicality with scalability and reduced inference time of DNN models.","PeriodicalId":13332,"journal":{"name":"IEEE/ACM Transactions on Audio, Speech, and Language Processing","volume":"32 ","pages":"4822-4837"},"PeriodicalIF":4.1,"publicationDate":"2024-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142645535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
MRC-PASCL: A Few-Shot Machine Reading Comprehension Approach via Post-Training and Answer Span-Oriented Contrastive Learning MRC-PASCL:通过后训练和以答案跨度为导向的对比学习实现快速机器阅读理解的方法
IF 4.1 2区 计算机科学
IEEE/ACM Transactions on Audio, Speech, and Language Processing Pub Date : 2024-10-31 DOI: 10.1109/TASLP.2024.3490373
Ren Li;Qiao Xiao;Jianxi Yang;Luyi Zhang;Yu Chen
{"title":"MRC-PASCL: A Few-Shot Machine Reading Comprehension Approach via Post-Training and Answer Span-Oriented Contrastive Learning","authors":"Ren Li;Qiao Xiao;Jianxi Yang;Luyi Zhang;Yu Chen","doi":"10.1109/TASLP.2024.3490373","DOIUrl":"https://doi.org/10.1109/TASLP.2024.3490373","url":null,"abstract":"The rapid development of pre-trained language models (PLMs) has significantly enhanced the performance of machine reading comprehension (MRC). Nevertheless, the traditional fine-tuning approaches necessitate extensive labeled data. MRC remains a challenging task in the few-shot settings or low-resource scenarios. This study proposes a novel few-shot MRC approach via post-training and answer span-oriented contrastive learning, termed MRC-PASCL. Specifically, in the post-training module, a novel noun-entity-aware data selection and generation strategy is proposed according to characteristics of MRC task and data, focusing on masking nouns and named entities in the context. In terms of fine-tuning, the proposed answer span-oriented contrastive learning manner selects spans around the golden answers as negative examples, and performs multi-task learning together with the standard MRC answer prediction task. Experimental results show that MRC-PASCL outperforms the PLMs-based baseline models and the 7B and 13B large language models (LLMs) cross most MRQA 2019 datasets. Further analyses show that our approach achieves better inference efficiency with lower computational resource requirement. The analysis results also indicate that the proposed method can better adapt to the domain-specific scenarios.","PeriodicalId":13332,"journal":{"name":"IEEE/ACM Transactions on Audio, Speech, and Language Processing","volume":"32 ","pages":"4838-4849"},"PeriodicalIF":4.1,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142645505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信