Journal on Audio Speech and Music Processing: Latest Articles

Microphone utility estimation in acoustic sensor networks using single-channel signal features
IF 2.4, Zone 3, Computer Science
Journal on Audio Speech and Music Processing Pub Date : 2022-01-24 DOI: 10.1186/s13636-023-00294-7
M. Gunther, Andreas Brendel, Walter Kellermann
Citations: 1
Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling
IF 2.4, Zone 3, Computer Science
Journal on Audio Speech and Music Processing Pub Date : 2022-01-12 DOI: 10.1186/s13636-021-00233-4
Siqing Qin, Longbiao Wang, Sheng Li, J. Dang, Lixin Pan
Citations: 6
Auxiliary function-based algorithm for blind extraction of a moving speaker
IF 2.4, Zone 3, Computer Science
Journal on Audio Speech and Music Processing Pub Date : 2022-01-04 DOI: 10.1186/s13636-021-00231-6
Jakub Janský, Zbyněk Koldovský, J. Málek, Tomás Kounovský, Jaroslav Cmejla
Citations: 16
On the selection of the number of beamformers in beamforming-based binaural reproduction.
IF 2.4, Zone 3, Computer Science
Journal on Audio Speech and Music Processing Pub Date : 2022-01-01 Epub Date: 2022-03-30 DOI: 10.1186/s13636-022-00238-7
Itay Ifergan, Boaz Rafaely
Abstract: In recent years, spatial audio reproduction has been widely researched with many studies focusing on headphone-based spatial reproduction. A popular format for spatial audio is higher order Ambisonics (HOA), where a spherical microphone array is typically used to obtain the HOA signals. When a spherical array is not available, beamforming-based binaural reproduction (BFBR) can be used, where signals are captured with arrays of a general configuration. While shown to be useful, no comprehensive studies of BFBR have been presented and so its limitations and other design aspects are not well understood. This paper takes an initial step towards developing a theory for BFBR and develops guidelines for selecting the number of beamformers. In particular, the average directivity factor of the microphone array is proposed as a measure for supporting this selection. The effect of head-related transfer function (HRTF) order truncation that occurs when using too many beamformer directions is presented and studied. In addition, the relation between HOA-based binaural reproduction and BFBR is discussed through analysis based on a spherical array. A simulation study is then presented, based on both a spherical and a planar array, demonstrating the proposed guidelines. A listening test verifies the perceptual attributes of the methods presented in this study. These results can be used for more informed beamformer design for BFBR.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8965231/pdf/
Citations: 0
Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation
IF 2.4, Zone 3, Computer Science
Journal on Audio Speech and Music Processing Pub Date : 2021-12-01 DOI: 10.1186/s13636-021-00225-4
Zolzaya Byambadorj, Ryota Nishimura, Altangerel Ayush, Kengo Ohta, N. Kitaoka
Citations: 1
Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit
IF 2.4, Zone 3, Computer Science
Journal on Audio Speech and Music Processing Pub Date : 2021-12-01 DOI: 10.1186/s13636-021-00234-3
Jiacheng Yao, J. Zhang, Jiafeng Li, L. Zhuo
Citations: 0
Spherical harmonic covariance and magnitude function encodings for beamformer design
IF 2.4, Zone 3, Computer Science
Journal on Audio Speech and Music Processing Pub Date : 2021-12-01 DOI: 10.1186/s13636-021-00230-7
Yuancheng Luo
Citations: 0
U2-VC: one-shot voice conversion using two-level nested U-structure
IF 2.4, Zone 3, Computer Science
Journal on Audio Speech and Music Processing Pub Date : 2021-11-24 DOI: 10.1186/s13636-021-00226-3
Fangkun Liu, Hui Wang, Renhua Peng, C. Zheng, Xiaodong Li
Citations: 1
dEchorate: a calibrated room impulse response dataset for echo-aware signal processing
IF 2.4, Zone 3, Computer Science
Journal on Audio Speech and Music Processing Pub Date : 2021-11-23 DOI: 10.1186/s13636-021-00229-0
D. Carlo, Pinchas Tandeitnik, C. Foy, N. Bertin, Antoine Deleforge, S. Gannot
Citations: 17
Robust single- and multi-loudspeaker least-squares-based equalization for hearing devices
IF 2.4, Zone 3, Computer Science
Journal on Audio Speech and Music Processing Pub Date : 2021-09-09 DOI: 10.1186/s13636-022-00247-6
H. Schepker, Florian Denk, B. Kollmeier, S. Doclo
Citations: 3