语音信号中音调估计的字典学习

F. Huang, P. Balázs
{"title":"语音信号中音调估计的字典学习","authors":"F. Huang, P. Balázs","doi":"10.1109/MLSP.2017.8168173","DOIUrl":null,"url":null,"abstract":"This paper presents an automatic approach for parameter training for a sparsity-based pitch estimation method that has been previously published. For this pitch estimation method, the harmonic dictionary is a key parameter that needs to be carefully prepared beforehand. In the original method, extensive human supervision and involvement are required to construct and label the dictionary. In this study, we propose to employ dictionary learning algorithms to learn the dictionary directly from training data. We apply and compare 3 typical dictionary learning algorithms, i.e., the method of optimized directions (MOD), K-SVD and online dictionary learning (ODL), and propose a post-processing method to label and adapt a learned dictionary for pitch estimation. Results show that MOD and properly initialized ODL (pi-ODL) can lead to dictionaries that exhibit the desired harmonic structures for pitch estimation, and the post-processing method can significantly improve performance of the learned dictionaries in pitch estimation. The dictionary obtained with pi-ODL and post-processing attained pitch estimation accuracy close to the optimal performance of the manual dictionary. It is positively shown that dictionary learning is feasible and promising for this application.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"49 1","pages":"1-6"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Dictionary learning for pitch estimation in speech signals\",\"authors\":\"F. Huang, P. Balázs\",\"doi\":\"10.1109/MLSP.2017.8168173\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents an automatic approach for parameter training for a sparsity-based pitch estimation method that has been previously published. For this pitch estimation method, the harmonic dictionary is a key parameter that needs to be carefully prepared beforehand. In the original method, extensive human supervision and involvement are required to construct and label the dictionary. In this study, we propose to employ dictionary learning algorithms to learn the dictionary directly from training data. We apply and compare 3 typical dictionary learning algorithms, i.e., the method of optimized directions (MOD), K-SVD and online dictionary learning (ODL), and propose a post-processing method to label and adapt a learned dictionary for pitch estimation. Results show that MOD and properly initialized ODL (pi-ODL) can lead to dictionaries that exhibit the desired harmonic structures for pitch estimation, and the post-processing method can significantly improve performance of the learned dictionaries in pitch estimation. The dictionary obtained with pi-ODL and post-processing attained pitch estimation accuracy close to the optimal performance of the manual dictionary. It is positively shown that dictionary learning is feasible and promising for this application.\",\"PeriodicalId\":6542,\"journal\":{\"name\":\"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)\",\"volume\":\"49 1\",\"pages\":\"1-6\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MLSP.2017.8168173\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MLSP.2017.8168173","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

本文提出了一种基于稀疏性的基音估计方法的参数自动训练方法。对于这种基音估计方法,谐波字典是一个需要事先精心准备的关键参数。在最初的方法中,需要大量的人工监督和参与来构建和标记词典。在本研究中,我们提出使用字典学习算法直接从训练数据中学习字典。我们应用并比较了优化方向法(MOD)、K-SVD和在线字典学习(ODL) 3种典型的字典学习算法,并提出了一种后处理方法来标记和调整学习到的字典用于音高估计。结果表明,MOD和适当初始化的ODL (pi-ODL)可以得到具有所需谐波结构的字典,并且后处理方法可以显著提高学习到的字典在基音估计中的性能。使用pi-ODL和后处理获得的字典获得的基音估计精度接近手动字典的最佳性能。结果表明,字典学习在这一应用中是可行的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Dictionary learning for pitch estimation in speech signals
This paper presents an automatic approach for parameter training for a sparsity-based pitch estimation method that has been previously published. For this pitch estimation method, the harmonic dictionary is a key parameter that needs to be carefully prepared beforehand. In the original method, extensive human supervision and involvement are required to construct and label the dictionary. In this study, we propose to employ dictionary learning algorithms to learn the dictionary directly from training data. We apply and compare 3 typical dictionary learning algorithms, i.e., the method of optimized directions (MOD), K-SVD and online dictionary learning (ODL), and propose a post-processing method to label and adapt a learned dictionary for pitch estimation. Results show that MOD and properly initialized ODL (pi-ODL) can lead to dictionaries that exhibit the desired harmonic structures for pitch estimation, and the post-processing method can significantly improve performance of the learned dictionaries in pitch estimation. The dictionary obtained with pi-ODL and post-processing attained pitch estimation accuracy close to the optimal performance of the manual dictionary. It is positively shown that dictionary learning is feasible and promising for this application.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信