Exploring Hybrid Techniques for Enhanced Pitch Estimation in Speech Processing

JCR: Q3 (Engineering)
S. K. B. Sangeetha, K. Chandran, S. Mathivanan, Hariharan Rajadurai, Basu Dev Shivahare
DOI: 10.2174/0118722121312618240612093010 (https://doi.org/10.2174/0118722121312618240612093010)
Journal: Recent Patents on Engineering
Published: 2024-07-02
Citations: 0

Abstract

1. To develop a hybrid approach that combines the Pitch Estimation Filter (PEF) and Cepstrum Pitch Determination (CEP, also abbreviated CPD) methods for pitch detection in audio signals.
2. To compare the proposed hybrid approach with existing pitch detection methods, namely Normalized Correlation Function (NCF), Pitch Estimation Filter (PEF), Log-Harmonic Summation (LHS), Summation of Residual Harmonics (SRH), and Cepstrum Pitch Determination (CEP), in order to assess its performance and accuracy.
3. To evaluate the hybrid approach in real-world applications such as speech recognition and music transcription, using performance metrics including Gross Pitch Error (GPE) and classification accuracy obtained with a K-Nearest Neighbors (KNN) classifier.

The study discussed the difficulties of assessing pitch detection algorithms in real-world applications, particularly audio synthesis and music production. Through user studies and surveys with audio engineers and professional musicians, the authors identified the performance metrics and criteria relevant to pitch tracking in interactive music applications. The results showed that algorithm development and evaluation need user-centered design: user preferences and practical requirements must be taken into account when judging how effective a pitch detection algorithm is.

Proposed method: PEF+CEP.

A comparison of the pitch detection techniques showed how each performed on accuracy, specificity, sensitivity, and gross pitch error (GPE). The conventional methods (NCF, PEF, LHS, SRH, and CEP) perform well on specificity and accuracy but are less effective on sensitivity and GPE. The proposed PEF+CEP hybrid reaches an accuracy of 98.8% and a sensitivity of 99.2%. Its GPE is slightly higher than that of some traditional methods, but the authors consider this minor deviation outweighed by the gains in accuracy and sensitivity. The PEF+CEP method also balances computational efficiency, training time, model size, and convergence rate, which makes it a promising option for reliable pitch detection in speech processing applications. By combining the strengths of PEF and CEP, it addresses the drawbacks of each method used alone and improves pitch detection accuracy and reliability across a range of real-world settings. With further research and development, pitch detection algorithms could become more sophisticated and effective, enabling improvements in text-to-speech synthesis, speaker identification, and speech recognition, among other fields.
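The abstract names cepstrum-based pitch determination and Gross Pitch Error but gives no formulas, and it does not say how PEF and CEP are fused, so the following is only a minimal Python sketch of the two pieces that can be stated generically. It assumes the textbook real-cepstrum peak-picking estimator and the common convention that a frame counts as a gross error when the estimate deviates from the reference by more than 20%. The function names, the 50 to 400 Hz search range, the 20% tolerance, and the synthetic test tone are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def cepstral_pitch(frame, fs, fmin=50.0, fmax=400.0):
    """Estimate F0 in Hz for one frame from the peak of its real cepstrum.

    fmin and fmax bound the search range; both are assumed values.
    """
    windowed = frame * np.hamming(len(frame))
    # Real cepstrum: inverse FFT of the log magnitude spectrum.
    log_mag = np.log(np.abs(np.fft.rfft(windowed)) + 1e-12)
    cepstrum = np.fft.irfft(log_mag)
    # Convert the F0 search range into a quefrency (lag) range in samples.
    q_lo = int(fs / fmax)
    q_hi = int(fs / fmin)
    peak = q_lo + int(np.argmax(cepstrum[q_lo:q_hi + 1]))
    return fs / peak


def gross_pitch_error(f0_est, f0_ref, tol=0.2):
    """Fraction of voiced reference frames whose estimate is off by more than tol.

    A reference value of 0 marks an unvoiced frame; tol=0.2 is an assumed
    convention, since the abstract does not state the threshold used.
    """
    f0_est = np.asarray(f0_est, dtype=float)
    f0_ref = np.asarray(f0_ref, dtype=float)
    voiced = f0_ref > 0
    if not voiced.any():
        return 0.0
    rel_err = np.abs(f0_est[voiced] - f0_ref[voiced]) / f0_ref[voiced]
    return float(np.mean(rel_err > tol))


# Synthetic harmonic tone at 200 Hz (hypothetical test signal).
fs = 16000
n = np.arange(1024)
f0_true = 200.0
frame = sum(np.sin(2 * np.pi * k * f0_true * n / fs) / k for k in range(1, 21))
f0_hat = cepstral_pitch(frame, fs)

# One of the three voiced frames below is off by 50%.
gpe = gross_pitch_error([100, 150, 0, 200], [100, 100, 0, 210])
print(f0_hat, gpe)
```

A hybrid in the spirit of the abstract would combine an estimate like this with a PEF estimate per frame, but the combination rule would have to come from the full paper.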
Source journal
Recent Patents on Engineering (Engineering: Engineering (all))
CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 100
Journal description: Recent Patents on Engineering publishes review articles by experts on recent patents in the major fields of engineering. A selection of important and recent patents on engineering is also included in the journal. The journal is essential reading for all researchers involved in engineering sciences.