Chi Yoon Jeong, Youngmi Song, Sungyong Shin, Mooseop Kim
ETRI Journal, vol. 47, no. 1, pp. 112-122. Publication date: 2024-05-02. DOI: 10.4218/etrij.2023-0430
https://onlinelibrary.wiley.com/doi/10.4218/etrij.2023-0430
Citations: 0
Abstract
Efficient pitch-estimation network for edge devices
Pitch estimation is the task of finding the most conspicuous frequency in a complex audio signal. Many methods that use deep neural networks have significantly increased the accuracy of pitch estimation; however, their real-time performance results were achieved on high-performance devices. Because pitch estimation is widely used in real-time applications on low-power devices, we propose an efficient method for estimating pitch on edge devices. The network architecture of the proposed method uses a depth-scaling strategy and fully leverages convolutional networks. We further introduce a channel attention mechanism to increase accuracy without increasing computational overhead. We compared the proposed model with state-of-the-art (SOTA) and conventional methods using two public datasets. The experimental results show that the proposed method achieves higher classification accuracy than FCNF0++, which is the best-performing SOTA model. Furthermore, it reduces the processing time of FCNF0++ by 48% on average across a personal computer and two edge devices. These experimental results confirm that the proposed method efficiently classifies pitch on edge devices.
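To make "finding the most conspicuous frequency" concrete, the following is a minimal sketch of a classic autocorrelation-based pitch estimator. It is an illustration only: it is not the proposed network, and the abstract does not say whether autocorrelation is among the conventional methods the authors compared against. The function name, the parameters `fmin` and `fmax`, and the test tone are all hypothetical choices made for this example.

```python
import math


def estimate_pitch_autocorr(samples, sample_rate, fmin=50.0, fmax=1000.0):
    """Estimate pitch in Hz from the strongest autocorrelation peak.

    Illustrative baseline only (assumed, not taken from the paper).
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]  # remove any DC offset

    # Restrict the search to lags that correspond to [fmin, fmax].
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(n - 1, int(sample_rate / fmin))

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized (biased) autocorrelation: fewer terms at larger lags,
        # which mildly favors the shortest period over its multiples.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    return sample_rate / best_lag


# Usage: a clean 200 Hz sine sampled at 16 kHz (period of exactly 80 samples).
sample_rate = 16000
tone = [math.sin(2 * math.pi * 200.0 * i / sample_rate) for i in range(1024)]
print(estimate_pitch_autocorr(tone, sample_rate))  # prints 200.0
```

A plain peak search like this is cheap but fragile on noisy or polyphonic audio, which is one reason learned models that classify pitch into frequency bins, as the abstract describes, have become the stronger choice.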
About the journal:
ETRI Journal is an international, peer-reviewed multidisciplinary journal published bimonthly in English. The main focus of the journal is to provide an open forum to exchange innovative ideas and technology in the fields of information, telecommunications, and electronics.
Key topics of interest include high-performance computing, big data analytics, cloud computing, multimedia technology, communication networks and services, wireless communications and mobile computing, material and component technology, as well as security.
With an international editorial committee and experts from around the world as reviewers, ETRI Journal publishes high-quality research papers on the latest and best developments from the global community.