Isolated Chinese lyrics with accompaniment recognition based on SVM

2016 International Conference on Audio, Language and Image Processing (ICALIP). Pub Date: 2016-07-01. DOI: 10.1109/ICALIP.2016.7846536

Juanjuan Cai, Na Li, Hui Wang, Bin Zhu

Citations: 0

Abstract

Isolated Chinese lyrics with accompaniment recognition based on SVM

Speech recognition is one of the most active research areas in audio technology. Two methods are commonly used to recognize lyrics sung with accompaniment: one applies automatic speech recognition technology to singing, and the other treats the task as sound classification, extracting audio features and then classifying them with a pattern-matching classifier. In this paper, we use the sound classification method with a self-built experimental database of 31 classes of isolated Chinese lyrics (4650 samples in total) excerpted from different songs, and we use these words as the recognition units. Because speaking and singing share a similar mechanism, we extract 39-dimensional MFCC feature parameters, which are widely used in speech recognition. Using the training data, we adjust the kernel parameters and select the functions used to train an SVM classifier. The trained SVM classification system is then used to recognize the lyrics, and the average recognition accuracy is 42.80%.
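The abstract does not say how the 39 MFCC dimensions are composed. A common convention in speech recognition is 13 static coefficients per frame plus their first-order (delta) and second-order (delta-delta) differences, and the sketch below assumes that layout. The function names, the regression width of 2, and the synthetic input frames are all illustrative assumptions, not details taken from the paper.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-based delta features (assumed formula, edges clamped).

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)
    """
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]   # clamp at the last frame
                behind = frames[max(t - n, 0)][k]     # clamp at the first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_static_and_deltas(static: List[Frame]) -> List[Frame]:
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Synthetic stand-in for 13 static MFCCs over 5 frames (not real audio):
# coefficient k grows linearly over time with slope k + 1.
static = [[float(t * (k + 1)) for k in range(13)] for t in range(5)]
features = stack_static_and_deltas(static)
print(len(features[0]))  # prints 39
```

Each resulting 39-dimensional frame vector, or a fixed-length summary of the frames for one word, is the kind of input an SVM classifier would be trained on. How the authors pooled frames per word is not stated in the abstract.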
