{"title":"A robust singing melody tracker using adaptive round semitones (ARS)","authors":"Chong-kai Wang, Ren-Yuan Lyu, Yuang-Chin Chiang","doi":"10.1109/ISPA.2003.1296957","journal":{"name":"3rd International Symposium on Image and Signal Processing and Analysis, 2003. ISPA 2003. Proceedings of the","volume":"98 1","pages":"0"},"publicationDate":"2003-09-18","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"8","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ISPA.2003.1296957"}
A robust singing melody tracker using adaptive round semitones (ARS)
This paper proposes an approach to melody tracking and applies it to automatic singing transcription. The melody tracker is based on the adaptive round semitones (ARS) algorithm, which converts the pitch contour of a singing voice into a sequence of music notes. The pitch of a singing voice is usually much less stable than that of a musical instrument, and a less skilled singer may produce even poorer pitch accuracy. ARS deals with these issues by using a statistical model that dynamically predicts the singer's tune scale for the current note. Compared with the other approaches, ARS achieves the lowest error rate for poor singers and appears much less sensitive to differences in singers' skill. Furthermore, adding heuristic music grammar constraints based on music theory to the transcription process reduces the error rate by 20.5%, which outperforms the other approaches reported in the literature.
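The abstract does not spell out the statistical model, so the following is only a minimal sketch of the general idea of adaptive rounding, not the authors' algorithm. It assumes that each note has already been reduced to a single pitch in Hz, and that the singer's tuning offset can be estimated with an exponential moving average of past deviations from the nearest semitone. The function names, the smoothing factor `alpha`, and the moving-average model are all hypothetical choices made for illustration.

```python
import math


def hz_to_semitones(f_hz, ref_hz=440.0):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 69)."""
    return 69.0 + 12.0 * math.log2(f_hz / ref_hz)


def semitones_to_hz(s, ref_hz=440.0):
    """Inverse of hz_to_semitones, used here only to build example input."""
    return ref_hz * 2.0 ** ((s - 69.0) / 12.0)


def plain_round(note_pitches_hz):
    """Baseline: round every note to the nearest equal-tempered semitone."""
    return [round(hz_to_semitones(f)) for f in note_pitches_hz]


def adaptive_round(note_pitches_hz, alpha=0.5):
    """Round each note after subtracting a running estimate of the tuning offset.

    ASSUMPTION: the offset model (an exponential moving average with
    smoothing factor alpha) is illustrative and not taken from the paper.
    """
    offset = 0.0  # estimated deviation of the singer's scale, in semitones
    notes = []
    for f in note_pitches_hz:
        s = hz_to_semitones(f)
        note = round(s - offset)  # round relative to the singer's own scale
        notes.append(note)
        # Update the offset estimate from this note's observed deviation.
        offset = (1.0 - alpha) * offset + alpha * (s - note)
    return notes


if __name__ == "__main__":
    # A singer aiming at C4, D4, E4, F4, G4 (60, 62, 64, 65, 67)
    # who drifts progressively sharp.
    sung = [60.0, 62.3, 64.45, 65.6, 67.7]
    hz = [semitones_to_hz(s) for s in sung]
    print(plain_round(hz))     # [60, 62, 64, 66, 68]
    print(adaptive_round(hz))  # [60, 62, 64, 65, 67]
```

In this example plain rounding mislabels the last two notes once the drift exceeds half a semitone, while the adaptive version follows the singer's shifted scale. A real tracker would also need note segmentation and a more careful offset model.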