Akshara transcription of mrudangam strokes in Carnatic music
Jom Kuriakose, J. Kumar, Sarala Padi, H. Murthy, Umayalpuram K. Sivaraman
2015 Twenty First National Conference on Communications (NCC), published 2015-04-16
DOI: 10.1109/NCC.2015.7084906 (https://doi.org/10.1109/NCC.2015.7084906)
Citation count: 14
Abstract
Percussion instruments play a significant role in Carnatic music concerts. The percussion artist enjoys a great degree of freedom in improvising within the defined tala structure of a composition. The objective of this paper is to transcribe these improvisations, treating the percussion strokes as syllables, or aksharas. Onset detection is performed to segment the waveform at each akshara. Using the transcriptions from the training data, a three-state Hidden Markov Model is built for each akshara. The language model is derived from the training data. Testing is also performed in isolated style, using onset detection to segment the phrase and the language model to correct the transcription. Transcription is performed on both concert recordings and studio recordings. This technique yields up to ≈ 96% accuracy on studio recordings and ≈ 76% accuracy on concert recordings. As the mrudangam is an instrument that is based on the tonic, tonic-normalised features, namely Cent Filterbank Cepstral Coefficients, are used. It is shown that tonic normalisation helps in transcription across different tonics.
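The abstract does not spell out how the tonic-normalised features are computed, so the following is only a minimal sketch of the idea behind tonic normalisation, not the paper's feature extraction. It uses the standard definition of the cent scale (1200 cents per octave) to express a frequency relative to the tonic. The function name `hz_to_cents` and the tonic values of 130 Hz and 160 Hz are assumptions chosen purely for illustration.

```python
import math


def hz_to_cents(freq_hz: float, tonic_hz: float) -> float:
    """Express a frequency in cents relative to the tonic.

    Uses the standard cent scale: one octave above the tonic is 1200 cents.
    """
    if freq_hz <= 0 or tonic_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(freq_hz / tonic_hz)


# Hypothetical tonics for two instruments tuned differently (illustrative values).
tonic_a = 130.0
tonic_b = 160.0

# The same musical interval (a frequency ratio of 3:2 above the tonic)
# lands on different frequencies in Hz for the two tonics...
partial_a = 1.5 * tonic_a
partial_b = 1.5 * tonic_b

# ...but on the same position once expressed in cents relative to each tonic.
cents_a = hz_to_cents(partial_a, tonic_a)
cents_b = hz_to_cents(partial_b, tonic_b)

print(f"{partial_a:.1f} Hz at tonic {tonic_a:.1f} Hz: {cents_a:.2f} cents")
print(f"{partial_b:.1f} Hz at tonic {tonic_b:.1f} Hz: {cents_b:.2f} cents")
```

Because both values coincide on the cent scale, a filterbank laid out in cents rather than in Hz would see the two strokes in the same bands, which is one plausible reading of why tonic normalisation helps when training and test recordings use different tonics.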