Semi-Supervised NMF in the Chroma Domain Applied to Music Harmony Estimation

Takuya Takahashi, T. Hori, Christoph M. Wilk, S. Sagayama
Published in: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
Publication date: 2018-11-01
DOI: 10.23919/APSIPA.2018.8659645 (https://doi.org/10.23919/APSIPA.2018.8659645)

Abstract

In this paper, we discuss non-negative matrix factorization (NMF) applied to chroma feature sequences to reduce chroma-specific noise in chord estimation from music signals with a hidden Markov model (HMM). Even for single-pitch sounds, the raw 12-dimensional chroma vectors, obtained from the music signal by summing the spectrum across octaves and normalizing, often contain irrelevant components such as non-octave overtones that fall into other pitch classes, and these components cause inaccuracies in harmony estimation. NMF applied in the chroma domain is expected to suppress such overtone-induced components in the NMF activation matrix and thus to “purify” the noisy chroma vectors. Because the factorization operates on 12 dimensions rather than on the raw spectrum, we expect advantages in statistical robustness and computational cost for pitch class estimation of single and multiple tones. We use the “purified” chroma vectors together with an HMM-based harmony progression model in which the NMF activation distributions are modeled as observations associated with hidden harmonies, whose transition probabilities have been obtained statistically. By combining the suppression of irrelevant components with the HMM-based harmony model, we attempt to improve harmony estimation accuracy. In the experimental evaluation, we demonstrate the reduction of irrelevant components in raw chroma vectors computed from recordings of musical instruments. In addition, using music audio data with harmony annotations from the RWC database, we compare the harmony estimation accuracy of our method with that of conventional chroma.
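
The overtone leakage described in the abstract, and its suppression in the activation domain, can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a synthetic note model (8 harmonics with geometric amplitude decay, each harmonic assigned to its nearest pitch class), a fixed 12 x 12 dictionary built from that model, and the standard multiplicative update for the Euclidean cost. It covers only the supervised part, estimating activations against fixed templates; any freely learned bases of a semi-supervised scheme are omitted. All function names and parameter values are hypothetical.

```python
import numpy as np

def note_template(pitch_class, n_harmonics=8, decay=0.6):
    """Chroma template of one note under an assumed harmonic model.

    Harmonic h lies 12*log2(h) semitones above the fundamental, so
    non-octave harmonics (h = 3, 5, 6, 7) land in other pitch classes.
    """
    template = np.zeros(12)
    for h in range(1, n_harmonics + 1):
        offset = int(round(12 * np.log2(h)))
        template[(pitch_class + offset) % 12] += decay ** (h - 1)
    return template

def build_dictionary():
    """Fixed basis matrix W (12 x 12): column p is the template of pitch class p."""
    return np.stack([note_template(p) for p in range(12)], axis=1)

def estimate_activations(V, W, n_iter=2000, eps=1e-12):
    """Estimate non-negative activations H with W held fixed.

    Uses the multiplicative update H <- H * (W^T V) / (W^T W H),
    which does not increase the Euclidean error ||V - W H||^2.
    """
    H = np.ones((W.shape[1], V.shape[1]))
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + eps)
    return H

def off_chord_fraction(x, chord):
    """Share of the total mass of x that lies outside the chord's pitch classes."""
    mask = np.ones(12, dtype=bool)
    mask[list(chord)] = False
    return x[mask].sum() / x.sum()

W = build_dictionary()
chord = (0, 4, 7)  # C major triad: C, E, G

# Synthetic "raw" chroma frame: three notes, each with overtone leakage,
# normalized to unit sum.
raw = W[:, list(chord)].sum(axis=1, keepdims=True)
raw /= raw.sum()

H = estimate_activations(raw, W)

print("off-chord share, raw chroma: ", off_chord_fraction(raw[:, 0], chord))
print("off-chord share, activations:", off_chord_fraction(H[:, 0], chord))
```

In this toy setting the observation is generated from the same templates used for the factorization, so the activations can explain it exactly. Real recordings would need templates learned from instrument data, and the activations would then serve as the HMM observations mentioned above.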