Drew Edwards, Xavier Riley, Pedro Sarmento, Simon Dixon
arXiv - CS - Sound, 2024-08-09, arxiv-2408.05024
MIDI-to-Tab: Guitar Tablature Inference via Masked Language Modeling
Guitar tablatures enrich the structure of traditional music notation by
assigning each note to a string and fret of a guitar in a particular tuning,
indicating precisely where to play the note on the instrument. The problem of
generating tablature from a symbolic music representation involves inferring
this string and fret assignment per note across an entire composition or
performance. On the guitar, multiple string-fret assignments are possible for
most pitches, which leads to a large combinatorial space that prevents
exhaustive search approaches. Most modern methods use constraint-based dynamic
programming to minimize some cost function (e.g., hand position movement). In
this work, we introduce a novel deep learning solution to symbolic guitar
tablature estimation. We train an encoder-decoder Transformer model in a masked
language modeling paradigm to assign notes to strings. The model is first
pre-trained on DadaGP, a dataset of over 25K tablatures, and then fine-tuned on
a curated set of professionally transcribed guitar performances. Given the
subjective nature of assessing tablature quality, we conduct a user study
amongst guitarists, wherein we ask participants to rate the playability of
multiple versions of tablature for the same four-bar excerpt. The results
indicate our system significantly outperforms competing algorithms.
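
To make the combinatorial space and the cost-minimizing baseline family concrete, the following is a minimal sketch and not the authors' Transformer model. It assumes standard tuning, a 22-fret instrument, a monophonic note sequence, and a toy cost equal to the absolute fret change between consecutive notes; the function names are hypothetical and chosen only for illustration.

```python
from math import prod

# Assumption: standard tuning, open-string MIDI pitches from the
# 6th string (low E) to the 1st string (high E).
STANDARD_TUNING = [40, 45, 50, 55, 59, 64]
NUM_FRETS = 22  # assumption: 22-fret guitar


def candidates(pitch, tuning=STANDARD_TUNING, num_frets=NUM_FRETS):
    """Return every (string_index, fret) pair that sounds the MIDI pitch."""
    return [
        (string, pitch - open_pitch)
        for string, open_pitch in enumerate(tuning)
        if 0 <= pitch - open_pitch <= num_frets
    ]


def search_space_size(pitches):
    """Number of distinct tablatures for a monophonic pitch sequence."""
    return prod(len(candidates(p)) for p in pitches)


def min_movement_tab(pitches):
    """Dynamic programming over candidates, minimizing total fret change.

    The cost is a toy stand-in for "hand position movement"; real systems
    use richer cost functions and constraints.
    """
    if not pitches:
        return []
    layers = [candidates(p) for p in pitches]
    if any(not layer for layer in layers):
        raise ValueError("pitch outside the range of the instrument")

    cost = [0] * len(layers[0])  # best cost ending at each candidate
    back = []                    # back-pointers, one list per transition
    for prev, cur in zip(layers, layers[1:]):
        new_cost, pointers = [], []
        for _, fret in cur:
            # Pick the predecessor that gives the cheapest path to this candidate.
            best = min(
                range(len(prev)),
                key=lambda i: cost[i] + abs(fret - prev[i][1]),
            )
            new_cost.append(cost[best] + abs(fret - prev[best][1]))
            pointers.append(best)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path backwards from the final note.
    idx = min(range(len(cost)), key=cost.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [layer[i] for layer, i in zip(layers, path)]
```

Under these assumptions, E4 (MIDI 64) has five playable positions, so four repetitions of that single note already admit 5^4 = 625 tablatures, while the dynamic program only compares candidates between adjacent notes. A learned model replaces the hand-written cost with preferences inferred from transcribed tablature.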