PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing

arXiv - EE - Audio and Speech Processing | Pub Date: 2024-09-17 | DOI: arxiv-2409.10831

Phillip Long, Zachary Novack, Taylor Berg-Kirkpatrick, Julian McAuley

Citations: 0

Abstract

PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing

The recent explosion of generative AI-Music systems has raised numerous concerns over data copyright, licensing music from musicians, and the conflict between open-source AI and large prestige companies. Such issues highlight the need for publicly available, copyright-free musical data, which is in short supply, particularly for symbolic music. To alleviate this issue, we present PDMX: a large-scale open-source dataset of over 250K public domain MusicXML scores collected from the score-sharing forum MuseScore, making it, to our knowledge, the largest available copyright-free symbolic music dataset. PDMX additionally includes a wealth of both tag and user interaction metadata, allowing us to efficiently analyze the dataset and filter for high-quality user-generated scores. Given the additional metadata afforded by our data collection process, we conduct multitrack music generation experiments evaluating how different representative subsets of PDMX lead to different behaviors in downstream models, and how user-rating statistics can be used as an effective measure of data quality. Examples can be found at https://pnlong.github.io/PDMX.demo/.
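
The abstract mentions filtering for high-quality scores using user-rating statistics. The sketch below illustrates one way such a filter might look. It is not the authors' method: the record fields (`path`, `rating`, `n_ratings`, `is_public_domain`), the thresholds, and the sample data are all hypothetical and do not reflect the actual PDMX metadata schema.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ScoreRecord:
    """Hypothetical metadata record for one score (not the real PDMX schema)."""
    path: str               # location of the MusicXML file
    rating: float           # mean user rating, assumed to be on a 0-5 scale
    n_ratings: int          # number of users who rated the score
    is_public_domain: bool  # assumed license flag


def filter_rated(
    records: Iterable[ScoreRecord],
    min_rating: float = 4.0,
    min_votes: int = 5,
) -> List[ScoreRecord]:
    """Keep public-domain scores whose mean rating and vote count meet the thresholds."""
    return [
        r for r in records
        if r.is_public_domain
        and r.n_ratings >= min_votes
        and r.rating >= min_rating
    ]


# Made-up sample records, for illustration only.
sample = [
    ScoreRecord("a.mxl", 4.8, 120, True),
    ScoreRecord("b.mxl", 4.9, 2, True),    # too few ratings
    ScoreRecord("c.mxl", 3.1, 40, True),   # rating too low
    ScoreRecord("d.mxl", 4.7, 30, False),  # not public domain
]

kept = filter_rated(sample)
print([r.path for r in kept])
```

Requiring a minimum number of ratings as well as a minimum mean is a design choice in this sketch: a mean computed from only one or two votes is a noisy signal of quality.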
