{"title":"Kiite Cafe: A Web Service for Getting Together Virtually to Listen to Music","authors":"Kosetsu Tsukuda, Keisuke Ishida, Masahiro Hamasaki, Masataka Goto","doi":"10.5281/ZENODO.5624491","DOIUrl":"https://doi.org/10.5281/ZENODO.5624491","url":null,"abstract":"In light of the COVID-19 pandemic making it difficult for people to get together in person, this paper describes a public web service called Kiite Cafe that lets users get together virtually to listen to music. When users listen to music on Kiite Cafe, their experiences are characterized by two architectures: (i) visualization of each user’s reactions, and (ii) selection of songs from users’ favorite songs. These architectures enable users to feel social connection with others and the joy of introducing others to their favorite songs as if they were together in person to listen to music. In addition, the architectures provide three user experiences: (1) motivation to react to played songs, (2) the opportunity to listen to a diverse range of songs, and (3) the opportunity to contribute as curators. By analyzing the behavior logs of 1,760 Kiite Cafe users over about five months, we quantitatively show that these user experiences can generate various effects (e.g., users react to a more diverse range of songs on Kiite Cafe than when listening alone). We also discuss how our proposed architectures can continue to enrich music listening experiences with others even after the pandemic’s resolution.","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121824062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-supervised Music Tagging Transformer","authors":"Minz Won, Keunwoo Choi, Xavier Serra","doi":"10.5281/ZENODO.5624405","DOIUrl":"https://doi.org/10.5281/ZENODO.5624405","url":null,"abstract":"We present Music Tagging Transformer that is trained with a semi-supervised approach. The proposed model captures local acoustic characteristics in shallow convolutional layers, then temporally summarizes the sequence of the extracted features using stacked self-attention layers. Through a careful model assessment, we first show that the proposed architecture outperforms the previous state-of-the-art music tagging models that are based on convolutional neural networks under a supervised scheme. \u0000The Music Tagging Transformer is further improved by noisy student training, a semi-supervised approach that leverages both labeled and unlabeled data combined with data augmentation. To our best knowledge, this is the first attempt to utilize the entire audio of the million song dataset.","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"1 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113942839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation","authors":"Qiuqiang Kong, Yin Cao, Haohe Liu, Keunwoo Choi, Yuxuan Wang","doi":"10.5281/ZENODO.5624475","DOIUrl":"https://doi.org/10.5281/ZENODO.5624475","url":null,"abstract":"Deep neural network based methods have been successfully applied to music source separation. They typically learn a mapping from a mixture spectrogram to a set of source spectrograms, all with magnitudes only. This approach has several limitations: 1) its incorrect phase reconstruction degrades the performance, 2) it limits the magnitude of masks between 0 and 1 while we observe that 22% of time-frequency bins have ideal ratio mask values of over~1 in a popular dataset, MUSDB18, 3) its potential on very deep architectures is under-explored. Our proposed system is designed to overcome these. First, we propose to estimate phases by estimating complex ideal ratio masks (cIRMs) where we decouple the estimation of cIRMs into magnitude and phase estimations. Second, we extend the separation method to effectively allow the magnitude of the mask to be larger than 1. Finally, we propose a residual UNet architecture with up to 143 layers. Our proposed system achieves a state-of-the-art MSS result on the MUSDB18 dataset, especially, a SDR of 8.98~dB on vocals, outperforming the previous best performance of 7.24~dB. The source code is available at: this https URL","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130986258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation","authors":"Hsiao-Tzu Hung, Joann Ching, Seungheon Doh, Nabin Kim, Juhan Nam, Yi-Hsuan Yang","doi":"10.5281/ZENODO.5090631","DOIUrl":"https://doi.org/10.5281/ZENODO.5090631","url":null,"abstract":"While there are many music datasets with emotion labels in the literature, they cannot be used for research on symbolic-domain music analysis or generation, as there are usually audio files only. In this paper, we present the EMOPIA (pronounced `yee-mo-pi-uh') dataset, a shared multi-modal (audio and MIDI) database focusing on perceived emotion in pop piano music, to facilitate research on various tasks related to music emotion. The dataset contains 1,087 music clips from 387 songs and clip-level emotion labels annotated by four dedicated annotators. Since the clips are not restricted to one clip per song, they can also be used for song-level analysis. We present the methodology for building the dataset, covering the song list curation, clip selection, and emotion annotation processes. Moreover, we prototype use cases on clip-level music emotion classification and emotion-based symbolic music generation by training and evaluating corresponding models using the dataset. The result demonstrates the potential of EMOPIA for being used in future exploration on piano emotion-related MIR tasks.","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125713276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intelligent User Interfaces for Music Discovery: The Past 20 Years and What's to Come","authors":"Peter Knees, M. Schedl, Masataka Goto","doi":"10.5334/TISMIR.60","DOIUrl":"https://doi.org/10.5334/TISMIR.60","url":null,"abstract":"Assisting the user in finding music is one of the original motivations that led to the establishment of Music Information Retrieval (MIR) as a research field. This encompasses classic Information Retrieval inspired access to music repositories that aims at meeting an information need of an expert user. Beyond this, however, music as a cultural art form is also connected to an entertainment need of potential listeners, requiring more intuitive and engaging means for music discovery. A central aspect in this process is the user interface. In this article, we reflect on the evolution of MIR-driven intelligent user interfaces for music browsing and discovery over the past two decades. We argue that three major developments have transformed and shaped user interfaces during this period, each connected to a phase of new listening practices. Phase 1 has seen the development of content-based music retrieval interfaces built upon audio processing and content description algorithms facilitating the automatic organization of repositories and finding music according to sound qualities. These interfaces are primarily connected to personal music collections or (still) small commercial catalogs. Phase 2 comprises interfaces incorporating collaborative and automatic semantic description of music, exploiting knowledge captured in user-generated metadata. These interfaces are connected to collective web platforms. Phase 3 is dominated by recommender systems built upon the collection of online music interaction traces on a large scale. These interfaces are connected to streaming services. We review and contextualize work from all three phases and extrapolate current developments to outline possible scenarios of music recommendation and listening interfaces of the future.","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114822792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Should we consider the users in contextual music auto-tagging models?","authors":"Karim M. Ibrahim, Elena V. Epure, G. Peeters, G. Richard","doi":"10.5281/ZENODO.3961560","DOIUrl":"https://doi.org/10.5281/ZENODO.3961560","url":null,"abstract":"Music tags are commonly used to describe and categorize music. Various auto-tagging models and datasets have been proposed for the automatic music annotation with tags. However, the past approaches often neglect the fact that many of these tags largely depend on the user, especially the tags related to the context of music listening. In this paper, we address this problem by proposing a user-aware music auto-tagging system and evaluation protocol. Specifically, we use both the audio content and user information extracted from the user listening history to predict contextual tags for a given user/track pair. We propose a new dataset of music tracks annotated with contextual tags per user. We compare our model to the traditional audio-based model and study the influence of user embeddings on the classification quality. Our work shows that explicitly modeling the user listening history into the automatic tagging process could lead to more accurate estimation of contextual tags.","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125702390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"User Perceptions Underlying Social Music Behavior","authors":"Louis Spinelli, Josephine Lau, Jin Ha Lee","doi":"10.5281/ZENODO.4245474","DOIUrl":"https://doi.org/10.5281/ZENODO.4245474","url":null,"abstract":"","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116630601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pandemics, music, and collective sentiment: evidence from the outbreak of COVID-19","authors":"Meijun Liu, Eva Zangerle, Xiao Hu, Alessandro B. Melchiorre, M. Schedl","doi":"10.5281/ZENODO.4245394","DOIUrl":"https://doi.org/10.5281/ZENODO.4245394","url":null,"abstract":"The COVID-19 pandemic causes a massive global health crisis and produces substantial economic and social distress, which in turn may cause stress and anxiety among people. Real-world events play a key role in shaping collective sentiment in a society. As people listen to music daily everywhere in the world, the sentiment of music being listened to can reflect the mood of the listeners and serve as a measure of collective sentiment. However, the exact relationship between real-world events and the sentiment of music being listened to is not clear. Driven by this research gap, we use the unexpected outbreak of COVID-19 as a natural experiment to explore how users' sentiment of music being listened to evolves before and during the outbreak of the pandemic. We employ causal inference approaches on an extended version of the LFM-1b dataset of listening events shared on Last.fm, to examine the impact of the pandemic on the sentiment of music listened to by users in different countries. We find that, after the first COVID-19 case in a country was confirmed, the sentiment of artists users listened to becomes more negative. This negative effect is pronounced for males while females' music emotion is less influenced by the outbreak of the COVID-19 pandemic. We further find a negative association between the number of new weekly COVID-19 cases and users' music sentiment. Our results provide empirical evidence that public sentiment can be monitored based on collective music listening behaviors, which can contribute to research in related disciplines.","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129554504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Instrument Music Transcription Based on Deep Spherical Clustering of Spectrograms and Pitchgrams","authors":"Keitaro Tanaka, Takayuki Nakatsuka, Ryo Nishikimi, Kazuyoshi Yoshii, S. Morishima","doi":"10.5281/ZENODO.4245436","DOIUrl":"https://doi.org/10.5281/ZENODO.4245436","url":null,"abstract":"This paper describes a clustering-based music transcription method that estimates the piano rolls of arbitrary musical instrument parts from multi-instrument polyphonic music signals. If target musical pieces are always played by particular kinds of musical instruments, a way to obtain piano rolls is to compute the pitchgram (pitch saliency spectrogram) of each musical instrument by using a deep neural network (DNN). However, this approach has a critical limitation that it has no way to deal with musical pieces including undefined musical instruments. To overcome this limitation, we estimate a condensed pitchgram with an existing instrument-independent neural multi-pitch estimator and then separate the pitchgram into a specified number of musical instrument parts with a deep spherical clustering technique. To improve the performance of transcription, we propose a joint spectrogram and pitchgram clustering method based on the timbral and pitch characteristics of musical instruments. The experimental results show that the proposed method can transcribe musical pieces including unknown musical instruments as well as those containing only predefined instruments, at the state-of-the-art transcription accuracy.","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132081400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Voice-Leading Schema Recognition Using Rhythm and Pitch Features","authors":"Christoph Finkensiep, Ken Déguernel, M. Neuwirth, M. Rohrmeier","doi":"10.5281/ZENODO.4245482","DOIUrl":"https://doi.org/10.5281/ZENODO.4245482","url":null,"abstract":"Musical schemata constitute important structural building blocks used across historical styles and periods. They consist of two or more melodic lines that are combined to form specific successions of intervals. This paper tackles the problem of recognizing voice-leading schemata in polyphonic music. Since schema types and subtypes can be realized in a wide variety of ways on the musical surface, finding schemata in an automated fashion is a challenging task. To perform schema inference we employ a skipgram model that computes schema candidates, which are then classified using a binary classifier on musical features related to pitch and rhythm. This model is evaluated on a novel dataset of schema annotations in Mozart’s pi-ano sonatas produced by expert annotators, which is published alongside this paper. The features are chosen to encode music-theoretically predicted properties of schema instances. We assess the relevance of each feature for the classification task, thus contributing to the theoretical understanding of complex musical objects.","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125034643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}