{"title":"An AI-Based Design Framework to Support Musicians' Practices","authors":"J. P. Martinez-Avila, Adrian Hazzard, A. Chamberlain, C. Greenhalgh, S. Benford","doi":"10.1145/3243274.3275381","DOIUrl":"https://doi.org/10.1145/3243274.3275381","url":null,"abstract":"The practice of working musicians extends beyond the act of performing musical works at a concert. Rather, a significant degree of individual and collaborative preparation is necessitated prior to the moment of presentation to an audience. Increasingly, these musicians call upon a range of digital resources and tools to support this 'living' process. We present a speculative design paper in response to a set of ethnographies and interviews with working musicians to highlight the potential contemporary digital technologies and services can bring to bear in supporting, enhancing and guiding musicians' preparation and practice. We acknowledge the role that artificial intelligence and semantic technologies could play in the design of tools that interface with the traditional practice of musicians and their instruments.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115601964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Meico","authors":"A. Berndt, Simon Waloschek, Aristotelis Hadjakos","doi":"10.1145/3243274.3243282","DOIUrl":"https://doi.org/10.1145/3243274.3243282","url":null,"abstract":"MEI, the established representation format for digital music editions, barely finds consideration in other music-related communities. Reasons are the format's complexity and ambiguity that make processing expensive and laborious. On the other hand, digital music editions are an invaluable source of symbolic music data and further accompanying information far beyond the typical metadata found in other formats. With meico, we provide a novel tool that makes access, processing and use of MEI encoded music more convenient and appealing for other application scenarios. Meico is a converter framework that translates MEI data to a series of formats relevant to many other applications. With ScoreTube we demonstrate this in an audio-to-score alignment scenario.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121266196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Collaborative Artificial Intelligence in Music Production","authors":"Steve Nicholls, Stuart Cunningham, R. Picking","doi":"10.1145/3243274.3243311","DOIUrl":"https://doi.org/10.1145/3243274.3243311","url":null,"abstract":"The use of technology has revolutionized the process of music composition, recording, and production in the last 30 years. One fusion of technology and music that has been longstanding is the use of artificial intelligence in the process of music composition. However, much less attention has been given to the application of AI in the process of collaboratively composing and producing a piece of recorded music. The aim of this project is to explore such use of artificial intelligence in music production. The research presented here includes discussion of an auto ethnographic study of the interactions between songwriters, with the intention that these can be used to model the collaborative process and that a computational system could be trained using this information. The research indicated that there were repeated patterns that occurred in relation to the interactions of the participating songwriters.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128381426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Symbolic Music Similarity through a Graph-Based Representation","authors":"Federico Simonetta, Filippo Carnovalini, N. Orio, A. Rodà","doi":"10.1145/3243274.3243301","DOIUrl":"https://doi.org/10.1145/3243274.3243301","url":null,"abstract":"In this work, a novel representation system for symbolic music is described. The proposed representation system is graph-based and could theoretically represent music both from a horizontal (contrapuntal) and from a vertical (harmonic) point of view, by keeping into account contextual and harmonic information. It could also include relationships between internal variations of motifs and themes. This is achieved by gradually simplifying the melodies and generating layers of reductions that include only the most important notes from a structural and harmonic viewpoint. This representation system has been tested in a music information retrieval task, namely melodic similarity, and compared to another system that performs the same task but does not consider any contextual or harmonic information, showing how the structural information is needed in order to find certain relations between musical pieces. Moreover, a new dataset consisting of more than 5000 leadsheets is presented, with additional meta-musical information taken from different web databases, including author, year of first performance, lyrics, genre and stylistic tags.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123670979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MozziByte Workshop: Making Things Purr Growl and Sing","authors":"S. Barrass","doi":"10.1145/3243274.3243315","DOIUrl":"https://doi.org/10.1145/3243274.3243315","url":null,"abstract":"Mozzi is a sound synthesiser that allows the Arduino to purr, growl and sing, instead of just buzzing the way it usually does.... Mozzi lets you create sounds using familiar synthesis units including oscillators, samples, delays, filters and envelopes. These sounds can be embedded in clothing, appliances, sports equipment, gadgets, toys, installations, and many other places where sound not been possible before. In this workshop you will make a a small interactive synthesiser using the MozziByte, so you can make almost anything purr, growl and sing.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116223120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Music Genre Classification: Genre-Specific Characterization and Pairwise Evaluation","authors":"Adam Lefaivre, John Z. Zhang","doi":"10.1145/3243274.3243310","DOIUrl":"https://doi.org/10.1145/3243274.3243310","url":null,"abstract":"In this paper, we report our initial investigations on the genre classification problem in Music Information Retrieval. Each music genre has its unique characteristics, which distinguish it from other genres. We adapt association analysis and use it to capture those characteristics using acoustic features, i.e., each genre's characteristics are represented by a set of features and their corresponding values. In addition, we consider that each candidate genre should have its own chance to be singled out, and compete for a new piece to be classified. Therefore, we conduct genre classification based on a pairwise dichotomy-like strategy. We compare the differences of the characteristics of two genres in a symmetric manner and use them to classify music genres. The effectiveness of our approach is demonstrated through empirical experiments on one benchmark music dataset. The results are presented and discussed. Various related issues, such as potential future work along the same direction, are examined.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124131374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Returning to the Fundamentals on Temperament (In Digital Systems)","authors":"Nathan Renney, Benedict R. Gaster, Tom Mitchell","doi":"10.1145/3243274.3243308","DOIUrl":"https://doi.org/10.1145/3243274.3243308","url":null,"abstract":"Considering the generation of musical tunings, it is reasonable to expect that the many constructs contained in Functional programming languages may provide useful tools for exploring both conventional and new tunings. In this paper we present a number of approaches for manipulating tunings using basic mathematics. While this provides a simple foundation for describing temperament, it is fundamental enough to support a variety of approaches and further, allows the unbounded description of arbitrary tunings. It is expected that this notion will be useful in defining tunings, and by extension scales, for Digital Musical Instruments. This breaks down the physical barrier that has limited the likes of just intonations from having practical applications in the performance setting. It also enables composers to explore a variety of non traditional temperaments rapidly, without having to manually tune each note.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"25 5-6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133282623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interactive Control of Explicit Musical Features in Generative LSTM-based Systems","authors":"Maximos A. Kaliakatsos-Papakostas, Aggelos Gkiokas, V. Katsouros","doi":"10.1145/3243274.3243296","DOIUrl":"https://doi.org/10.1145/3243274.3243296","url":null,"abstract":"Long Short-Term Memory (LSTM) neural networks have been effectively applied on learning and generating musical sequences, powered by sophisticated musical representations and integrations into other deep learning models. Deep neural networks, alongside LSTM-based systems, learn implicitly: given a sufficiently large amount of data, they transform information into high-level features that, however, do not relate with the high-level features perceived by humans. For instance, such models are able to compose music in the style of the Bach chorales, but they are not able to compose a less rhythmically dense version of them, or a Bach choral that begins with low and ends with high pitches -- even more so in an interactive way in real-time. This paper presents an approach to creating such systems. A very basic LSTM-based architecture is developed that can compose music that corresponds to user-provided values of rhythm density and pitch height/register. A small initial dataset is augmented to incorporate more intense variations of these two features and the system learns and generates music that not only reflects the style, but also (and most importantly) reflects the features that are explicitly given as input at each specific time. This system -- and future versions that will incorporate more advanced architectures and representation -- is suitable for generating music the features of which are defined in real-time and/or interactively.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116280099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Immersive VisualAudioDesign: Spectral Editing in VR","authors":"Lars Engeln, Natalie Hube, Rainer Groh","doi":"10.1145/3243274.3243279","DOIUrl":"https://doi.org/10.1145/3243274.3243279","url":null,"abstract":"VisualAudioDesign (VAD) is an attempt to design audio in a visual way. The frequency-domain visualized as a spectrogram construed as pixel data can be manipulated with image filters. Thereby, an approach is described to get away from direct DSP parameter manipulation to a more comprehensible sound design. Virtual Reality (VR) offers immersive insights into data and embodied interaction in the virtual environment. VAD and VR combined enrich spectral editing with a natural work-flow. Therefore, a design paper prototype for interaction with audio data in an virtual environment was used and examined.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116861856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Method for Virtual Acoustic Auralisation in VR","authors":"Callum Forsyth","doi":"10.1145/3243274.3243304","DOIUrl":"https://doi.org/10.1145/3243274.3243304","url":null,"abstract":"In today's industry, the use of prediction software in architectural acoustics is universal. Programs such as Odeon, CATT and CadnaA have become an integral part of the design process. These programs combine general acoustic theory with CAD modelling software to calculate the trajectory and intensity of sound waves as they travel around the room. By deciding upon positioning for sound sources for and listening positions, acousticians can track both the direction and level of a sound wave as it arrives at the listener. The basic theory then is that with this information we can map out a three-dimensional representation of how the source would sound to the listener before the room is built. This is known as virtual auralisation, creating a sonic map of a virtual room that is understandable to the listener because it mimics the acoustic standards of the real world. If the aim is to immerse the listener in the virtual world then the key is localisation. Allowing the listener to pinpoint which direction both the sound and its subsequent reflections are coming from is crucial to analysing the effect that acoustic design elements have on the overall sound. While surround sound could be looked to as an option, Odeon will also output to ambisonics b-format which can then be encoded for virtual reality. As a medium VR has been around for a while, however it is only recently with the release of relatively affordable platforms such as the HTC Vive and Oculus Rift that VR has gained mainstream appeal and with it, the support and infrastructure to encourage third party support for everything from games to VR experiences to virtual learning environments. In terms of acoustics, VR allows the listener to hear the sound source from any chosen position in the virtual space with full localisation and in three dimensions, effectively creating a fully realised, acoustically accurate virtual environment. One of the first companies to utilise this technology was the global architecture firm Arup. The SoundLab project is the most famous example of virtual auralisation for acoustic modelling and has become a benchmark for the industry and a showpiece for Arup. Though still utilising ambisonics, the SoundLab neglects to use VR headtracking and a binaural output. Instead opting to place the listener in the centre of an anechoic chamber with around twelve speakers surrounding them. While this is a far more expensive option, it does offer greatly increased sound quality. Through this project I will aim to apply the concept of Viral Auralisation through the medium of virtual reality to discuss the possibility of real time VR auralisation and its potential.","PeriodicalId":129628,"journal":{"name":"Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129831860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}