{"title":"Automatic acoustic classification of feline sex","authors":"Maksim Kukushkin, S. Ntalampiras","doi":"10.1145/3478384.3478385","DOIUrl":"https://doi.org/10.1145/3478384.3478385","url":null,"abstract":"This paper presents a novel method for classifying the feline sex based on the respective vocalizations. Due to the size of the available dataset, we rely on tree-based classifiers which can efficiently learn classification rules in such poor data availability cases. More specifically, this work investigates the ability of random forests and boosting classifiers when trained with a wide range of acoustic features derived both from time and frequency domain. The considered classifiers are evaluated using standardized figures of merit including f1-score, recall, precision, and accuracy. The best-performing classifier was the CatBoost, while the obtained results are in line with the state-of-the-art accuracy levels in the field of animal sex classification.","PeriodicalId":173309,"journal":{"name":"Proceedings of the 16th International Audio Mostly Conference","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122996578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neuro-curation: A case study on the use of sonic enhancement of virtual museum exhibits","authors":"Duncan A. H. Williams, I. Daly","doi":"10.1145/3478384.3478428","DOIUrl":"https://doi.org/10.1145/3478384.3478428","url":null,"abstract":"For the past several years, museums have widely embraced virtual exhibits—certainly before COVID-19, but especially after the virus's outbreak, which has required cultural institutions to temporarily close their physical sites to audiences. Indeed, even once these institutions reopen and the world returns to a new normal, virtual exhibits will remain a defining feature of museums: partly as a means to expand audiences, and partly as a way to increase revenue generation. This paper describes a case study in which a variety of soundscapes were presented accompanying a number of VR objects from the British Museum, in order to determine whether there was any appreciable improvement in viewer engagement with different types of soundscape. Soundscapes were created using synthesis, combinations of foley style effects, spoken word narration, and musique concrete based on a palette drawn from the International Affective Sounds Database. Participants (N=95) were asked to rate their engagement in an online experiment - engagement was highest in the foley-style soundscape condition. This work has implications for future targeted soundscape design, in order to target individual engagement and to facilitate exhibit evaluation, a field we describe as “neuro-curation”.","PeriodicalId":173309,"journal":{"name":"Proceedings of the 16th International Audio Mostly Conference","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122136227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fish migration monitoring from audio detection with CNNs","authors":"Patrice Guyot, Fanny Alix, Thomas Guérin, Elie Lambeaux, Alexis Rotureau","doi":"10.1145/3478384.3478393","DOIUrl":"https://doi.org/10.1145/3478384.3478393","url":null,"abstract":"The monitoring of migratory fish is essential to evaluate the state of the fish population in freshwater and follow its evolution. During spawning in rivers, some species of alosa produce a characteristic splash sound, called “bull”, that enables to perceive their presence. Stakeholders involved in the rehabilitation of freshwater ecosystems rely on staff to aurally count the bulls during spring nights and then estimate the alosa population in different sites. In order to reduce the human costs and expand the scope of study, we propose a deep learning approach for audio event detection from recordings made from the river banks. Two different models of Convolutional Neural Networks (CNNs), namely AlexNet and VGG-16, have been tested. Encouraging results enable us to aim for a semi-automatized and production oriented implementation.","PeriodicalId":173309,"journal":{"name":"Proceedings of the 16th International Audio Mostly Conference","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124480491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sonification for Conveying Data and Emotion","authors":"N. Rönnberg","doi":"10.1145/3478384.3478387","DOIUrl":"https://doi.org/10.1145/3478384.3478387","url":null,"abstract":"In the present study a sonification of running data was evaluated. The aim of the sonification was to both convey information about the data and convey a specific emotion. The sonification was evaluated in three parts, firstly as an auditory graph, secondly together with additional text information, and thirdly together with an animated visualization, with a total of 150 responses. The results suggest that the sonification could convey an emotion similar to that intended, but at the cost of less good representation of the data. The addition of visual information supported understanding of the sonification, and the auditory representation of data. The results thus suggest that it is possible to design sonification that is perceived as both interesting and fun, and convey an emotional impression, but that there may be a trade off between musical experience and clarity in sonification.","PeriodicalId":173309,"journal":{"name":"Proceedings of the 16th International Audio Mostly Conference","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132552376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy Efficiency is Not Enough:Towards a Batteryless Internet of Sounds","authors":"V. Lostanlen, Antoine Bernabeu, Jean-Luc Béchennec, M. Briday, S. Faucou, M. Lagrange","doi":"10.1145/3478384.3478408","DOIUrl":"https://doi.org/10.1145/3478384.3478408","url":null,"abstract":"This position paper advocates for digital sobriety in the design and usage of wireless acoustic sensors. As of today, these devices all rely on batteries, which are either recharged by a human operator or via solar panels. Yet, batteries contain chemical pollutants and have a shorter lifespan than electronic components: as such, they hinder the autonomy and sustainability of the Internet of Sounds at large. Against this problem, our radical answer is to avoid the use of batteries altogether; and instead, to harvest ambient energy in real time and store it in a supercapacitor allowing a few minutes of operation. We show the inherent limitations of battery-dependent technologies for acoustic sensing. Then, we describe how a low-cost Micro-Controller Unit (MCU) could serve for audio acquisition and feature extraction on the edge. In particular, we stress the advantage of storing intermediate computations in ferroelectric random-access memory (FeRAM), which is nonvolatile, fast, endurant and consumes little. As a proof of concept, we present a simple-minded detector of sine tones in background noise, which relies on a fixed-point implementation of the fast Fourier transform (FFT). We outline future directions towards bioacoustic event detection and urban acoustic monitoring without batteries nor wires.","PeriodicalId":173309,"journal":{"name":"Proceedings of the 16th International Audio Mostly Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130349933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Singing Gallery: Combining Music and Static Visual Artworks","authors":"Oliver Bramah, Xiaoling Cheng, Fabio Morreale","doi":"10.1145/3478384.3478396","DOIUrl":"https://doi.org/10.1145/3478384.3478396","url":null,"abstract":"This paper presents, The Singing Gallery, an artistic exploration developed to discover the extent to which music can affect one's viewing experience of static visual artworks in a gallery setting. This project was conceived to explore integration between vision and sound in an art gallery environment. Specifically, we aimed to assess the extent to which musical cues can enhance or modify the viewer's experience of abstract art. The Singing Gallery takes shape in the form of a virtual reality (VR) art gallery, featuring a selection of visual artworks. Each artwork is associated with a unique piece of acousmatic music, which has been specifically composed for by the first author, which increases in volume as the audience member moves towards the painting. In the paper, we discuss the epistemological grounding and detail the development of the interactive VR system. We then offer concluding remarks based on auto-ethnographic reflections.","PeriodicalId":173309,"journal":{"name":"Proceedings of the 16th International Audio Mostly Conference","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122910507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding Cross-Genre Rhythmic Audio Compatibility: A Computational Approach","authors":"Cláudio Lemos, Diogo Cocharro, Gilberto Bernardes","doi":"10.1145/3478384.3478418","DOIUrl":"https://doi.org/10.1145/3478384.3478418","url":null,"abstract":"Rhythmic similarity, a fundamental task within Music Information Retrieval, has recently been applied in creative music contexts to retrieve musical audio or guide audio-content transformations. However, there is still very little knowledge of the typical rhythmic similarity values between overlapping musical structures per instrument, genre, and time scales, which we denote as rhythmic compatibility. This research provides the first steps towards the understanding of rhythmic compatibility from the systematic analysis of MedleyDB, a large multi-track musical database composed and performed by artists. We apply computational methods to compare database stems using representative rhythmic similarity metrics – Rhythmic Histogram (RH) and Beat Spectrum (BS) – per genre and instrumental families and to understand whether RH and BS are prone to discriminate genres at different time scales. Our results suggest that 1) rhythmic compatibility values lie between [.002,.354] (RH) and [.1,.881] (BS), 2) RH outperforms BS in discriminating genres, and 3) different time scale in RH and BS impose significant differences in rhythmic compatibility.","PeriodicalId":173309,"journal":{"name":"Proceedings of the 16th International Audio Mostly Conference","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125148674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Zugzwang: Chess Representation Combining Sonification and Interactive Performance","authors":"Francesco Ardan Dal Rì, Raul Masu","doi":"10.1145/3478384.3478394","DOIUrl":"https://doi.org/10.1145/3478384.3478394","url":null,"abstract":"In this paper we present the design process and the preliminary evaluations of Zugzwang. The system implemented a mixed approach to represent a chess game by combining direct sonification of the moves with instrumental improvisation. The improvisation is guided by a score displayed in real-time that represent the thinking process of the player based on the duration of the moves. To design the system we involved a professional chess player to provide suggestions. We also present a preliminary evaluation of the system with feedback from two chess players.","PeriodicalId":173309,"journal":{"name":"Proceedings of the 16th International Audio Mostly Conference","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129636239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparing Chat Methods for Remote Collaborative Live-Coding Music","authors":"Chad Bullard, Ananya Kansal, Jason Freeman","doi":"10.1145/3478384.3478399","DOIUrl":"https://doi.org/10.1145/3478384.3478399","url":null,"abstract":"This study analyzes the impact of text versus audio chat in a remote collaborative live-coding music environment. Drawing on the literature discussing collaborative live-coding music, we wanted to examine the impacts of different chat methods on collaboration and coordination. This paper will explore the benefits and drawbacks of two different chat methods, and how they affect remote performances. We conducted a study with 5 participants, each of these sessions containing two 15 minute performance tasks where only one chat method was allowed to be used for collaboration. After the performance tasks, we interviewed the participants about their experience and how each chat method affected their performance. Audio-chat proved useful for real-time coordination and complex explanations, while text-chat was beneficial in its silence and recorded nature for future review of ideas. This paper will discuss these merits for a remote collaborative live-coding environment.","PeriodicalId":173309,"journal":{"name":"Proceedings of the 16th International Audio Mostly Conference","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116416032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Meaning in \"the Mix\": Using Ethnography to Inform the Design of Intelligent Tools in the Context of Music Production","authors":"Glenn McGarry, A. Chamberlain, Andy Crabtree, Christopher Greenhalgh","doi":"10.1145/3478384.3478406","DOIUrl":"https://doi.org/10.1145/3478384.3478406","url":null,"abstract":"In this paper we report on two ethnographic studies of professional music producers at work in their respective studio settings, to underpin the design of intelligent tools and platforms in this domain. The studies are part of a body of work that explores this complex and technically challenging domain and explicates the ways in which the actors involved address the tension between artistic and engineering practices. This report focusses on the flow of work in the creation of a song in a digital audio workstation (DAW), which often eschews the technical requirement to document the process to maintain a creative “vibe”, and the troubles that occur in subsequent stages of the production process in which complex and often messy compositions of audio data are handed over. We conclude with implications for metadata used in the process, namely the labelling and organisation of audio, to drive tools that allow more control over the creative process by automating process tracking and documenting song data provenance.","PeriodicalId":173309,"journal":{"name":"Proceedings of the 16th International Audio Mostly Conference","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132891485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}