Unfortunately, due to delays in collecting all authors' rights, the full proceedings will be published after the conference. Under ACM publication policy, we are currently able to release only the abstracts. Please visit this page again in the near future.
Paper session 1 - Music
Wednesday 30th August, 11 am – Lindsay Stuart Lecture Theatre
Session Chair, Rod Selfridge (Edinburgh Napier University)
Instrumental Agency and the Co-Production of Sound: From South Asian Instruments to Interactive Systems
Omar Shabbar and Doug Van Nort (York University)
In this paper, we look at sympathetic resonance in South Asian instruments as a source of complex performer-instrument interaction. In particular, we compare this rich tradition to the various types of human/machine interaction that arise in digital instruments endowed with computational agency. Reflecting on the spectrum of agency that exists between the extremes of instrumental performance and machine partnership, we arrive at two concepts that help frame our study of complex interactions in acoustic instruments: the co-production of sound and material agency. As a case study, we asked performers of these South Asian instruments about their perceived relationship with their sympathetic strings. Building upon this, we designed and created an interactive system that models the phenomenon of performing with sympathetic strings. We then asked musicians to interact with this new system and answer questions based on the experience. The results of these sessions were examined both to uncover similarities between the two sets of interviews and to situate this entangled performer-instrument interaction with respect to markers of perceived control, influence, co-creation, and agency.
Stringesthesia: Dynamically Shifting Musical Agency Between Audience and Performer Based on Trust in an Interactive and Improvised Performance
Torin Hopkins, Emily Doherty, Netta Ofer, Suibi Che Chuan Weng, Peter Gyory, Chad Tobin, Leanne Hirshfield and Ellen Yi-Luen Do (University of Colorado Boulder)
This paper introduces Stringesthesia, an interactive and improvised performance paradigm. Stringesthesia uses real-time neuroimaging to connect performers and audiences, enabling direct access to the performer’s mental state and determining audience participation during the performance. Functional near-infrared spectroscopy (fNIRS), a noninvasive neuroimaging tool, was used to assess metabolic activity of brain areas collectively associated with a metric we call “trust”. A visualization representing the real-time measurement of the performer’s level of trust was projected behind the performer and used to dynamically restrict or promote audience participation: e.g., as the performer’s trust in the audience grew, more participatory stations for playing drums and selecting the performer’s chords were activated. Throughout the paper we discuss prior work that heavily influenced our design, conceptual and methodological issues with using fNIRS technology, and our system architecture. We then describe feedback from the audience and performer in a performance setting with a solo guitar player.
Affective Conditional Modifiers in Adaptive Video Game Music
Tyler McIntosh, Orlando Woscholski and Mathieu Barthet (Queen Mary University of London)
This paper presents an application of affective conditional modifiers (ACMs) in adaptive video game music, a technique whereby the emotional intent of background music is adapted, based on biofeedback, to enforce a target emotional state in the player, thus providing a more immersive experience. The proposed methods are explored in a bespoke horror game titled "The Hidden", which uses ACMs to enforce states of calmness in stressed players, and states of stress in calm players, through the procedural adaptation of background music timbre and instrumentation. These two conditions, along with a control condition, are investigated through an experimental study. Due to the low number of participants, the results of the user study provide limited insight into the effectiveness of the proposed ACMs. Nevertheless, the experiment design and user feedback highlight a number of important considerations and potential directions for future work: the need to consider the individual affective profile of the player, the audio-visual and narrative cues that may reduce the impact of affective audio, the effects of game familiarity on affective responses, and the need for ACM thresholds well suited to the context and narrative of the game.
Inner City in the Listener's Auditory Bubble: Altering the Listener's Perception of the Inner City through the Intervention of Composed Soundscapes
Hedda Lindström (Dalarna University), Tanja Jörgensen (Dalarna University) and Rikard Lindell (Mälardalen University)
Using a research-through-design approach, this paper describes how listeners' experience is affected by headphone listening to a music composition that includes inner-city sound while the listener is in an inner-city environment. The study focuses on the listeners' described experiences through the lens of Berleant's aesthetic sensibility and Bull's phenomenon of the auditory bubble. We produced a composition that participants listened to in an urban context, and we discuss the two main themes found, soundtrack and awareness, together with indications that it is possible to direct listeners' attention and level of immersion by including inner-city ambience and sound in music heard through headphones in an urban environment.
An Interactive Tool for Exploring Score-Aligned Performances: Opportunities for Enhanced Music Engagement
Caitlin Sales, Peiyi Wang and Yucong Jiang (University of Richmond)
Music scholars and enthusiasts have long been engaged with both performance recordings and musical scores, but inconveniently, these two closely connected mediums are usually stored separately. Currently, digital music libraries tend to have fairly traditional user interfaces for browsing music recordings, and more importantly, performance recordings are organized separately from their musical scores. In recent years, however, the same technological advances that have made vast troves of sound recordings and musical scores more widely available have also created tremendous potential for innovative new interfaces that can facilitate enhanced engagement with the music. In this paper, we present a web-based prototype tool that allows users to navigate classical piano recordings interactively.
Examining the Correlation Between Dance and Electroacoustic Music Phrases: A Pilot Study
Andreas Bergsland (Norwegian University of Science and Technology) and Sanket Rajeev Sabharwal (University of Genoa)
In this paper, we present a pilot study that explores the relationship between music and movement in dance phrases spontaneously choreographed to follow phrases of electroacoustic music. Motion capture recordings of the dance phrases were analyzed to obtain measures of contraction-expansion and kinematic features, and the temporal locations of the peaks of these measures were subsequently compared with the peaks of a set of audio features extracted from the musical phrases. The analyses suggest that the dancers variably accentuated their movements in relation to peaks or accents in the music. The paper discusses the findings as well as possible improvements to the research design for further studies.
Paper session 2 & Demo sessions - Music Information Retrieval (MIR)
Wednesday 30th August, 2 pm – Lindsay Stuart Lecture Theatre
Session Chair, KC Collins (Carleton University)
MMM Duet System: New accessible musical technology for people living with dementia
Justin Christensen (University of Sheffield), Shawn Kauenhofen, Janeen Loehr, Jennifer Lang, Shelley Peacock and Jennifer Nicol (University of Saskatchewan)
Music offers a meaningful way for people living with dementia to interact with others and can provide health and wellbeing benefits. Enjoying shared activities helps couples affected by dementia retain a sense of couplehood and can support a spousal caregiver's mental health. This paper describes the development of the Music Memory Makers (MMM) Duet System, a prototype developed as part of a qualitative, multi-phase, iterative research study to test its feasibility for use with people living with dementia and their spousal caregivers. Through the iterative process, the diverse individual needs of the participants directly led to the addition, adjustment, or removal of features and components to better fit those needs and to make the system require as little technical experience as possible for quick and easy engagement. Our development of system hardware and software to meet users' needs included 3D-printed cases, coordination facilitation processes, a visual interface, and source separation tools to create familiar duets; participants found the duet system offered them an opportunity to enjoyably interact with one another by playing meaningful songs together.
How reliable are posterior class probabilities in automatic music classification?
Hanna Lukashevich (IDMT), Sascha Grollmisch (IDMT), Jakob Abeßer (IDMT), Sebastian Stober (University Magdeburg) and Joachim Bös (IDMT)
Music classification algorithms use signal processing and machine learning approaches to extract and enrich metadata for audio recordings in music archives. Common tasks include music genre classification, where each song is assigned a single label (such as Rock, Pop, or Jazz), and musical instrument recognition. Since music metadata can be ambiguous, classification algorithms cannot always achieve fully accurate predictions. Therefore, our focus extends beyond the correctly estimated class labels to include realistic confidence values for each potential genre or instrument label. In practice, many state-of-the-art classification algorithms based on deep neural networks exhibit overconfident predictions, complicating the interpretation of final output values. In this work, we examine whether the issue of overconfident predictions and, consequently, non-representative confidence values, is also relevant to music genre classification and musical instrument recognition. Moreover, we describe cutting-edge techniques to mitigate this behavior and assess the impact of deep ensembles and temperature scaling in generating more realistic confidence outputs, which can be directly employed in real-world music tagging applications.
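As general background for readers, temperature scaling (one of the calibration techniques the abstract evaluates) divides a network's logits by a scalar temperature, tuned on held-out data, before the softmax; values above 1 soften overconfident predictions without changing the predicted class. A minimal sketch, not taken from the paper, using hypothetical genre logits:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; temperature > 1 softens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one clip over three genre classes (Rock, Pop, Jazz)
logits = [4.0, 1.0, 0.5]
raw = softmax(logits)                          # overconfident network output
calibrated = softmax(logits, temperature=2.5)  # softened, better-calibrated
```

Because a single temperature rescales all logits uniformly, the argmax class is unchanged; only the confidence attached to it becomes more realistic. Deep ensembles, by contrast, average the softmax outputs of several independently trained networks.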
An Empirical Study on the Effectiveness of Feature Selection and Ensemble Learning Techniques for Music Genre Classification
Raad Shariat and John Zhang (University of Lethbridge)
Classical machine learning has long been utilized for classification and regression tasks, primarily focusing on tabular data or handcrafted features derived from various data modalities, such as music signals. Music Information Retrieval (MIR) is an emerging field that seeks to automate the management process of musical data. This paper explores the potential of employing ensemble learning techniques to enhance classification performance while assessing the impact of feature selection methods on accuracy and computational efficiency across three publicly available datasets: Spotify, TCC_CED, and GTZAN. The Spotify and TCC_CED datasets contain high-level musical features, such as energy, key, and duration, while the GTZAN dataset incorporates low-level acoustic features extracted from audio recordings. The empirical experiments and qualitative analysis reveal a significant performance improvement when employing ensemble learning techniques for handling high-level features. Furthermore, the findings suggest that applying appropriate feature selection methods can substantially reduce computational time. As a result, by strategically combining optimal feature selection and classification models, the performance can be boosted in terms of accuracy and computational time. This study provides insights for optimizing music genre classification tasks through the strategic selection and balancing of model performance, ensemble learning techniques, and feature selection methods, ultimately contributing to advancements of musical genre classification tasks in MIR.
Kiroll: A Gaze-Based Instrument for Quadriplegic Musicians Based on the Context-Switching Paradigm
Nicola Davanzo, Luca Valente, Luca Andrea Ludovico and Federico Avanzini (Università degli Studi di Milano)
In recent years, Accessible Digital Musical Instruments (ADMIs) designed for motor-impaired individuals that incorporate gaze-tracking technologies have become more prevalent. To ensure a reliable user experience and minimize delays between actions and sound production, interaction methods must be carefully studied. This paper presents Kiroll, an affordable and open-source software ADMI specifically designed for quadriplegic users. Kiroll can be played by motor-impaired users through eye gaze for note selection and breath for sound control. The interface features the "infinite keyboards" context-switching interaction method, which exploits the smooth-pursuit capabilities of the human eye to provide an indefinitely scrolling layout, so as to resolve the Midas Touch issue typical of gaze-based interaction. This paper outlines Kiroll's interaction paradigm, features, implementation processes, and design approach.
Paper session 3 - Sonification
Thursday 31st August, 9:30 am – Lindsay Stuart Lecture Theatre
Session Chair, John McGowan (Edinburgh Napier University)
Tuning Shared Hospital Spaces: Sound Zones in Healthcare
Kasper Fangel Skov, Peter Axel Nielsen and Jesper Kjeldskov (Aalborg University)
The problem of noise in hospitals is commonly tackled through noise abatement practices, which treat 'quietness' as a quality indicator. However, the influence of positive or negative subjective reactions to these sounds is rarely examined. Recent efforts emphasize the importance of considering the benefits of wanted sound while minimizing unwanted noise to reach a positive healthcare soundscape. The authors identified sound zones in shared hospital spaces as a means to achieve this through sound separation, noise masking, and designed sound zone content. Listening evaluations were conducted to assess the subjective responses of individuals hearing a hospital soundscape across a variety of sound zone interventions. The authors conclude that sound zone interventions in shared hospital spaces offer subjective benefits that move beyond noise reduction. As an area for future work, sound zone interventions will be deployed in hospital settings to study potential long-term restorative effects on patients and better working conditions for staff.
Using design dimensions to develop a multi-device audio experience through workshops and prototyping
David Geary (University of York), Jon Francombe (Bang & Olufsen), Kristian Hentschel (BBC R&D) and Damian Murphy (University of York)
Designing audio experiences for heterogeneous arrays of multiple devices is challenging, and researchers have tried to identify useful design practices. A set of design dimensions has been proposed, providing researchers and creative practitioners with a framework for understanding the different design considerations for multi-device audio; however, they have yet to be used for scoping and developing a new experience. This work investigates the utility of the design dimensions for exploring and prototyping new multi-device audio experiences. Three workshops were conducted with audio professionals to see how the design dimensions could be used to form new ideas. Using the resulting ideas, a multi-device audio system combining loudspeakers and earbuds, and an experience based on that system, were created and demonstrated. The design dimensions were found to be useful for understanding multi-device audio experiences and for quickly forming new ideas. In addition, the dimensions were a helpful reference during experience development for testing different design choices, particularly for audio allocation.
An Interactive Modular System for Electrophysiological DMIs
Francesco Di Maggio (MSH Paris Nord), Atau Tanaka (MSH Paris Nord), David Fierro (CICM - Université Paris 8) and Stephen Whitmarsh (Sorbonne Université)
We present an interactive modular system built in Cycling ‘74 Max and interfaced with Grame’s FAUST for the purpose of analyzing, processing and mapping electrophysiological signals to sound. The system architecture combines an understanding of domain-specific (biophysiological) signal processing techniques with a flexible, modular and user-friendly interface. We explain our design process and decisions towards artistic usability, while maintaining a clear electrophysiological data flow. The system allows users to customize and experiment with different configurations of sensors, signal processing and sound synthesis algorithms, and has been tested in a range of different musical settings from user studies to concerts with a diverse range of musicians.
The Air Listening Station: Bridging the gap between Sound Art and Sonification
Eric Larrieux and Mélia Roger (Zurich University of the Arts)
When developing auditory display systems, one must balance the tendency for sonification algorithms to produce potentially informative, but less engaging, direct representations of data, with more aesthetically pleasing transformations where the underlying data is prone to obfuscation. In a scientific communication context, the successful navigation of this continuum becomes increasingly critical. As such, we take air quality data as a vehicle to explore this concept, with the ultimate goal of raising awareness of declining air quality in modern urban landscapes in order to drive societal change in response. Employing an aesthetically driven, artistic practice-based approach, we transform field recordings into an ever-evolving soundscape using generative music and algorithmic composition methods. Specifically, we present a novel, real-time granular synthesis-based sonification method that draws upon auditory icon, parameter mapping, and model-based sonification concepts, to create an output that invites an emotional connection with the underlying data. Finally, we discuss the design implications and constraints of this approach, before challenging some fundamental assumptions and conventions of modern sonification practice, while advocating for a tighter integration between the worlds of traditional sonification and sound art.
Reflecting on qualitative and quantitative data to frame criteria for effective sonification design
Katharina Groß-Vogt (University of Music and Performing Arts Graz), Kajetan Enge (St. Pölten University of Applied Sciences) and Iohannes Zmölnig (University of Music and Performing Arts Graz)
A subjective sense of stagnation in the field of sonification research has been discussed; at the same time, sonification has spread in simpler forms. We present a dataset from Google Scholar that provides insights into the state of sonification research. Based on these data, the literature, and a small expert poll, we propose criteria for effective sonification design: the use of easily perceptible sounds that are mapped naturally, do not contradict the data metaphor, and are appropriate to the task. A quantitative analysis of the data found no correlation between effective sonifications and the number of citations or the year of publication.
The Sound of the Future Home Workshop: Ideating Sonic Prototypes for Sustainable Energy Consumption
Yann Seznec, Sandra Pauletto, Cristian Bogdan and Elina Eriksson (KTH Royal Institute of Technology)
This paper describes an ideation workshop aiming to explore the intersection of sonic interactions and energy use. As part of a larger research project exploring the role that sound can play in efficient energy behaviours, the workshop encouraged users to look for overlaps between their home resource use, potential sonic feedback, and the feelings and emotions elicited by both. The workshop design was successful in providing non-experts with space and tools to reflect on the complex relationship between household, sound, energy, and our feelings towards them. On a more practical level, 15 "hotspots" were identified where sound and energy concerns could potentially be addressed with sonic interventions, and four speculative prototypes were developed during the workshop, each revealing original considerations and relationships between sound and energy to be developed further in future work.
Towards a Framework of Aesthetics in Sonic Interaction
Stuart Cunningham (University of Chester), Iain McGregor (Edinburgh Napier University), Jonathan Weinel (University of Greenwich), John Darby (Manchester Metropolitan University) and Tony Stockman (Queen Mary University of London)
As interaction design has advanced, increased attention has been directed to the role that aesthetics play in shaping factors of user experience. Historically stemming from philosophy and the arts, aesthetics in interaction design has gravitated towards visual aspects of interface design thus far, with sonic aesthetics being underrepresented. This article defines and describes key dimensions of sonic aesthetics by drawing upon the literature and the authors' experiences as practitioners and researchers. A framework is presented for discussion and evaluation, which incorporates aspects of classical and expressive aesthetics. These aspects of aesthetics are linked to low-level audio features, contextual factors, and user-centred experiences. It is intended that this initial framework will serve as a lens for the design, and appraisal, of sounds in interaction scenarios and that it can be iterated upon in the future through experience and empirical research.
Paper session 4 - Artificial Intelligence (AI) and Machine Learning (ML)
Thursday 31st August, 2 pm – Lindsay Stuart Lecture Theatre
Session Chair, Stuart Cunningham (University of Chester)
FM Tone Transfer with Envelope Learning
Franco Santiago Caspe (Queen Mary University of London), Andrew McPherson (Imperial College London) and Mark Sandler (Queen Mary University of London)
Tone Transfer is a novel deep-learning technique for interfacing a sound source with a synthesizer, transforming the timbre of audio excerpts while keeping their musical form and content. Owing to its good audio quality and continuous controllability, it has recently been applied in several audio processing tools. Nevertheless, it still presents several shortcomings related to poor sound diversity and limited transient and dynamic rendering, which we believe hinder its possibilities of articulation and phrasing in a real-time performance context. In this work, we present a discussion of current Tone Transfer architectures for the task of controlling synthetic audio with musical instruments and discuss their challenges in allowing expressive performances. Next, we introduce Envelope Learning, a novel method for designing Tone Transfer architectures that map musical events using a training objective at the synthesis parameter level. Our technique can render note beginnings and endings accurately and for a variety of sounds; these are essential steps for improving musical articulation, phrasing, and sound diversity with Tone Transfer. Finally, we implement a VST plugin for real-time live use and discuss possibilities for improvement.
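To illustrate what control "at the synthesis parameter level" can mean in an FM context, the sketch below shapes a two-operator FM tone with amplitude and modulation-index envelopes, so that note beginnings and endings are defined directly by parameter trajectories. This is a generic illustration under assumed envelope shapes, not the authors' learned model:

```python
import math

def linear_env(n, points):
    """Piecewise-linear envelope over n samples; points = [(fraction, value), ...]."""
    out = []
    for i in range(n):
        t = i / (n - 1)
        for (t0, v0), (t1, v1) in zip(points, points[1:]):
            if t0 <= t <= t1:
                out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
                break
    return out

def fm_note(freq, duration, sr=44100, ratio=2.0):
    """Two-operator FM tone whose amplitude and modulation index follow envelopes,
    shaping the attack and release transients at the parameter level."""
    n = int(duration * sr)
    amp = linear_env(n, [(0.0, 0.0), (0.05, 1.0), (0.8, 0.6), (1.0, 0.0)])
    index = linear_env(n, [(0.0, 4.0), (0.3, 1.0), (1.0, 0.5)])  # bright attack
    phase_m = phase_c = 0.0
    out = []
    for i in range(n):
        phase_m += 2 * math.pi * freq * ratio / sr
        phase_c += 2 * math.pi * freq / sr
        out.append(amp[i] * math.sin(phase_c + index[i] * math.sin(phase_m)))
    return out
```

A model trained at this level predicts envelopes such as `amp` and `index` from the input signal, rather than predicting audio samples directly.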
A Free Verbalization Method of Evaluating Sound Design: The Effectiveness of Artificially Intelligent Natural Language Processing Methods and Tools
KC Collins and Hannah Johnston (Carleton University)
Research on sound design evaluation methodologies relating to connotation, or the evocation of mental imagery, is limited. Prior tools for data analysis have fallen short, making the process time-consuming and difficult. We explore here a variety of new AI-powered Natural Language Processing tools to evaluate the data. Results showed that free verbalization is a fruitful method for answering some research questions about sound, giving rise to many interesting insights and leading to further research questions.
Supervised Contrastive Learning For Musical Onset Detection
James Bolt and György Fazekas (Queen Mary University of London)
This paper applies supervised contrastive learning to musical onset detection to alleviate the issue of noisily annotated data in onset datasets. The results are compared against a state-of-the-art convolutional cross-entropy model. Both models were trained on two datasets: the first comprised a manually annotated selection of music, and this data was then augmented with inaccurate labelling to produce the second dataset. When trained on the original data, the supervised contrastive model produced an F1 score of 0.878, close to the cross-entropy model's score of 0.888. This shows that supervised contrastive loss is applicable to onset detection but does not outperform cross-entropy models in an ideal training case. When trained on the augmented set, the contrastive model consistently outperformed the cross-entropy model across increasing percentage inaccuracies, with a difference in F1 score of 0.1 for the most inaccurate data. This demonstrates the robustness of supervised contrastive learning to inaccurate data for onset detection, suggesting that it could provide a new onset detection architecture that is invariant to noise in the data or inaccuracies in labelling.
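As background, the supervised contrastive loss pulls together embeddings that share a label (e.g. onset frames) and pushes apart all others. A minimal reference implementation of the standard formulation, independent of the paper's model, over L2-normalised embeddings:

```python
import math

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss: for each anchor, average the negative
    log-probability of its same-label positives against all other samples."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def normalise(v):
        n = math.sqrt(dot(v, v))
        return [x / n for x in v]

    z = [normalise(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor with no positives contributes nothing
        denom = sum(math.exp(dot(z[i], z[a]) / temperature)
                    for a in range(n) if a != i)
        loss_i = -sum(math.log(math.exp(dot(z[i], z[p]) / temperature) / denom)
                      for p in positives) / len(positives)
        total += loss_i
        anchors += 1
    return total / anchors
```

Embeddings that cluster by label yield a lower loss than embeddings where same-label samples are far apart, which is why a downstream classifier trained on such embeddings can tolerate some label noise.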
Onset Detection for String Instruments Using Bidirectional Temporal and Convolutional Recurrent Networks
Maciej Tomczak and Jason Hockman (Birmingham City University)
Recent work in note onset detection has centered on deep learning models such as recurrent neural networks (RNN), convolutional neural networks (CNN) and more recently temporal convolutional networks (TCN), which achieve high evaluation accuracies for onsets characterized by clear, well-defined transients, as found in percussive instruments. However, onsets with less transient presence, as found in string instrument recordings, still pose a relatively difficult challenge for state-of-the-art algorithms. This challenge is further exacerbated by a paucity of string instrument data containing expert annotations. In this paper, we propose two new models for onset detection using bidirectional temporal and recurrent convolutional networks, which generalise to polyphonic signals and string instruments. We perform evaluations of the proposed methods alongside state-of-the-art algorithms for onset detection on a benchmark dataset from the MIR community, as well as on a test set from a newly proposed dataset of string instrument recordings with note onset annotations, comprising approximately 40 minutes and over 8,000 annotated onsets with varied expressive playing styles. The results demonstrate the effectiveness of both presented models, as they outperform the state-of-the-art algorithms on string recordings while maintaining comparative performance on other types of music.
A Plugin for Neural Audio Synthesis of Impact Sound Effects
Zih Syuan Yang and Jason Hockman (Birmingham City University)
The term impact sound, as referred to in this paper, can be broadly defined as the sudden burst of short-lasting impulsive noise generated by the collision of two objects. This type of sound effect is prevalent in multimedia productions. However, conventional methods of sourcing these materials are often costly in time and resources. This paper explores the potential of neural audio synthesis for generating realistic impact sound effects, targeted for use in multimedia such as films, games, and AR/VR. The designed system uses a Realtime Audio Variational autoEncoder (RAVE) model trained on a dataset of over 3,000 impact sound samples for inference in a Digital Audio Workstation (DAW), with latent representations exposed as user controls. The performance of the trained model is assessed using various objective evaluation metrics, revealing both the prospects and limitations of this approach. The results and contributions of this paper are discussed, with audio examples and source code made available.
Paper session 5 - Spatial Audio
Friday 1st September, 9:30 am – Lindsay Stuart Lecture Theatre
Session Chair, Emma Margetson (University of Greenwich)
Impact of an audio-haptic strap to augment immersion in VR video gaming: a pilot study
Antoine Bourachot (Cedric - CNAM), Tifanie Bouchara (LISN, U. Paris Saclay, CNRS) and Olivier Cornet (G4F)
With the development of consumer haptic devices that transform audio signals into vibrations, the question arises of their capacity to further immerse users or players. This study aims to evaluate how haptic feedback associated with audio reinforces immersion in a virtual space, more specifically in VR video games. A preliminary study was carried out with a haptic strap, the Woojer Strap Edge. 17 participants played two VR shooting games, with and without haptic feedback, and answered questionnaires between each session. A post-hoc questionnaire gathered free feedback from the participants. Results show no significant differences between the conditions with and without haptic feedback in the between-session questionnaires; however, the final questionnaire reveals very strong inter-subject variability in the perception and appreciation of haptic feedback.
Audiodice: an open hardware design of a distributed dodecahedron loudspeaker orchestra
Nicolas Bouillot (Lab148, CIRMMT and SAT), Thomas Piquet and Pierre Gilbert (Society for arts and technology)
We present a new speaker array composed of five spherical speakers with 12 independent channels each. The prototype is open source, and its design choices are motivated here. It is designed to be a flexible device allowing a wide range of use cases, as described in more detail in the paper: simultaneous rendering with surround speaker arrays, artistic installations, and acoustical measurements. The sources in the repository include filter impulse responses for frequency-response correction. The measurement methodology, based on sine sweeps, is documented and allows the reader to reproduce the measurement and correction. Finally, the paper describes several use cases for which feedback is provided, and demonstrates the versatility, mobility, and ease of deployment provided by our proposed implementation.
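Sine-sweep measurement of this kind is commonly done with an exponential sweep and its inverse filter (Farina's method); convolving the recorded response with the inverse filter yields the impulse response used for correction. The sketch below is general background under that assumption, not necessarily the exact procedure in the repository:

```python
import math

def exp_sine_sweep(f1, f2, duration, sample_rate=48000):
    """Exponential (logarithmic) sine sweep from f1 to f2 Hz."""
    n = int(duration * sample_rate)
    rate = math.log(f2 / f1)  # log of the frequency ratio
    return [math.sin(2 * math.pi * f1 * duration / rate *
                     (math.exp((i / n) * rate) - 1.0)) for i in range(n)]

def inverse_filter(sweep, f1, f2):
    """Time-reversed sweep with a -6 dB/octave amplitude envelope; convolving
    the recorded response with this yields the measured impulse response."""
    n = len(sweep)
    rate = math.log(f2 / f1)
    return [sweep[n - 1 - i] * math.exp(-(n - 1 - i) / n * rate)
            for i in range(n)]
```

The amplitude envelope in the inverse filter compensates for the sweep spending more time (and thus more energy) at low frequencies.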
Invoke: A Collaborative Virtual Reality Tool for Spatial Audio Production Using Voice-Based Trajectory Sketching
Thomas Deacon (University of Surrey) and Mathieu Barthet (Queen Mary University of London)
VR could transform creative engagement with spatial audio, given affordances for spatial visualisation and embodied interaction. But, issues exist addressing how to support collaboration for spatial audio production (SAP). Exploring this problem, we made a VR voice-based trajectory sketching tool, named Invoke, that allows two users to shape sonic ideas together. In this paper, thematic analysis is used to review two areas of a formative evaluation with expert users: (i) video analysis of VR interactions; and (ii) analysis of open questions about using the tool. Implications present new opportunities to explore co-creative VR tools for SAP.
Rhythmic Accuracy of 3D Spatial Interaction for Digital Music
Timothy Arterbury, G Michael Poor and Ana Arguelles (Baylor University)
This paper describes and evaluates the use of 3D spatial in-air body movement interaction for human control of music software. This technique was implemented, prior to this work, in an input device prototype, MoveMIDI, which allows users to initiate rhythmic musical events by hitting zones of 3D geometry in a virtual environment using position-tracked motion controllers. This work evaluates MoveMIDI’s spatial interaction strategy for music in a usability study measuring timing accuracy of participants performing rhythms using MoveMIDI in comparison to two other input devices. The study revealed spatial unsureness of participants using MoveMIDI due to visualization and haptic shortcomings. While results for the MoveMIDI prototype are not positive, points of improvement are revealed, and our methodology provides a novel comparison for input devices in the context of rhythmic performance accuracy.
FASS: Firefighter Audio Safety Systems
Alan Elliot (Seneca College) and Iain McGregor (Edinburgh Napier University)
A series of auditory cues was designed to assist firefighters with navigation and general safety in a fire emergency. Firefighters must maintain situational awareness at all times, and this can be lost through disorientation, one of the main causes of injury and even death. Disorientation can be caused by vision restricted by heavy smoke, a lack of familiarity with the surroundings, and the hearing and communication difficulties caused by the intensity of fireground sounds. Five professional firefighters were interviewed to identify ways in which auditory affordances could support their work. Existing sounds, both from the emergency environment and generated by firefighting equipment, were assessed to determine their importance in maintaining situational awareness. Noise-reduction technology was investigated to assess its potential for limiting the levels of noise exposure experienced. A series of auditory cues using binaural spatialization and augmented reality methods was then designed to address the issues identified, and a prototype system was presented to firefighters to determine its effectiveness. The firefighters found that noise reduction would improve their situational awareness and their ability to communicate effectively, and that spatially placed auditory cues had the potential to support navigation and orientation in a fire emergency. The findings suggest that the use of noise reduction and auditory affordances has the potential to improve situational awareness for firefighters, increase safety, and potentially save lives.
Barriers for Domestic Sound Zone Systems: Insights from a Four-Week Field Study
Rune Møberg Jacobsen, Kasper Fangel Skov, Mikael B. Skov and Jesper Kjeldskov (Aalborg University)
The term impact sound, as used in this paper, can be broadly defined as a sudden burst of short-lasting impulsive noise generated by the collision of two objects. This type of sound effect is prevalent in multimedia productions; however, conventional methods of sourcing these materials are often costly in time and resources. This paper explores the potential of neural audio synthesis for generating realistic impact sound effects, targeted at multimedia such as films, games, and AR/VR. The designed system uses a Realtime Audio Variational autoEncoder (RAVE) model, trained on a dataset of over 3,000 impact sound samples, for inference in a Digital Audio Workstation (DAW), with latent representations exposed as user controls. The performance of the trained model is assessed using various objective evaluation metrics, revealing both the prospects and the limitations of this approach. The results and contributions of this paper are discussed, with audio examples and source code made available.
Study of auditory trajectories in virtual environments
Juan Camilo Arevalo Arboleda and Julián Villegas (University of Aizu)
A tool for studying the apparent trajectories evoked by sounds (auditory trajectories) is presented. It is built with the aim of easing the task of experimenters (building and analyzing interventions) and of participants (reporting their opinions). Using infrared-tracked controllers in a virtual reality environment, participants can freely describe the three-dimensional path evoked by a stimulus. The tool also assists participants in recording trajectories by providing additional visual cues and feedback on the recorded data. A mock-up study is presented to demonstrate the benefits of the proposed system; its results show that participants are able to accurately report elicited trajectories. While the tool has limitations, such as the number of available blocks (only practice and main blocks), it could cover the needs of several laboratories, and it is a valuable resource for researchers seeking to explore the perception and processing of auditory stimuli.