Object-based audio is an emerging representation for audio content, where content is represented in a reproduction-format-agnostic way and thus produced once for consumption on many different kinds of devices. This affords new opportunities for immersive, personalized, and interactive listening experiences. This article introduces an end-to-end object-based spatial audio pipeline, from sound recording to listening. A high-level system architecture is proposed, which includes novel audiovisual interfaces to support object-based capture and listener-tracked rendering, and incorporates a proposed component for objectification, i.e., recording content directly into an object-based form. Text-based and extensible metadata enable communication between the system components. An open architecture for object rendering is also proposed. The system's capabilities are evaluated in two parts. First, listener-tracked reproduction of metadata automatically estimated from two moving talkers is evaluated using an objective binaural localization model. Second, object-based scene capture with audio extracted using blind source separation (to remix between two talkers) and beamforming (to remix a recording of a jazz group) is evaluated with perceptually motivated objective and subjective experiments. These experiments demonstrate that the novel components of the system add capabilities beyond the state of the art. Finally, we discuss challenges and future perspectives for object-based audio workflows.
When validating systems that use headphones to synthesize virtual sound sources, a direct comparison between virtual and real sources is sometimes needed. This paper considers the passive influence of headphones on the sound transmission and perception of external loudspeaker sources, for which physical measurements and behavioral data have been obtained. Physical measurements of the effect of a number of headphone models are given and analyzed using an auditory filter bank and binaural cue extraction. The measurements showed that all of the headphones affected localization cues and that repositioning the headphones had a measurable effect. A localization test was undertaken using one of the best performing headphones from the measurements. It was found that the presence of the headphones caused a small increase in localization error and that the process of judging source location was different, highlighting a possible increase in the complexity of the localization task.

INTRODUCTION

The use of binaural rendering is popular in a number of audio applications, from hearing research [1-3] to entertainment [4,5]. In each application, the specific requirements for the performance of a binaural system will be slightly different, although the general aim is to induce the perception of intended auditory events as accurately as possible. Designing an assessment methodology that validates a binaural system within its intended application is often a difficult task. A common metric for a binaural system is the ability to produce a virtual sound source that is indistinguishable from a real sound source. Indirect comparisons have been investigated, for example, by Minnaar et al. [6] and Møller et al. [7,8], in which non-dynamic binaural simulation and real loudspeaker localization tasks were considered in separate experiments.
However, for direct comparisons where real and virtual loudspeakers are presented simultaneously, the validation of headphone-based binaural systems against a real loudspeaker reference can be problematic. The listener must wear the headphones throughout the experiment, which will affect the sound transmission from the external loudspeakers. A number of discrimination studies have involved direct comparison of real sources with headphone-delivered virtual sources [9-13] as well as some recent localization tests [14,15] and loudness equalization studies [16,17]. The passive use of headphones may have a significant effect on the perception of the external loudspeaker and therefore cause an unknown and possibly directionally dependent bias. Hartmann and Wittenberg [10] noted that wearing headphones appeared to affect the listeners' ability to distinguish between front and back, although they also state that they were not aware of its effect on experiments in the azimuthal plane. To highlight the importance of the problem, Erbes et al. [18] presented work on the development of an advanced headphone system specifically for the field of binaural reproduction.

This study investigates whether headphones mounted on a listener will ...
This essay addresses two questions on the topic of podcast innovation. The first, ‘What is a podcast?’, is answered via a review of the literature, investigating podcasting history and its evolution. The definition of podcasting arising from this analysis – centring on episodic audio, convenient both to produce and experience – takes into account recent changes, providing an up-to-date description of the term, useful for further research on the topic. It is also required to answer our second question: ‘How do we design new ways to produce and listen to podcasts without denaturing the medium?’ By reflecting on the essential features of podcasting and the necessity for innovation in this interdisciplinary medium, a framework of six tensions is proposed as a means of grounding and potentially boosting innovation. Answering these questions could prove valuable for the future of podcasting, hypothesising a basis for reflection and development in both academia and industry.