2016
DOI: 10.17743/jaes.2016.0007

Categorization of Broadcast Audio Objects in Complex Auditory Scenes

Abstract: This paper presents a series of experiments to determine a categorization framework for broadcast audio objects. Object-based audio is becoming an ever more important paradigm for the representation of complex sound scenes. However, there is a lack of knowledge regarding object-level perception and cognitive processing of complex broadcast audio scenes. As categorization is a fundamental strategy in reducing cognitive load, knowledge of the categories utilized by listeners in the perception of complex scenes wi…

Cited by 13 publications (12 citation statements)
References 25 publications (30 reference statements)

“…A five-cluster solution was chosen as this is the median number of groups that participants formed. The labels associated with each of the resulting clusters were interpreted by the researcher (see [26] for more details on this process). A dendrogram showing the resulting category labels is shown in Fig.…”
Section: Methods (mentioning)
confidence: 99%
“…This suggests that a taxonomy of objects based on narrative importance and producer intent could generate more intelligent personalized audio than is possible using binary dialogue/nondialogue definitions. Elicitation tests were carried out as part of the S3A project [25,26] in order to better understand audio object perception and the results of these tests have been utilized in the research documented in this paper.…”
Section: Object-based Audio (mentioning)
confidence: 99%
“…In the context of television audio for hearing-impaired users, Shirley and Oldfield [22] propose three categories of audio objects: speech content whose comprehension is critical, background noise that has been shown to be detrimental to both clarity and to perceived overall sound quality, and other non-speech sounds that are considered important to comprehension and/or enjoyment of the material. In a more complex categorisation of broadcast audio objects, Woodcock et al [27] used hierarchical agglomerative clustering to identify seven general categories, which relate to sounds indicating actions and movement, continuous and transient background sound, clear speech, non-diegetic music and effects, sounds indicating the presence of people, and prominent attention grabbing transient sounds. In the studies presented in this paper, a simple background/foreground categorisation is used; foreground objects are important to the narrative and generally localisable whereas background objects are non-critical to the narrative and generally more diffuse.…”
Section: Object-based Audio (mentioning)
confidence: 99%
“…It has been highlighted that the potentially large number of audio objects in a television program, and the fact that OBA allows hypothetical control over all objects, means that a better understanding of the role of these objects and how they can be grouped is required [6]. Work by Woodcock et al [125] has investigated how people cognitively categorize different parts of broadcast audio for a range of program material. They found that at least seven categories were perceived: continuous and transient background sound, clear speech, non-diegetic music and effects, sounds indicating the presence of people, sounds indicating actions and movement, and prominent attention-grabbing transient sounds.…”
Section: Audio Personalization For Accessibility (mentioning)
confidence: 99%