2009
DOI: 10.1007/s00422-009-0299-4

Categorization of environmental sounds

Abstract: Sounds in the natural environment are non-stationary, in that their spectral dynamics is time-dependent. We develop measures to analyze the spectral dynamics of environmental sound signals and find that they fall into two categories: simple sounds with slowly varying spectral dynamics and complex sounds with rapidly varying spectral dynamics. Based on our results and those from auditory processing, we suggest rate of spectral dynamics as a possible scheme to categorize sound signals in the environment.

citations
Cited by 23 publications
(33 citation statements)
references
References 17 publications
“…Using a measure of change in entropy over time (an index of spectral dynamics or rate of change of spectral structure), the animal vocalizations showed on average a greater degree of SSV measures (vocalizations = 3.21, action sounds = 1.47; two-tailed t-test, p < 0.01). Each of the above signal attributes were assessed and examined here because they have been implicated in auditory stream segregation and auditory object perception in earlier studies (Lewis et al, 2009, 2012; Reddy et al, 2009). We also assessed potential category-level differences in modulation power spectra (MPS), using techniques that have revealed spectro-temporal attributes of sound that may be processed along distinct pathways in auditory cortices in humans and macaques (Woolley et al, 2005; Cohen et al, 2007; Elliott and Theunissen, 2009; Kuśmierek et al, 2012; Herdener et al, 2013; Santoro et al, 2014).…”
Section: Methods (mentioning)
confidence: 99%
“…In vision, textures represent intermediate-level feature attributes that the system can use to define salient object boundaries or to fill in surfaces of perceived visual objects (Reppas et al, 1997; Kastner et al, 2000). Another form of higher order acoustic signal attribute related to textures includes spectral structure variation (SSV) measures, which quantifies changes in signal entropy over time (Reddy et al, 2009). Auditory object-like sounds tend to show relatively greater measures of SSV and lower mean entropy levels than scene-like sounds (Lewis et al, 2012), and an fMRI study indicated that the activation along the anterior STG regions for object-like sounds (Fig.…”
Section: Bottom-up Perspectives Of Vision and Hearing Models (mentioning)
confidence: 99%
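
The statement above describes spectral structure variation (SSV) as a measure of how signal entropy changes over time. As a rough illustration of that idea only, the sketch below computes the Shannon entropy of each short-time power spectrum and takes the standard deviation of that entropy track. This is not the measure defined by Reddy et al. (2009); the function names, the frame length, the hop size, and the choice of standard deviation as the variation statistic are all assumptions made for the example.

```python
import numpy as np

def spectral_entropy_track(signal, frame_len=256, hop=128):
    """Shannon entropy (bits) of the normalized power spectrum of each frame.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    entropies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Normalize the power spectrum so it can be treated as a distribution.
        p = power / (power.sum() + 1e-12)
        entropies[i] = -np.sum(p * np.log2(p + 1e-12))
    return entropies

def entropy_variation(signal, frame_len=256, hop=128):
    """Hypothetical SSV-like index: spread of spectral entropy over time."""
    return float(np.std(spectral_entropy_track(signal, frame_len, hop)))

if __name__ == "__main__":
    fs = 8000  # assumed sampling rate in Hz
    t = np.arange(fs) / fs
    steady = np.sin(2 * np.pi * 1000 * t)  # spectrum does not change over time
    rng = np.random.default_rng(0)
    changing = steady.copy()
    changing[fs // 2 :] = rng.standard_normal(fs - fs // 2)  # tone, then noise
    print("steady tone:", entropy_variation(steady))
    print("tone then noise:", entropy_variation(changing))
```

Under these assumptions, a sound whose spectrum stays fixed yields a nearly flat entropy track and a small index, while a sound that switches between tonal and noisy segments yields a larger one.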
“…Numerous “simple” acoustic signal attributes are known, or thought, to be represented in early cortical processing stages, including the filtering or extraction of signal features such as bandwidths, spectral shapes, onsets, and harmonic relationships, which together have a critical role in auditory stream segregation and formation, clustering operations, and sound organization (Medvedev et al, 2002, Nelken, 2004, Kumar et al, 2007, Elhilali and Shamma, 2008, Woods et al, 2010). Later stages are thought to represent processing that segregates spectro-temporal patterns associated with complex sounds, including the processing of acoustic textures, location cues, prelinguistic analysis of speech sounds (Griffiths and Warren, 2002, Obleser et al, 2007, Overath et al, 2010), and representations of auditory objects defined by their entropy and spectral structure variation (Reddy et al, 2009, Lewis et al, 2012). Subsequent cortical processing pathways, such as projections between posterior portions of the superior temporal gyri (STG) and STS, may integrate corresponding acoustic streams over longer time frames (Maeder et al, 2001, Zatorre et al, 2004, Griffiths et al, 2007, Leech et al, 2009, Goll et al, 2011, Teki et al, 2011), involving or leading to processing that may provide a greater sense of semantic meaning to the listener.…”
Section: Low-level Acoustic Signal Processing Of Vocalizations (mentioning)
confidence: 99%