1999
DOI: 10.1109/72.761727
Separation of speech from interfering sounds based on oscillatory correlation

Abstract: A multistage neural model is proposed for an auditory scene analysis task: segregating speech from interfering sound sources. The core of the model is a two-layer oscillator network that performs stream segregation on the basis of oscillatory correlation. In the oscillatory correlation framework, a stream is represented by a population of synchronized relaxation oscillators, each of which corresponds to an auditory feature, and different streams are represented by desynchronized oscillator populations.…
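The oscillatory-correlation principle described in the abstract — within-stream oscillators synchronize, across-stream populations desynchronize — can be sketched with a toy simulation. The sketch below uses simple Kuramoto-style phase oscillators rather than the paper's two-layer relaxation-oscillator network, and the group sizes, coupling matrix, and step size are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

# Toy illustration of oscillatory correlation with four phase oscillators.
# Oscillators 0-1 stand for features of one "stream", 2-3 for another.
# Positive coupling within a stream pulls its members into synchrony;
# negative coupling between streams pushes the two groups apart.

rng = np.random.default_rng(0)
n = 4
theta = rng.uniform(0, 2 * np.pi, n)   # random initial phases
omega = np.full(n, 2 * np.pi)          # identical natural frequencies

# Coupling matrix: +2 within a stream, -1 across streams.
K = np.array([[ 0,  2, -1, -1],
              [ 2,  0, -1, -1],
              [-1, -1,  0,  2],
              [-1, -1,  2,  0]], dtype=float)

dt = 0.01
for _ in range(5000):
    # Kuramoto update: dtheta_i = omega_i + sum_j K_ij * sin(theta_j - theta_i)
    dtheta = omega + (K * np.sin(theta[None, :] - theta[:, None])).sum(axis=1)
    theta = (theta + dt * dtheta) % (2 * np.pi)

def phase_gap(i, j):
    """Absolute phase difference between oscillators i and j, in [0, pi]."""
    d = abs(theta[i] - theta[j]) % (2 * np.pi)
    return min(d, 2 * np.pi - d)

print("within stream 1:", phase_gap(0, 1))  # small: synchronized
print("within stream 2:", phase_gap(2, 3))  # small: synchronized
print("across streams :", phase_gap(0, 2))  # near pi: desynchronized
```

After the run, each pair within a group is phase-locked while the two groups settle roughly in anti-phase, which is the "streams as desynchronized oscillator populations" picture in miniature.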

Cited by 232 publications (12 citation statements)
References 42 publications
“…First, the proposed system is evaluated in the pitch range estimation process with utterances chosen from Lee's database [23] and a corpus of 100 mixtures of speech and interference [24], commonly used for CASA research; see, e.g., [13,25,26]. The corpus contains utterances from both male and female speakers.…”
Section: Pitch Range Estimation
Confidence: 99%
“…The problem of discriminating music from audio has become increasingly important in automatic audio signal recognition (ASR) systems and is increasingly applied in real-world multimedia [7]. The human ear can easily distinguish audio without any influence from the mixed music [8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23]. Owing to new methods for the analysis and synthesis of audio signals, the processing of musical signals has gained particular weight [16,24]; classical sound analysis methods may therefore be applied to musical signals [25][26][27][28].…”
Section: Introduction
Confidence: 99%
“…These systems extract relevant cues from a scene, such as its spectral content, spatial structure, and temporal dynamics, allowing sound events with uncorrelated acoustic behavior to occupy different subspaces in the analysis stage. These models are quite effective in replicating perceptual results of stream segregation, especially with simple tone and noise stimuli [32–37]. Some models also extend beyond early acoustic features to examine feature-binding mechanisms, an effective strategy for segregating a wide range of stimuli, from simple tone sequences to spectro-temporally complex sounds such as speech and music [38–40].…”
Section: Introduction
Confidence: 99%