This paper presents a novel method for efficiently recognizing inharmonic and transient bird sounds. The recognition algorithm consists of feature extraction using wavelet decomposition, followed by recognition using either a supervised or an unsupervised classifier. The proposed method was tested on sounds of eight bird species, of which five have inharmonic sounds and three reference species have harmonic sounds. Inharmonic sounds are poorly suited to conventional spectral analysis methods, because the spectral domain does not contain any visible trajectories that a computer can track and identify. Wavelet analysis was therefore selected for its ability to preserve both frequency and temporal information and to analyze signals that contain discontinuities and sharp spikes. The shift-invariant feature vectors calculated from the wavelet coefficients were used as inputs to two neural networks: the unsupervised self-organizing map (SOM) and the supervised multilayer perceptron (MLP). The results were encouraging: the SOM network recognized 78% and the MLP network 96% of the test sounds correctly.
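The feature-extraction step can be illustrated with a minimal sketch. The abstract does not say which wavelet or which features the authors used, so everything here is an assumption: the Haar wavelet, the use of normalized sub-band energies as the feature vector, and the function names `haar_step` and `subband_energy_features`. Sub-band energies are only approximately insensitive to time shifts, so this is a stand-in for the paper's shift-invariant features, not a reproduction of them.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail). A trailing odd sample is dropped.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def subband_energy_features(signal, levels):
    """Normalized energy of each detail band, then of the final approximation.

    Assumed feature definition (hypothetical, not taken from the paper).
    The signal length should be a multiple of 2**levels.
    """
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    total = sum(energies)
    if total == 0:
        return energies
    return [e / total for e in energies]


# A rapidly alternating signal puts its energy in the finest detail band,
# while a constant signal puts it in the final approximation band.
fast = subband_energy_features([1, -1, 1, -1, 1, -1, 1, -1], levels=3)
flat = subband_energy_features([1, 1, 1, 1, 1, 1, 1, 1], levels=3)
```

A feature vector of this kind could then be passed to any classifier, such as the SOM or MLP mentioned in the abstract.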
Electroencephalogram (EEG) spindle patterns corresponding to two different phenomena, natural sleep and propofol anesthesia, are compared. The spindles are extracted from 5 overnight sleep recordings and 10 recordings of deep propofol anesthesia. Mean frequency, the angle of the trend in instantaneous frequency, and 3 nonlinear parameters (spectral entropy, approximate entropy, and Higuchi fractal dimension) are calculated to characterize the spindle waveforms. According to the Wilcoxon rank sum test at a significance level of 0.01, all of these features except approximate entropy differ significantly between the two types of EEG spindles.
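As an illustration of one of the listed parameters, the following is a minimal sketch of spectral entropy. The abstract does not give the authors' exact definition, so the choices here are assumptions: a one-sided power spectrum from a direct DFT, the DC bin excluded, and normalization by the logarithm of the number of bins so that the result lies between 0 and 1. The function names are hypothetical.

```python
import cmath
import math


def power_spectrum(x):
    """One-sided power spectrum via a direct DFT (fine for short segments)."""
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(x):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1].

    Assumed convention: the DC bin is dropped before normalization.
    """
    power = power_spectrum(x)[1:]
    if len(power) < 2:
        raise ValueError("need at least 4 samples")
    total = sum(power)
    if total == 0:
        return 0.0
    probs = [p / total for p in power]
    entropy = -sum(q * math.log(q) for q in probs if q > 0)
    return entropy / math.log(len(probs))


# A pure tone concentrates power in one bin (entropy near 0);
# a unit impulse has a flat spectrum (entropy near 1).
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63
tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```

Under these assumptions, a narrow-band waveform such as a regular spindle would be expected to score lower than a broadband one.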
A nonlinear Hammerstein model is proposed for coding speech signals. Using Tsay's nonlinearity test, we first show that the great majority of 20-millisecond speech frames (over 80% in our test data) contain nonlinearities. Frame length correlates with the level of nonlinearity: the longer the frames, the higher the percentage of nonlinear frames. Motivated by this result, we present a nonlinear structure using a frame-by-frame adaptive identification of the Hammerstein model parameters for speech coding. Finally, the proposed structure is compared with the LPC coding scheme for three phonemes, /a/, /s/, and /k/, by calculating the Akaike information criterion of the corresponding residual signals. The tests clearly show that the residual of the nonlinear model presented in this paper contains significantly less information than that of the LPC scheme. The presented method is a potential tool for shaping the residual signal into a form that can be encoded efficiently in speech coding.
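The general Hammerstein structure, a memoryless nonlinearity followed by a linear dynamic block, can be sketched as follows, together with one common form of the Akaike information criterion. The abstract does not specify the authors' nonlinearity, linear block, or AIC formula, so the polynomial nonlinearity, the FIR linear block, the Gaussian-residual AIC, and the function names are all assumptions made for illustration.

```python
import math


def hammerstein_output(u, poly_coeffs, fir_coeffs):
    """Pass u through a static polynomial, then through a linear FIR filter.

    poly_coeffs[i] multiplies u**(i + 1); both blocks are assumed forms.
    """
    # Static (memoryless) nonlinearity applied sample by sample.
    v = [sum(c * x ** (i + 1) for i, c in enumerate(poly_coeffs)) for x in u]
    # Linear dynamic block: causal FIR convolution with zero initial state.
    y = []
    for n in range(len(v)):
        y.append(sum(b * v[n - k] for k, b in enumerate(fir_coeffs) if n - k >= 0))
    return y


def aic_gaussian(residual, num_params):
    """AIC under an assumed Gaussian residual: N * ln(RSS / N) + 2 * k."""
    n = len(residual)
    rss = sum(r * r for r in residual)
    if rss == 0:
        raise ValueError("residual is identically zero; AIC is unbounded")
    return n * math.log(rss / n) + 2 * num_params


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
y = hammerstein_output([1.0, 2.0, 3.0], poly_coeffs=[1.0, 0.5], fir_coeffs=[1.0, 0.5])

# With the same number of parameters, the smaller residual gives the lower AIC.
aic_large = aic_gaussian([1.0, -1.0, 1.0, -1.0], num_params=2)
aic_small = aic_gaussian([0.1, -0.1, 0.1, -0.1], num_params=2)
```

In a comparison of this kind, a lower AIC for one model's residual at a comparable parameter count suggests that the model explains more of the signal, which is the sense in which the abstract compares the nonlinear model with LPC.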