Pathological speech usually refers to speech distortion resulting from illness or other biological insults. Assessing pathological speech plays an important role in assisting experts, but automatic evaluation of speech intelligibility is difficult because the signal is usually nonstationary and subject to abrupt change. In this paper, we develop new methods for feature extraction and reduction, and we describe a multigranularity combined feature scheme that is optimized by a hierarchical visual method. A novel method of generating the feature set based on the S-transform and chaotic analysis is proposed. The set comprises BAFS (430 basic acoustic features), local spectral characteristics MSCC (84 Mel S-transform cepstrum coefficients), and 12 chaotic features. Finally, radar charts and the F-score are used to optimize the features through hierarchical visual fusion. The feature set is reduced from 526 dimensions to 96 on the NKI-CCRT corpus and to 104 on the SVD corpus. The experimental results show that the new features, classified with a support vector machine (SVM), give the best performance, with a recognition rate of 84.4% on the NKI-CCRT corpus and 78.7% on the SVD corpus. The proposed method is thus shown to be effective and reliable for pathological speech intelligibility evaluation.
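The abstract names the F-score as one of the criteria for reducing the feature set but does not give its formula. As a minimal sketch, assuming the commonly used two-class F-score (squared distance of each class mean from the overall mean, divided by the sum of the within-class sample variances) and binary labels 1/0, a ranking step might look like the following. The function names `f_score` and `rank_features` are hypothetical, and the radar-chart stage of the method is not covered.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """Two-class F-score of one feature (assumed definition, see lead-in).

    pos, neg: lists of the feature's values in the positive / negative class.
    """
    overall = mean(pos + neg)
    # Between-class term: how far each class mean sits from the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Within-class term: sample variances (n - 1 denominator) of each class.
    within = variance(pos) + variance(neg)
    return between / within if within > 0 else float("inf")


def rank_features(X, y, keep):
    """Return (indices of the `keep` best features, all F-scores).

    X: list of rows (one list of feature values per utterance).
    y: list of labels, 1 for the positive class and 0 for the negative class.
    """
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(f_score(pos, neg))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:keep], scores


# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.0], [3.2, 5.0], [2.8, 3.0]]
y = [1, 1, 1, 0, 0, 0]
top, scores = rank_features(X, y, keep=1)
```

On this toy data the discriminative feature is ranked first. In practice `keep` would be chosen by validation rather than fixed in advance.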
Speech emotion recognition is an important speech technology. In this paper, Mel Frequency Cepstral Coefficients (MFCCs) are used to represent the speech signal as emotional features. The MFCCs plus the energy of an utterance are used as the input to a Support Vector Machine (SVM). SVMs have been highly successful in pattern recognition, and in recent years they have been applied to speech recognition. Many kinds of kernel functions are available for an SVM to map an input-space problem to a high-dimensional space, but guidelines for choosing a good kernel with optimized SVM parameters are lacking. Some kernels work better for some problems and worse for others, and which is best for speech emotion recognition is unknown. This work therefore studies the SVM classifier and proposes methods for selecting a better kernel with optimized parameters. The proposed method obtains optimized parameters more efficiently than common methods. To improve the recognition accuracy of the speech emotion recognition system, a speech emotion recognition method based on the optimized SVM is proposed. Experiments are performed on the HIT Emotional Speech Database established by the Speech Processing Lab in the School of Computer Science and Technology at HIT. The experimental results show that speech emotion recognition based on the optimized SVM effectively improves the performance of the emotion recognition system.
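The abstract does not say which kernels were compared. For orientation only, here is a sketch of three textbook SVM kernels (linear, polynomial, RBF) in plain Python. The parameter names `gamma`, `coef0`, and `degree` and their default values are assumptions chosen for illustration, not settings from the paper.

```python
import math


def linear_kernel(x, z):
    """Plain dot product: no implicit feature mapping."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * <x, z> + coef0) ** degree."""
    return (gamma * linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """exp(-gamma * ||x - z||^2); equals 1 when x == z."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Example: the same pair of vectors under each kernel.
x, z = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(x, z)
k_poly = polynomial_kernel(x, z, degree=2)
k_rbf = rbf_kernel(x, z)
```

Each kernel has its own parameters to tune (none for linear, `degree`, `gamma`, and `coef0` for polynomial, `gamma` for RBF), which is why kernel choice and parameter optimization are treated together.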