This paper proposes a novel approach that uses deep neural networks for classifying imagined speech, significantly increasing the classification accuracy. The proposed approach employs only the EEG channels over specific areas of the brain for classification, and derives distinct feature vectors from each of those channels. This gives us more data to train a classifier, enabling us to use deep learning approaches. Wavelet and temporal domain features are extracted from each channel. The final class label of each test trial is obtained by applying a majority voting on the classification results of the individual channels considered in the trial. This approach is used for classifying all the 11 prompts in the KaraOne dataset of imagined speech. The proposed architecture and the approach of treating the data have resulted in an average classification accuracy of 57.15%, which is an improvement of around 35% over the stateof-the-art results.Index Terms-imagined speech, brain-computer interaction, deep neural network, commone spatial pattern, EEG
Automatic and accurate detection of the closure-burst transition events of stops and affricates serves many applications in speech processing. A temporal measure named the plosion index is proposed to detect such events, which are characterized by an abrupt increase in energy. Using the maxima of the pitch-synchronous normalized cross correlation as an additional temporal feature, a rule-based algorithm is designed that aims at selecting only those events associated with the closure-burst transitions of stops and affricates. The performance of the algorithm, characterized by receiver operating characteristic curves and temporal accuracy, is evaluated using the labeled closure-burst transitions of stops and affricates of the entire TIMIT test and training databases. The robustness of the algorithm is studied with respect to global white and babble noise as well as local noise using the TIMIT test set and on telephone quality speech using the NTIMIT test set. For these experiments, the proposed algorithm, which does not require explicit statistical training and is based on two one-dimensional temporal measures, gives a performance comparable to or better than the state-of-the-art methods. In addition, to test the scalability, the algorithm is applied on the Buckeye conversational speech corpus and databases of two Indian languages.
Detection of QRS serves as a first step in many automated ECG analysis techniques. Motivated by the strong similarities between the signal structures of an ECG signal and the integrated linear prediction residual (ILPR) of voiced speech, an algorithm proposed earlier for epoch detection from ILPR is extended to the problem of QRS detection. The ECG signal is pre-processed by high-pass filtering to remove the baseline wandering and by half-wave rectification to reduce the ambiguities. The initial estimates of the QRS are iteratively obtained using a non-linear temporal feature, named the dynamic plosion index suitable for detection of transients in a signal. These estimates are further refined to obtain a higher temporal accuracy. Unlike most of the high performance algorithms, this technique does not make use of any threshold or differencing operation. The proposed algorithm is validated on the MIT-BIH database using the standard metrics and its performance is found to be comparable to the state-of-the-art algorithms, despite its threshold independence and simple decision logic.Index Terms-DPI algorithm, dynamic plosion index, ECG signal, GCI detection, QRS detection, threshold free.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.