Automatic recognition of isolated spoken digits is one of the most challenging tasks in Automatic Speech Recognition. This paper presents database development and automatic speech recognition for isolated Pashto spoken digits from Sefer (0) to Naha (9). Fifty native Pashto speakers (25 male and 25 female), aged 18 to 60 years, each uttered the digits Sefer (0) to Naha (9) separately. A Sony PCM-M10 linear recorder was used for recording in noise-free office and home environments. Adobe Audition version 1.0 was used to split the recordings into individual digits, which were saved in .wav format. Mel-frequency cepstral coefficients (MFCCs) were used to extract speech features. To the authors' knowledge, a k-nearest neighbor (KNN) classifier is applied to Pashto for the first time to classify the speech features, and its accuracy is compared with that of linear discriminant analysis (LDA). The experimental results are evaluated, and an overall average recognition accuracy of 76.8% is obtained.
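The classification step described above can be illustrated with a minimal k-nearest neighbor sketch. It is not the authors' implementation: the two-dimensional vectors below are toy stand-ins for per-utterance MFCC feature vectors, and the function names, the Euclidean distance, and the value of k are assumptions chosen for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    ranked = sorted(
        zip(train_features, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for MFCC vectors (assumption: real vectors would be much longer).
train_features = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],   # utterances of "sefer" (0)
    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],   # utterances of "naha" (9)
]
train_labels = ["sefer", "sefer", "sefer", "naha", "naha", "naha"]

predicted = knn_predict(train_features, train_labels, [0.1, 0.1], k=3)
print(predicted)
```

In practice each utterance would first be reduced to a fixed-length vector, for example by averaging MFCC frames over time, before being passed to a classifier of this kind.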
Motivation. Immunoglobulin proteins (IGPs), also called antibodies, are glycoproteins that act as B-cell receptors against external or internal antigens such as viruses and bacteria. IGPs play a significant role in diverse cellular processes ranging from adhesion to cell recognition. Identifying IGPs in silico is faster and more cost-effective than wet-lab methods. Methods. In this study, we developed a deep learning framework, “IGPred-HDnet”, for discriminating IGPs from non-IGPs. Three descriptors are used to extract graphical, physicochemical, and sequential features: feature extraction based on graphical and statistical features (FEGS), amphiphilic pseudo-amino acid composition (Amp-PseAAC), and dipeptide composition (DPC). Next, the extracted features are evaluated with four classifiers: decision tree (DT), support vector machine (SVM), k-nearest neighbour (KNN), and hierarchical deep network (HDnet). The proposed predictor, IGPred-HDnet, was trained and tested using 10-fold cross-validation and an independent test. Results and Conclusion. On the training and independent datasets (Dtrain and Dtest), IGPred-HDnet achieves an accuracy (ACC) of 98.00% and 99.10% and a Matthews correlation coefficient (MCC) of 0.958 and 0.980, respectively. These results show that IGPred-HDnet, using the novel FEGS feature and the HDnet algorithm, outperforms existing computational models on both datasets. We hope this research will provide useful insights for the large-scale identification of IGPs and will assist pharmaceutical companies in new drug design.
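Two of the quantities named above have simple closed forms and can be sketched directly. The code below is illustrative only and is not taken from IGPred-HDnet: it assumes the common definition of dipeptide composition (frequencies of the 400 ordered pairs of the 20 standard amino acids) and the standard confusion-matrix formula for MCC. The function names and the choice to skip pairs containing non-standard residues are assumptions.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of adjacent-residue pair frequencies."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # assumption: pairs with non-standard residues are skipped
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


features = dipeptide_composition("ACAC")  # pairs: AC, CA, AC
print(len(features), features[DIPEPTIDES.index("AC")])
print(matthews_corrcoef(tp=45, tn=45, fp=5, fn=5))
```

Unlike accuracy, MCC uses all four confusion-matrix cells, which is why it is often reported alongside ACC when the positive and negative classes differ in size.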
Sentiment analysis is the computational study of reviews, emotions, and sentiments expressed in text. In the past several years, sentiment analysis has attracted considerable attention from industry and academia, and deep neural networks have achieved significant results in the field. Current methods mainly focus on English; little research has been carried out on minority languages such as Roman Urdu, which has more complex syntax and numerous lexical variations. In this paper, a novel "Self-attention Bidirectional LSTM (SA-BiLSTM)" network is proposed for sentiment analysis of Roman Urdu to deal with its sentence structure and inconsistent text representation. This network addresses the limitation imposed by the unidirectional nature of the conventional architecture. In SA-BiLSTM, self-attention handles the complex sentence structure by correlating words across the whole sentence, and the BiLSTM extracts context representations from the attended embeddings in both the preceding and succeeding directions to tackle lexical variation. To measure and compare the performance of the SA-BiLSTM model, we preprocessed and normalized the Roman Urdu sentences. Owing to its efficient design, SA-BiLSTM uses fewer computational resources and yields accuracies of 68.4% and 69.3% on the preprocessed and normalized datasets, respectively, which indicates that SA-BiLSTM achieves better efficiency than other state-of-the-art deep architectures.
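The idea of correlating every word with the whole sentence can be sketched with scaled dot-product self-attention. This is a simplified illustration, not the SA-BiLSTM architecture: it assumes the queries, keys, and values are the raw embeddings themselves (no learned projection matrices), omits the BiLSTM stage entirely, and uses made-up toy embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with queries = keys = values = embeddings."""
    dim = len(embeddings[0])
    attended = []
    for query in embeddings:
        # Similarity of this token to every token in the sentence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        weights = softmax(scores)
        # Each output is a weighted average of all token embeddings.
        attended.append(
            [sum(w * value[j] for w, value in zip(weights, embeddings)) for j in range(dim)]
        )
    return attended


# Toy 2-dimensional embeddings for a three-token sentence (hypothetical values).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for vector in self_attention(sentence):
    print(vector)
```

In a full model, the attended vectors would then be fed to a bidirectional recurrent layer so that each token's representation also reflects its left and right context.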