“…To compute vocal tract features, previous investigations have used methods such as linear predictive cepstral coefficients (LPCCs) [12], perceptual linear prediction (PLP) [13], and mel-frequency cepstral coefficients (MFCCs) [14]. For the classifier stage, several studies have explored conventional ML classifiers such as support vector machines (SVM) [4,15,16,17], random forests (RF) [18], and decision trees [16,19]. With recent advances in deep learning, classical ML methods have increasingly been replaced by DL networks such as multilayer perceptrons (MLP) [20], deep neural networks (DNNs) [21,22], long short-term memory (LSTM) networks [23,24], convolutional neural networks (CNNs) [25], combinations of CNN and MLP [4], and combinations of CNN and LSTM [26].…”