Survey of Deep Learning Paradigms for Speech Processing

Bhangale, Kishor; Kothandaraman, Mohanaprasad

doi:10.1007/s11277-022-09640-y

Cited by 49 publications

(13 citation statements)

References 134 publications

“…Also, it provides the pollution free nature of the HEV by selection of pollution free sources for the powering the HEV. In recent years, deep learning algorithms have shown noteworthy contributions in various signal processing applications because of their faster conversions, high accuracy, reliability, and effectiveness [6], [7], [12]. In the future, various deep learning-based systems can be employed for driving and vehicle condition data augmentation to create the synthetic data for the simulation using available limited datasets [8], [9], [13].…”

Section: 4 Simulation Results and Discussion (mentioning)

confidence: 99%

Grey Wolf Optimization Based Energy Management Strategy for Hybrid Electrical Vehicles

Gadge

Pahariya

2022

IJEER

Electric vehicles (EVs) are seen as a necessary component of transportation's future growth. However, the performance of batteries related to power density and energy density restricts the adoption of electric vehicles. To make the transition from a conventional car to a pure electric vehicle (PEV), a Hybrid Electric Vehicle's (HEV) Energy Management System (EMS) is crucial. The HEVs are often powered with hybrid electrical sources, therefore it is important to select the optimal power source to improve the HEV performance, minimize the fuel cost and minimize hydrocarbon and nitrogen oxides emission. This paper presents the Grey Wolf Optimization (GWO) algorithm for the control of the power sources in the HEVs based on power requirement and economy. The proposed GWO-based EMS provides optimized switching of the power sources and economical and pollution free control of HEV.

“…In CNN‐based speech processing, the algorithm first extracts the Mel‐frequency cepstral coefficients (MFCC) features from the original speech data and takes them as an input (Figure 15e). [65,139,154] Then, the convolutional layer extracts various features from the input through the learnable filters, that is, the kernels, to create feature maps. The pooling layer follows the convolutional layer to decrease the size of the convoluted feature map to reduce the computational costs.…”

Section: For Soft Acoustic/vibration Sensors (mentioning)

confidence: 99%
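
The convolution-then-pooling step described in the statement above can be illustrated with a minimal sketch. It is not the cited authors' implementation: the input is a random stand-in for an MFCC matrix (13 coefficients by 20 frames, an assumed shape), the 3x3 kernel is random rather than learned, and the helper names `conv2d_valid`, `relu`, and `max_pool2d` are hypothetical.

```python
import random


def conv2d_valid(x, kernel):
    """'Valid' 2-D cross-correlation of x (rows x cols) with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Element-wise rectifier, commonly applied after a convolutional layer."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/cols that do not fill a window are dropped."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + a][j * size + b] for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Assumption: random numbers stand in for real MFCC features (13 coefficients x 20 frames).
random.seed(0)
mfcc = [[random.uniform(-1.0, 1.0) for _ in range(20)] for _ in range(13)]
# Assumption: a random 3x3 kernel; in a trained CNN these weights are learned.
kernel = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]

feature_map = relu(conv2d_valid(mfcc, kernel))  # 11 x 18 feature map
pooled = max_pool2d(feature_map, size=2)        # 5 x 9 after pooling

print(len(feature_map), len(feature_map[0]), len(pooled), len(pooled[0]))
# prints: 11 18 5 9
```

Pooling shrinks the 11x18 feature map to 5x9, roughly a fourfold reduction in values passed to later layers, which is the computational saving the statement refers to.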

“…A comprehensive review of all ML and DL techniques used for speech processing is beyond the scope of this paper, and they have been already covered in a number of existing reviews. [137][138][139] Therefore, in this section, we will primarily focus on the ML and DL algorithms which were successfully used together with soft acoustic/vibration sensors to conduct speech processing in the literature thus far. Both traditional ML algorithms (e.g., linear discriminant analysis [LDA], random forest [RF], and Gaussian mixture model [GMM]) and DL algorithms (e.g., FNN and CNN) have demonstrated excellent capability to process the signals coming from soft acoustic/vibration sensors to perform speech recognition.…”

Section: Algorithms For Soft Acoustic/vibration Sensors (mentioning)

confidence: 99%

Emerging Trends in Soft Electronics: Integrating Machine Intelligence with Soft Acoustic/Vibration Sensors

Lee

Cho

2023

Advanced Materials

In the last decade, soft acoustic/vibration sensors have gained tremendous research interest due to their unique ability to detect broadband acoustic/vibration stimuli, potentializing futuristic applications including voice biometrics, voice‐controlled human–machine‐interfaces, electronic skin, and skin‐mountable healthcare devices. Importantly, to benefit most from these sensors, it is inevitable to use machine learning (ML) to process their output signals; with ML, a more accurate and efficient interpretation of original data is possible. This paper is dedicated to offering an overview of recent advances empowering the development of soft acoustic/vibration sensors and their signal processing using ML. First, the key performance parameters of the sensors are discussed. Second, popular transduction mechanisms for the sensors are addressed, followed by an in‐depth overview of each type, covering materials used, structural designs, and sensing performances. Third, potential applications of the sensors are elaborated and fourth, a thorough discussion on ML is conducted, exploring different types of ML, specific ML algorithms suitable for processing acoustic/vibration signals, and current trends in ML‐assisted applications. Finally, the challenges and potential opportunities in soft acoustic/vibration sensor and ML research are revealed to offer new insights into future prospects in these fields.

“…A comprehensive review with background introduction and formulation of speech separation and components of supervised separation, i.e., learning machines, training targets, and acoustic features, have been introduced with a description of monaural speech enhancement, speaker separation, and speech de-reverberation as well as multimicrophone techniques in [17]. The articles [17], [32], [33], [34] presented interesting reviews of deep learning applied to various problems of speech processing. Nevertheless, these review articles presented speaker separation using deep learning in the T-F domain only in a short portion of the overview.…”

Section: Introduction (mentioning)

confidence: 99%

State-of-the-Art Analysis of Deep Learning-Based Monaural Speech Source Separation Techniques

2023

The monaural speech source separation problem is an important application in the signal processing field. But recent interaction of deep learning algorithms with signal processing achieves remarkable performance improvement for speech source separation problems. This paper explores the numerous state-of-the-art deep learning-based monaural speech source separation algorithms in the time-frequency (T-F), time, and hybrid domains. The motivation, algorithm, and framework of different deep learning models for monaural speech source separation are analyzed. The benchmarked algorithms in the T-F domain can be categorized as deep neural networks (DNN), clustering, permutation, multi-task learning, computational auditory sense analysis (CASA), and phase reconstruction-based techniques, whereas the state-of-the-art time-domain approaches can be categorized as CNN, RNN, multi-scale fusion (MSF), and transformer-based techniques. The end-to-end post filter (E2EPF) is a hybrid algorithm combining T-F and time-domain works to achieve enhanced results. Time-domain models have shown improvement in separation performance compared to the T-F and hybrid domain models with small model sizes. Methods in T-F, time, and hybrid domains are compared using SDR, SI-SDR, SI-SNR, PESQ, and STOI as quality assessment metrics on some benchmark datasets.
