This paper presents an optimized speech compression algorithm based on the discrete wavelet transform, together with its real-time implementation on a fixed-point digital signal processor (DSP). The optimized algorithm achieves low complexity, a low bit rate, and high speech coding efficiency by adding a voice activity detector (VAD) module before the discrete wavelet transform is applied. The VAD module avoids computing the discrete wavelet coefficients during inactive voice segments. In addition, a real-time implementation of the optimized speech compression algorithm is carried out on a fixed-point processor. The optimized and the original algorithms are evaluated and compared in terms of CPU time
This paper presents a simulation and hardware implementation of a new audio compression scheme based on the fast Hartley transform combined with a new modified run-length encoding. The proposed algorithm analyzes the signal with the fast Hartley transform and then thresholds the resulting coefficients against a given threshold; the thresholded coefficients are encoded using a new run-length encoding approach. Finally, they are quantized and coded into a binary stream. The experimental results show that the fast Hartley transform can compress audio signals, since it concentrates the signal energy in a few coefficients, and that the new run-length encoding approach increases the compression factor. The results are compared with wavelet-based compression using objective measures, namely CR, SNR, PSNR and NRMSE. This study shows that the fast Hartley transform is more appropriate than the wavelet transform, since it offers a higher compression ratio and better speech quality. In addition, the audio compression system was tested on a TMS320C6416 DSP processor. This test shows that the system meets the real-time requirements and keeps complexity low. The perceptual quality is evaluated with the Mean Opinion Score (MOS).
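The transform, threshold, and run-length steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: it uses a direct O(N^2) discrete Hartley transform rather than a fast algorithm, a plain hard threshold, and a simple zero-run-length code standing in for the paper's "modified" run-length encoding, whose details the abstract does not give. All function names are hypothetical.

```python
import math

def dht(x):
    """Direct O(N^2) discrete Hartley transform (not the fast algorithm)."""
    n = len(x)
    return [
        sum(x[t] * (math.cos(2 * math.pi * k * t / n)
                    + math.sin(2 * math.pi * k * t / n))
            for t in range(n))
        for k in range(n)
    ]

def idht(h):
    """The DHT is its own inverse up to a factor of 1/N."""
    n = len(h)
    return [v / n for v in dht(h)]

def hard_threshold(coeffs, thr):
    """Zero every coefficient whose magnitude is below thr."""
    return [c if abs(c) >= thr else 0.0 for c in coeffs]

def zero_rle_encode(coeffs):
    """Encode as (zero_run, value) pairs; trailing zeros become (run, None)."""
    out, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            out.append((run, c))
            run = 0
    if run:
        out.append((run, None))
    return out

def zero_rle_decode(pairs):
    """Invert zero_rle_encode."""
    out = []
    for run, value in pairs:
        out.extend([0.0] * run)
        if value is not None:
            out.append(value)
    return out

# A pure tone concentrates its energy in two Hartley coefficients.
signal = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
kept = hard_threshold(dht(signal), 1.0)
pairs = zero_rle_encode(kept)
restored = idht(zero_rle_decode(pairs))
```

For this test tone only two of the 32 coefficients survive the threshold, so the run-length code shrinks to three pairs while the inverse transform still recovers the tone. Real speech spreads energy over more coefficients, so the achievable compression depends on the threshold chosen.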
Automatic speech recognition is one of the most active research areas, as it offers a dynamic platform for human-machine interaction. The robustness of speech recognition systems is often degraded in real-time applications, which are frequently accompanied by environmental noise. In this work, we investigate the efficiency of combining the wave atoms transform (WAT) with Mel-Frequency Cepstral Coefficients (MFCC), using a Support Vector Machine (SVM) as the classifier, under different noisy conditions. A full experimental evaluation of the proposed model was conducted using an Arabic speech database (ARADIGIT) corrupted with noises from the "NOISEUS" database at SNR levels ranging from -5 to 15 dB. The simulation results indicate that the proposed algorithm improves the recognition rate, reaching 99.9% at an SNR of 15 dB. A comparative study was conducted by applying the proposed WAT-MFCC features to a multilayer perceptron (MLP) and a hidden Markov model (HMM) in order to demonstrate the efficiency and robustness of the proposed system.
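Corrupting clean speech at a prescribed SNR, as in the evaluation above, amounts to scaling the noise before adding it. The sketch below shows one common way to do this. It is an assumption about the procedure, not a description of the authors' setup: the synthetic tone and Gaussian noise stand in for ARADIGIT and NOISEUS recordings, and the function names are hypothetical.

```python
import math
import random

def power(x):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)

def mix_at_snr(clean, noise, target_db):
    """Scale noise so that 10*log10(P_clean / P_noise) equals target_db, then add it."""
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]
    gain = math.sqrt(power(clean) / (power(noise) * 10 ** (target_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]

def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy signal relative to its clean reference."""
    residual = [n - c for c, n in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))

# Synthetic stand-ins for a speech utterance and a noise recording.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 5 * t / 200) for t in range(800)]
noise = [rng.gauss(0.0, 1.0) for _ in range(800)]

# One noisy version per SNR level in the -5 to 15 dB range.
noisy_versions = {db: mix_at_snr(clean, noise, db) for db in (-5, 0, 5, 10, 15)}
```

Measuring the SNR of each mixed signal with `measured_snr_db` should return the requested level up to floating-point error, which is a quick sanity check before feeding the data to a recognizer.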
This paper presents the optimization and real-time implementation of a wavelet-based speech compression system on an STM32F4 Discovery board. The optimization relies, on the one hand, on Voice Activity Detection (VAD) to reduce complexity and, on the other hand, on a new quantization approach that codes each sample with fewer bits. The performance of the embedded audio codec is evaluated with a test technique called Processor-in-the-Loop (PIL) and with objective measures that can predict the perceived quality of the signal, namely SNR, PSNR and MSE. The compression efficiency is measured with the compression factor (CR). This research highlights the importance of the proposed optimizations: they increase the CR without degrading the voice quality. The practical study shows that the proposed system meets the timing and hardware requirements. Voice clarity is assessed with the Mean Opinion Score (MOS).
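The idea of letting a VAD gate the wavelet transform can be sketched as below. This is only an illustration under stated assumptions: a short-term energy threshold stands in for the paper's VAD, a one-level Haar transform stands in for its wavelet, and the frame length of 160 samples (20 ms at 8 kHz) and the threshold value are arbitrary choices. The function names are hypothetical.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(v * v for v in frame) / len(frame)

def haar_dwt(frame):
    """One-level orthonormal Haar DWT; the frame length must be even."""
    s = math.sqrt(2.0)
    approx = [(frame[i] + frame[i + 1]) / s for i in range(0, len(frame), 2)]
    detail = [(frame[i] - frame[i + 1]) / s for i in range(0, len(frame), 2)]
    return approx, detail

def encode(signal, frame_len=160, energy_thr=1e-3):
    """Return one entry per full frame: None if inactive, (approx, detail) otherwise."""
    out = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        if frame_energy(frame) < energy_thr:
            out.append(None)  # inactive frame: the transform is skipped entirely
        else:
            out.append(haar_dwt(frame))
    return out

# Silence, a burst of tone, then silence again.
tone = [0.5 * math.sin(2 * math.pi * 10 * t / 160) for t in range(160)]
signal = [0.0] * 160 + tone + [0.0] * 160
frames = encode(signal)
```

Here only the middle frame is transformed, so two thirds of the transform work is avoided. On a fixed-point target the energy test would typically be done in integer arithmetic, which this floating-point sketch does not attempt to model.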
This paper proposes a new adaptive speech compression system based on the discrete wave atoms transform. First, the signal is decomposed into wave atoms; the wave atom coefficients are then truncated using a new adaptive thresholding that depends on an estimate of the SNR. The thresholded coefficients are quantized using a Lloyd-Max scalar quantizer and then encoded using zero-run-length encoding followed by Huffman coding. Numerous simulations are performed to demonstrate the robustness of the approach. The results are compared with wavelet-based compression using objective criteria, namely CR, SNR, PSNR and NRMSE. This study shows that the wave atoms transform is more appropriate than the wavelet transform, since it offers a higher compression ratio and better speech quality.
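The Lloyd-Max quantization step can be illustrated with the standard Lloyd iteration on empirical samples, which alternates between assigning samples to the nearest reconstruction level and moving each level to the mean of its samples. This is a generic sketch, not the authors' implementation; the uniform initialization, iteration cap, and function names are assumptions.

```python
def cell_index(sample, bounds):
    """Index of the quantizer cell that contains sample."""
    return sum(1 for b in bounds if sample > b)

def lloyd_max(samples, levels, iters=50):
    """Design a scalar quantizer by alternating assignment and centroid updates."""
    lo, hi = min(samples), max(samples)
    # Start from uniformly spaced reconstruction levels.
    codebook = [lo + (hi - lo) * (i + 0.5) / levels for i in range(levels)]
    for _ in range(iters):
        # Decision boundaries are the midpoints between adjacent levels.
        bounds = [(a + b) / 2 for a, b in zip(codebook, codebook[1:])]
        cells = [[] for _ in range(levels)]
        for s in samples:
            cells[cell_index(s, bounds)].append(s)
        # Move each level to the mean of its cell; keep empty cells unchanged.
        updated = [sum(c) / len(c) if c else codebook[i]
                   for i, c in enumerate(cells)]
        if updated == codebook:
            break
        codebook = updated
    return codebook

def quantize(samples, codebook):
    """Map each sample to the index of its nearest reconstruction level."""
    bounds = [(a + b) / 2 for a, b in zip(codebook, codebook[1:])]
    return [cell_index(s, bounds) for s in samples]

def mse(samples, codebook):
    """Mean squared quantization error for the given codebook."""
    idx = quantize(samples, codebook)
    return sum((s - codebook[i]) ** 2 for s, i in zip(samples, idx)) / len(samples)

# Two well-separated clusters of coefficient values.
coeffs = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
codebook = lloyd_max(coeffs, levels=2)
indices = quantize(coeffs, codebook)
```

The resulting indices are what a later entropy coder, such as the run-length and Huffman stages mentioned above, would operate on. With heavily thresholded coefficients most indices point at the level nearest zero, which is what makes zero-run-length coding effective.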