Determining the best Acoustic Features for Smoker Identification

Ma, Zhizhong; Qiu, Yuanhang; Feng, Huanqing; Wang, Ruili; Chu, Joanna Ting Wai; Bullen, Christopher

doi:10.1109/icassp43922.2022.9747712

Cited by 5 publications

(2 citation statements)

References 29 publications


“…Similarly, a parameterized convolutional neural network (CNN) was used for acoustic modeling from raw waveform for the dysarthria speech recognition tasks [17]. In [18], a SincNet-based speech feature learning method was proposed to achieve automatic smoker identification tasks. It can be found that investigating learnable frontends has drawn a lot of attention from researchers and made significant progress in the field of speech processing.…”

Section: Related Work (mentioning)

confidence: 99%

Speech Recognition for Air Traffic Control via Feature Learning and End-to-End Training

Fan, Hua, Lin, et al. 2023

IEICE Trans. Inf. & Syst.


In this work, we propose a new automatic speech recognition (ASR) system based on feature learning and an end-to-end training procedure for air traffic control (ATC) systems. The proposed model integrates the feature learning block, recurrent neural network (RNN), and connectionist temporal classification loss to build an end-to-end ASR model. Facing the complex environments of ATC speech, instead of the handcrafted features, a learning block is designed to extract informative features from raw waveforms for acoustic modeling. Both the SincNet and 1D convolution blocks are applied to process the raw waveforms, whose outputs are concatenated to the RNN layers for the temporal modeling. Thanks to the ability to learn representations from raw waveforms, the proposed model can be optimized in a complete end-to-end manner, i.e., from waveform to text. Finally, the multilingual issue in the ATC domain is also considered to achieve the ASR task by constructing a combined vocabulary of Chinese characters and English letters. The proposed approach is validated on a multilingual real-world corpus (ATCSpeech), and the experimental results demonstrate that the proposed approach outperforms other baselines, achieving a 6.9% character error rate.
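The character error rate quoted in this abstract is conventionally computed as the character-level edit (Levenshtein) distance between the recognized text and the reference transcript, divided by the reference length. The following is a minimal sketch under that assumption; the abstract does not describe its scoring procedure, and the function names `edit_distance` and `character_error_rate` are hypothetical, not taken from the cited work.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference characters
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Python strings are sequences of Unicode code points, so Chinese
    # characters and English letters each count as one unit.
    print(character_error_rate("上升到 FL300", "上升到 FL330"))
```

Because scoring is per character, a combined vocabulary of Chinese characters and English letters can be evaluated with the same function without separate tokenization.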


“…They investigate the performance of four acoustic feature sets or representations extracted using three feature extraction or learning approaches: (i) handcrafted feature sets, including the extended Geneva Minimalistic Acoustic Parameter Set and the Computational Paralinguistics Challenge Set; (ii) the Bag-of-Audio-Words representations; (iii) the neural representations extracted from raw waveform signals by SincNet. Experimental results show that: (i) SincNet feature representations are the most effective for smoker identification and outperform the MFCC baseline features by 16% in absolute accuracy; (ii) the performance of hand-crafted feature sets and the Bag-of-Audio-Words representations rely on the scale of the dimensions of feature vectors [10].…”

Section: Previous Studies (mentioning)

confidence: 99%

Review "Smoker/Non-Smoker Classification of People Using a Speech Signal"

Khudhur Zaal, Faisal Mohammad 2023

IRJIET


Speech is a behavioral biometric that can reveal a person's age, gender, race, and emotional state. The speech signal may also be used to ascertain a person's behavior, such as whether or not they smoke or take drugs. One of the topics that is frequently studied in the field of speech technology is the smoking habits of speakers. Over the past years, a lot of research has been done in this area, but little progress has been made in this field. As deep learning techniques have advanced in most machine learning fields, they have replaced earlier research techniques for speech recognition and verification. The most cutting-edge method for confirming and recognizing a speaker's identity is currently deep learning. This study's objective is to analyze research that uses speech signals and artificial intelligence to distinguish smokers from non-smokers. Every speech recognition system uses a variety of algorithms to convert sound waves into information that can be interpreted and processed by the system, which then generates an output that can be used as needed.


MobileACNet: ACNet-Based Lightweight Model for Image Classification

Jiang, Zong, et al. 2023

Lecture Notes in Computer Science

