“…Since then, multiple off-the-shelf CNN backbones have been widely applied to KWS tasks, such as deep residual networks (ResNet) [2], separable CNNs [3,4,5,6], temporal CNNs [7] and SincNet [8]. Other efforts boost the performance of CNN models for KWS by combining them with additional deep learning components, such as recurrent neural networks (RNN) [9], bidirectional long short-term memory (BiLSTM) [10] and streaming layers [11]. However, although the off-the-shelf CNN backbones on which existing KWS studies usually rely have been demonstrated to be effective in image classification, they were not designed specifically for KWS and may not be the optimal architectures for it.…”
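One reason separable CNNs are popular for KWS is their small parameter footprint, which matters for always-on, on-device keyword detection. The sketch below compares the parameter count of a standard convolution with that of a depthwise separable convolution (depthwise filter plus 1×1 pointwise mixing); the layer shape is an illustrative assumption, not taken from any cited model.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Parameters of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def separable_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution that mixes channels
    (bias omitted)."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

# Assumed layer shape for illustration only.
c_in, c_out, k = 64, 64, 3
std = standard_conv_params(c_in, c_out, k)   # 36864
sep = separable_conv_params(c_in, c_out, k)  # 4672
print(std, sep, round(std / sep, 1))         # roughly a 7.9x reduction
```

The ratio grows with kernel size and output channels, which is why separable convolutions are a common building block in compact audio models.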