Recent interest in intelligent assistants has increased demand for Automatic Speech Recognition (ASR) systems that can utilize contextual information to adapt to the user's preferences or the current device state. For example, a user might be more likely to refer to their favorite songs when giving a "music playing" command or request to watch a movie starring a particular favorite actor when giving a "movie playing" command. Similarly, when a device is in a "music playing" state, a user is more likely to give volume control commands. In this paper, we explore using semantic information inside the ASR word lattice by employing Named Entity Recognition (NER) to identify and boost contextually relevant paths in order to improve speech recognition accuracy. We use broad semantic classes comprising millions of entities, such as songs and musical artists, to tag relevant semantic entities in the lattice. We show that our method reduces Word Error Rate (WER) by 12.0% relative on a Google Assistant "media playing" commands test set, while not affecting WER on a test set containing commands unrelated to media.
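The core idea above — boosting lattice paths that contain contextually relevant named entities — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the entity lists, the fixed additive boost, and the flat arc representation are all assumptions made for clarity (a real system would tag entities with an NER model over a full lattice, not match a hand-built set).

```python
# Hypothetical per-context entity sets; a real system would draw these from
# semantic classes containing millions of entities (songs, artists, etc.).
CONTEXT_ENTITIES = {
    "music_playing": {"bohemian rhapsody", "queen"},
}

def boost_lattice(arcs, device_state, boost=2.0):
    """Subtract a fixed boost from the cost of lattice arcs whose label
    matches a named entity relevant to the current device state.
    `arcs` is a simplified list of (word, cost) pairs; lower cost means
    the path is more likely to survive rescoring."""
    entities = CONTEXT_ENTITIES.get(device_state, set())
    boosted = []
    for word, cost in arcs:
        if word.lower() in entities:
            cost -= boost  # favor contextually relevant paths
        boosted.append((word, cost))
    return boosted

# Example: in the "music playing" state, entity arcs become cheaper.
arcs = [("play", 1.0), ("Bohemian Rhapsody", 5.0), ("by", 1.0), ("Queen", 4.0)]
print(boost_lattice(arcs, "music_playing"))
```

In an unrelated device state the entity set is empty and costs pass through unchanged, which mirrors the paper's observation that WER on non-media commands is unaffected.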
As voice-driven intelligent assistants become commonplace, adaptation to user context becomes critical for Automatic Speech Recognition (ASR) systems. For example, ASR systems may be expected to recognize a user's contact names containing improbable or out-of-vocabulary (OOV) words. We introduce a method to identify contextual cues in a first-pass ASR system's output and to recover out-of-lattice hypotheses that are contextually relevant. Our proposed module is agnostic to the architecture of the underlying recognizer, provided it generates a word lattice of hypotheses, and it is sufficiently compact for on-device use. The module identifies subgraphs in the lattice likely to contain named entities (NEs), recovers phoneme hypotheses over corresponding time spans, and inserts NEs that are phonetically close to those hypotheses. We measure a decrease in the mean word error rate (WER) of word lattices from 11.5% to 4.9% on a test set of NEs.
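The "phonetically close" insertion step can be illustrated with phoneme-level edit distance. This is a minimal sketch under assumed names: the pronunciation lexicon, the example entity, and the threshold `max_dist` are illustrative, and the paper's actual phonetic distance measure and lattice machinery are not reproduced here.

```python
def edit_distance(a, b):
    """Standard Levenshtein distance over two phoneme sequences."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return d[m][n]

def recover_entities(phoneme_hyp, lexicon, max_dist=1):
    """Return named entities whose pronunciation is within `max_dist`
    phoneme edits of the recovered phoneme hypothesis."""
    return [ne for ne, pron in lexicon.items()
            if edit_distance(phoneme_hyp, pron) <= max_dist]

# Hypothetical contact-name lexicon with ARPAbet-style pronunciations.
lexicon = {"Thiago": ["T", "IY", "AA", "G", "OW"]}
print(recover_entities(["T", "IY", "AE", "G", "OW"], lexicon))
```

Candidates returned this way would then be spliced back into the lattice over the corresponding time span, letting a second pass prefer the in-context entity.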
Automatic speech recognition (ASR) systems often have difficulty recognizing long-tail entities such as contact names and local restaurant names, which usually do not occur, or occur infrequently, in the system's training data. In this work, we present a method which uses learned text embeddings and nearest neighbor retrieval within a large database of entity embeddings to correct misrecognitions. Our text embeddings are produced by a neural network trained so that the embeddings of acoustically confusable phrases have low cosine distances. Given the embedding of the text of a potential entity misrecognition and a precomputed database containing entities and their corresponding embeddings, we use fast, scalable nearest neighbor retrieval algorithms to find candidate corrections within the database. The inserted candidates are then scored using a function of the original text's cost in the lattice and the distance between the embedding of the original text and the embedding of the candidate correction. Using this lattice augmentation technique, we demonstrate a 46% reduction in word error rate (WER) and 46% reduction in oracle word error rate (OWER) on an evaluation set with popular film queries.
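The retrieval-and-scoring step described above can be sketched as follows. This is an illustrative assumption-laden toy: the brute-force loop stands in for a scalable approximate nearest neighbor index, the linear combination with weight `weight` is one plausible form of the scoring function (the paper only says it is a function of lattice cost and embedding distance), and the tiny database is invented.

```python
import math

def cosine_distance(u, v):
    """1 minus cosine similarity of two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return 1.0 - dot / (norm_u * norm_v)

def candidate_corrections(query_emb, lattice_cost, database, weight=1.0, k=2):
    """Score each database entity by the misrecognized text's lattice cost
    plus a weighted embedding distance, and return the k best candidates.
    `database` maps entity names to precomputed embeddings."""
    scored = [(lattice_cost + weight * cosine_distance(query_emb, emb), name)
              for name, emb in database.items()]
    scored.sort()
    return [name for _, name in scored[:k]]

# Hypothetical 2-d embeddings: the misrecognition's embedding is close to
# the acoustically confusable film title, far from an unrelated entity.
database = {"La La Land": [0.9, 0.1], "Moana": [0.1, 0.9]}
print(candidate_corrections([1.0, 0.0], lattice_cost=5.0, database=database, k=1))
```

Because acoustically confusable phrases are trained to have low cosine distance, the nearest neighbors of a misrecognition's embedding are precisely the entities the user most plausibly said.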