An important goal at Google is to make spoken access ubiquitously available. Achieving ubiquity requires two things: availability (i.e., built into every possible interaction where speech input or output can make sense) and performance (i.e., works so well that the modality adds no friction to the interaction). This chapter is a case study of the development of Google Search by Voice, a step toward our long-term vision of ubiquitous access. While the integration of speech input into Google search is a significant step toward more ubiquitous access, it has posed many problems in terms of the performance of core speech technologies and the design of effective user interfaces. Work is ongoing, and the problems are far from solved. Nonetheless, we have at a minimum achieved a level of performance at which usage of voice search is growing rapidly and many users do indeed become repeat users.
This paper introduces new methods based on exponential families for modeling the correlations between words in text and speech. While previous work assumed the effects of word co-occurrence statistics to be constant over a window of several hundred words, we show that their influence is nonstationary on a much smaller time scale. Empirical data drawn from English and Japanese text, as well as conversational speech, reveal that the "attraction" between words decays exponentially, while stylistic and syntactic constraints create a "repulsion" between words that discourages close co-occurrence. We show that these characteristics are well described by simple mixture models based on two-stage exponential distributions, which can be trained using the EM algorithm. The resulting distance distributions can then be incorporated as penalizing features in an exponential language model.
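As a rough illustration of the EM training mentioned above, the sketch below fits a mixture of plain (single-stage) exponential densities to a list of word-to-word distances. This is a deliberate simplification: the abstract describes two-stage exponential distributions, whose exact form is not given here, so the single-stage components, the function name `fit_exponential_mixture`, and the initialisation scheme are all assumptions made for illustration and not the authors' method.

```python
import math


def fit_exponential_mixture(distances, n_components=2, n_iters=300):
    """Fit weights and rates of a mixture of exponential densities with EM.

    Illustrative model (not the two-stage form from the abstract):
        p(x) = sum_j w_j * r_j * exp(-r_j * x),  x >= 0
    Assumes `distances` is a non-empty list with a positive mean.
    """
    n = len(distances)
    mean = sum(distances) / n
    # Hypothetical initialisation: spread the rates around 1/mean so the
    # components start out distinct.
    rates = [(0.5 + 1.5 * j) / mean for j in range(n_components)]
    weights = [1.0 / n_components] * n_components

    for _ in range(n_iters):
        resp_sum = [0.0] * n_components    # sum of responsibilities
        resp_x_sum = [0.0] * n_components  # responsibility-weighted sum of x
        for x in distances:
            # E-step: posterior responsibility of each component for x,
            # computed in the log domain to avoid underflow.
            logs = [math.log(max(w, 1e-300)) + math.log(max(r, 1e-300)) - r * x
                    for w, r in zip(weights, rates)]
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            for j, u in enumerate(unnorm):
                g = u / total
                resp_sum[j] += g
                resp_x_sum[j] += g * x
        # M-step: closed-form updates for an exponential mixture.
        weights = [s / n for s in resp_sum]
        rates = [s / max(sx, 1e-300) for s, sx in zip(resp_sum, resp_x_sum)]

    # Return components ordered from slowest to fastest decay.
    order = sorted(range(n_components), key=lambda j: rates[j])
    return [weights[j] for j in order], [rates[j] for j in order]
```

Calling `fit_exponential_mixture(distances)` on a list of non-negative distances returns the mixture weights and decay rates, slowest-decaying component first. Extending the sketch to two-stage components would change the density and the M-step, which no longer has this simple closed form.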