Template-Based Continuous Speech Recognition

Despite their known weaknesses, hidden Markov models (HMMs) have been the dominant technique for acoustic modeling in speech recognition for over two decades. Still, the advances in the HMM framework have not solved its key problems: it discards information about time dependencies and is prone to overgeneralization. In this paper, we attempt to overcome these problems by relying on straightforward template matching. The basis for the recognizer is the well-known dynamic time warping (DTW) algorithm. However, applying classical DTW to continuous speech recognition results in an explosion of the search space. The traditional top-down search is therefore complemented with a data-driven selection of candidates for DTW alignment. We also extend the DTW framework with a flexible subword unit mechanism and a class-sensitive distance measure, two components suggested by state-of-the-art HMM systems. The added flexibility of the unit selection in the template-based framework leads to new approaches to speaker and environment adaptation. The template matching system performs somewhat worse than the best published HMM results for the Resource Management benchmark, but because the HMM and DTW systems make complementary errors, combining the two yields a 17% reduction in word error rate compared to the HMM results.
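
For readers unfamiliar with DTW, the following is a minimal sketch of classical DTW template matching on isolated sequences. It is not the recognizer described in the abstract: it omits the continuous-speech search, the data-driven candidate selection, and the subword units, and it assumes a plain Euclidean frame distance in place of the class-sensitive measure. The function names `dtw_distance` and `classify` and the toy templates are hypothetical, chosen only for illustration.

```python
import math


def dtw_distance(query, template):
    """Cumulative cost of the best DTW alignment between two frame sequences.

    Each sequence is a list of equal-length feature tuples (one per frame).
    The local frame distance is Euclidean, which is an assumption of this
    sketch and not the distance measure used in the paper.
    """
    n, m = len(query), len(template)
    # cost[i][j] = best cost of aligning query[:i] with template[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(query[i - 1], template[j - 1])
            # Allowed steps: skip ahead in either sequence, or advance in both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the stored template closest to the query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Toy one-dimensional "feature" sequences standing in for acoustic frames.
templates = {
    "up": [(0.0,), (1.0,), (2.0,)],
    "down": [(2.0,), (1.0,), (0.0,)],
}
stretched_up = [(0.0,), (0.0,), (1.0,), (2.0,)]
print(classify(stretched_up, templates))
```

The query here is a time-stretched copy of the "up" template, so the warping path absorbs the repeated frame and the alignment cost to that template is zero. Matching every stored template against every possible segment of a continuous utterance in this way is what makes the search space grow so quickly.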

Augmenting the Radio Experience by Enhancing Interactions between Radio Editors and Listeners

Claes

Bauwens

Matton

2018

Minimum classification error training in example based speech and pattern recognition using sparse weight matrices

Matton

Compernolle

Cools

2010

Journal of Computational and Applied Mathematics

Enabling Semantic Search in a News Production Environment

Debevere

Deursen

Rijsselbergen

et al. 2011

News production is characterized by a complex and dynamic workflow in which reliable news must be produced and broadcast as fast as possible. In this process, the efficient retrieval of previously broadcast news items is important, both for gathering background information and for reusing footage in new reports. This paper discusses how the quality of the descriptive metadata of news items can be improved by collecting data generated during news production. Starting from a description of the news production process of the Flemish public service broadcaster in Belgium (VRT), information systems containing valuable metadata are identified. Subsequently, we present a data model that uniformly represents the available information generated during news production. This data model is then implemented using Semantic Web technologies. Further, we describe how other valuable data sets available on the Semantic Web are connected to the data model, enabling semantic search operations.

The MPEG-7 Audiovisual Description Profile (AVDP) and its application to multi-view video

Sano

Bailer

Messina

et al. 2013

This paper describes a new MPEG-7 profile called AVDP (Audiovisual Description Profile). First, some problems with conventional MPEG-7 profiles are described, and the motivation behind the development of AVDP is explained based on requirements from broadcasters and other actors in the media industry. Second, the scope and functionalities of AVDP are described, along with its differences from the existing profiles and its basic structure and components. Some useful software tools for handling AVDP, including tools for validation and visualization, are discussed. Finally, the use of AVDP to represent multi-view and panoramic video content is described.

Mike Matton

Template-Based Continuous Speech Recognition
