Proceedings of the Working Conference on Advanced Visual Interfaces (AVI '06), 2006
DOI: 10.1145/1133265.1133338
The prospects for unrestricted speech input for TV content search

Abstract: The need for effective search for television content is growing as the number of choices for TV viewing and/or recording explodes. In this paper we describe a preliminary prototype of a multimodal Speech-In List-Out (SILO) interface in which users' input is unrestricted by vocabulary or grammar. We report on usability testing with a sample of six users. The prototype enables search through video content metadata downloaded from an electronic program guide (EPG) service. Our setup for testing included adding a mi…
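The paper does not publish its implementation, but the abstract describes the SILO pattern clearly enough to sketch: the recognizer's best hypothesis is treated as an unrestricted free-text query against downloaded EPG metadata, and a ranked list of matching programs is shown for the user to pick from. The following minimal Python sketch is a hypothetical illustration of that pattern only; the names (EpgEntry, silo_search) and the word-overlap scoring are illustrative assumptions, not the authors' method.

# Hypothetical sketch of a Speech-In List-Out (SILO) search step,
# assuming EPG metadata has already been downloaded. Not the paper's code.
from dataclasses import dataclass

@dataclass
class EpgEntry:
    title: str
    description: str
    channel: str

def tokenize(text: str) -> set[str]:
    # Lowercase word set; punctuation-bearing tokens are dropped for simplicity.
    return {t for t in text.lower().split() if t.isalnum()}

def silo_search(asr_hypothesis: str, epg: list[EpgEntry], k: int = 5) -> list[EpgEntry]:
    """Rank EPG entries by word overlap with the recognized query."""
    query = tokenize(asr_hypothesis)

    def score(entry: EpgEntry) -> int:
        return len(query & (tokenize(entry.title) | tokenize(entry.description)))

    scored = sorted(((score(e), e) for e in epg), key=lambda p: p[0], reverse=True)
    # "List-Out": return the top-k hits and let the user disambiguate by choosing.
    return [e for s, e in scored[:k] if s > 0]

epg = [
    EpgEntry("Nova", "Science documentary series", "PBS"),
    EpgEntry("Evening News", "Daily news broadcast", "NBC"),
]
# The ASR hypothesis may contain recognition errors; listing candidates
# lets the user recover without constraining vocabulary or grammar.
print(silo_search("science documentary about space", epg))

Because the input is unrestricted, a real system would likely score against richer metadata fields (cast, genre, synopsis) and tolerate recognition errors, but the list-out step is the essential recovery mechanism.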

Cited by 22 publications (17 citation statements) | References 7 publications
“…In this way, speech-based queries could be better matched with speech-based metadata that pertains to the content, which in our example would be restaurant names, menu items, information about ratings, etc. Similar challenges for search and retrieval of multimedia information as described in this example also exist in the mobile and television environments; see, e.g., the work of Wittenburg et al. [39] on applying speech-based queries to EPG search on the television. While the solutions may be different depending on the context and type of media to be consumed, it is necessary to identify common requirements for future metadata standards that enable richer forms of multimodal input, such as voice and gestures, to be used for multimedia information retrieval as illustrated in Fig.…”
Section: A. Ease of Use (mentioning)
confidence: 82%
“…This is time-consuming and complex for users. Emerging multimodal [10], gestural [11], and auxiliary [12] input interfaces show promise but currently address niche requirements and require more research before they can realistically match user expectations.…”
Section: Inline Search for TV (mentioning)
confidence: 99%
“…After a large consumer survey [3], we have focused on developing a multimodal user interface for a media center. This application area is rapidly becoming popular in homes, and it provides opportunities and challenges for multimodal user interaction [1,2]. Our media center provides users with full control over digital television content, including an advanced electronic program guide (EPG).…”
Section: Media Center Application (mentioning)
confidence: 99%