Human-Computer Interaction: INTERACT ’97 (1997)
DOI: 10.1007/978-0-387-35175-9_70

Speech timing prediction in multimodal human-computer interaction

Abstract: This paper proposes a quantitative model of natural modality integration for speech and pointing gestures. An experiment is described that studies the temporal synchronization between speech and pointing gestures during multimodal interaction. The end of a pointing gesture (MT) is shown to be synchronized with either the key word of an expression or the deictic marker of a deictic expression. It is also shown that MT tends to be slightly behind the beginning of its associated word, this tendency becoming more marked …
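The reported synchronization pattern lends itself to a simple alignment rule. Below is a minimal illustrative sketch, not the paper's actual model: it associates a gesture end time MT with the candidate word whose onset best matches it, assuming a small fixed lag of MT behind the word onset. The word list, timestamps, and lag value are all hypothetical.

```python
# Illustrative sketch (not from the paper): aligning a pointing-gesture
# end time (MT) with candidate word onsets, following the reported
# tendency for MT to fall slightly after the onset of its associated
# word. The expected lag and the example utterance are assumptions.

from dataclasses import dataclass

@dataclass
class Word:
    text: str
    onset: float  # seconds from utterance start

def associate_gesture_with_word(mt: float, words: list[Word],
                                expected_lag: float = 0.1) -> Word:
    """Pick the word whose onset best explains a gesture ending at `mt`,
    assuming MT tends to trail the word onset by roughly `expected_lag` s."""
    return min(words, key=lambda w: abs((mt - w.onset) - expected_lag))

words = [Word("put", 0.00), Word("that", 0.35), Word("there", 0.80)]
print(associate_gesture_with_word(mt=0.47, words=words).text)  # -> "that"
```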

Cited by 5 publications (6 citation statements). References 9 publications.
“…But all the proposed models are essentially of a qualitative nature. Moreover, very few studies have been done in the Human-Computer Interaction (HCI) domain [1] [3].…”
Section: Introduction (mentioning)
confidence: 99%
“…This does not necessarily mean that there will be no overlap in performance, e.g. it is possible that someone can begin to move the cursor on a drawing package and then speak a command before the cursor has reached its end-point (Bourguet & Ando, 1997). However, it does mean that MHCI is not simply a matter of combining two or more 'unimodal' unit-tasks.…”
Section: Discussion (mentioning)
confidence: 94%
“…For example, timing information from 3D hand pointing gestures has been used to automatically detect recognition errors in speech [53] [54]. Experimental studies have shown that, during speech and gesture multimodal interaction, 3D hand pointing gestures tend to be synchronised with either the nominal or the deictic ("this", "that", "here", etc.)…”
Section: Automatic Detection (mentioning)
confidence: 99%
“…In [54], it is shown that the use of a speech and hand gestures synchronisation model can result in the recovery of up to a third of speech recognition errors.…”
Section: Automatic Detection (mentioning)
confidence: 99%
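The two statements above describe a concrete use of the synchronization model: flagging and recovering speech recognition errors from gesture timing. The sketch below is a hedged illustration, not the method of [54]: it re-ranks hypothetical N-best recognition outputs by how consistently each hypothesis's deictic word onset explains the observed gesture end time MT. The word timings, vocabulary, and lag constant are all assumptions.

```python
# Illustrative sketch (assumptions throughout, not the cited method):
# re-ranking N-best speech hypotheses with a speech/gesture
# synchronisation model. A hypothesis whose deictic word onset is
# temporally consistent with the gesture end time (MT) is preferred.

DEICTICS = {"this", "that", "here", "there"}
EXPECTED_LAG = 0.1  # assumed lag of MT behind the word onset, in seconds

def timing_penalty(hypothesis: list[tuple[str, float]], mt: float) -> float:
    """Penalty = deviation of (MT - deictic onset) from the expected lag.
    `hypothesis` is a list of (word, onset) pairs; returns inf if the
    hypothesis contains no deictic word to anchor the gesture."""
    onsets = [onset for word, onset in hypothesis if word in DEICTICS]
    if not onsets:
        return float("inf")
    return min(abs((mt - onset) - EXPECTED_LAG) for onset in onsets)

# Two hypothetical recogniser outputs for the same audio:
nbest = [
    [("put", 0.0), ("hat", 0.35), ("there", 0.8)],   # misrecognition
    [("put", 0.0), ("that", 0.35), ("there", 0.8)],  # correct transcript
]
mt = 0.47  # gesture ends shortly after "that" begins
best = min(nbest, key=lambda hyp: timing_penalty(hyp, mt))
print(" ".join(word for word, _ in best))  # -> "put that there"
```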