2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP)
DOI: 10.1109/globalsip.2016.7906033

Active speaker detection in human machine multiparty dialogue using visual prosody information

Cited by 14 publications (21 citation statements) | References 16 publications
“…In these studies, it was shown that lip information in the speech section and in the time section immediately before speech is useful for improving the performance of ASD. These previous research results [30][31][32][33][34][36] support the validity of our approach for predicting the next speaker and the utterance interval using the mouth-opening pattern at the end of an utterance.…”
Section: Mouth-opening Movement and Speaking (supporting, confidence: 77%)
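
As a rough illustration of the cited idea, the sketch below extracts the mouth-opening pattern over the last second of an utterance, the window this citing work uses for next-speaker prediction. The frame rate, the landmark layout, and both helper names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

FPS = 30  # assumed video frame rate

def lip_aperture(landmarks):
    """Frame-wise vertical mouth opening.

    landmarks: (n_frames, 68, 2) array in the common iBUG 68-point
    layout, where indices 62 and 66 are the inner-lip midpoints.
    """
    return np.linalg.norm(landmarks[:, 62] - landmarks[:, 66], axis=1)

def end_of_utterance_pattern(landmarks, end_frame, window_s=1.0):
    """Mouth-opening time series over the final second of an utterance."""
    w = int(window_s * FPS)
    return lip_aperture(landmarks[max(0, end_frame - w):end_frame])
```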
“…To focus on lip information, Cutler et al. used image features of the mouth region together with the audio information [33]. Haider et al. used head movements in addition to lip information during speech [34]. They also showed that lip- and head-movement features from the one second before the start of speech are useful for improving the performance of ASD [35].…”
Section: Mouth-opening Movement and Speaking (mentioning, confidence: 99%)
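
In the same hedged spirit, a pre-speech window can pool both lip and head-movement statistics. The head-pose layout and the reuse of the lip_aperture helper from the sketch above are assumptions for illustration, not the authors' feature set.

```python
import numpy as np

FPS = 30  # assumed video frame rate

def pre_speech_features(landmarks, head_pose, onset_frame, window_s=1.0):
    """Pool lip- and head-movement statistics over the second before speech.

    head_pose: (n_frames, 3) array of (yaw, pitch, roll) angles in degrees,
    e.g. from a head-pose estimator; lip_aperture is the helper defined
    in the previous sketch.
    """
    w = int(window_s * FPS)
    lo = max(0, onset_frame - w)
    ap = lip_aperture(landmarks[lo:onset_frame])  # mouth-opening signal
    head_vel = np.abs(np.diff(head_pose[lo:onset_frame], axis=0))  # angular speed
    return np.array([ap.mean(), ap.std(), head_vel.mean(), head_vel.max()])
```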
“…This study continues the authors' past work [8,9], which demonstrated the use of lip and head movements during speech articulation for active speaker detection but did not assess the discriminative power of visual prosody captured just before and/or after articulation. In this study, we propose methods for detecting active speakers using visual prosody information from the one second before/after speech articulation, and we also evaluate the visual prosody of the first second of the speech utterance.…”
Section: Introduction (supporting, confidence: 78%)
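
A minimal end-to-end sketch of how such windowed visual-prosody features could drive an active-speaker classifier, using scikit-learn and synthetic stand-in data; this illustrates the pipeline shape only, not the paper's model or results.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: one visual-prosody feature vector per
# (participant, one-second window), labeled 1 if that participant
# is the active speaker for the associated utterance.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")
```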
“…An audio-visual dataset [8,9] was collected in a task-free dialogue setting. Four participants (3 males and 1 female) converse with the "machine", but they are not allowed to speak with each other directly.…”
Section: Data Collection (mentioning, confidence: 99%)