Proceedings of the 29th ACM International Conference on Multimedia 2021
DOI: 10.1145/3474085.3475275

UniCon: Unified Context Network for Robust Active Speaker Detection

Abstract: We introduce a new efficient framework, the Unified Context Network (UniCon), for robust active speaker detection (ASD). Traditional methods for ASD usually operate on each candidate's precropped face track separately and do not sufficiently consider the relationships among the candidates. This potentially limits performance, especially in challenging scenarios with low-resolution faces, multiple candidates, etc. Our solution is a novel, unified framework that focuses on jointly modeling multiple types of cont…

Cited by 25 publications (36 citation statements)
References 35 publications
“…
Roth et al [28]   MobileNet [14]                        79.2
Zhang et al [38]  3D ResNet-18 [31] + VGG-M [5]         84.0
MAAS-LAN [19]     2D ResNet-18 [13]                     85.1
Chung et al [6]   VGG-M [5] + 3D Conv                   85.5
ASC [2]           2D ResNet-18 [13]                     87.1
MAAS-TAN [19]     2D ResNet-18 [13]                     88.8
UniCon [39]       2D ResNet-18 [13]                     92.0
TalkNet [35]      2D ResNet-18/34 [13] + 3D Conv        92.3
ASDNet [18]       3D ResNeXt-18 [17] + SincDSNet [27]   93.5
SPELL (Ours)      2D ResNet-18-TSM [13,20]              94.2
SPELL+ (Ours)     2D ResNet-50-TSM [13,20]              94.9
…”
Section: Methods (mentioning)
confidence: 99%
“…
               Stage-1 mAP  Final mAP  ∆mAP
MAAS-LAN [19]  79.5         85.1       5.6
ASC [2]        79.5         87.1       7.6
MAAS-TAN [19]  80.2         88.8       8.6
Unicon [39]    84.0         92.0       8.0
ASDNet [18]    88.9…”
Section: Methods (mentioning)
confidence: 99%
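
The ∆mAP column in the statement above appears to be the difference between the final mAP and the Stage-1 mAP. The quote does not state this definition, so it is an assumption here. A minimal Python sketch that checks this reading against the four complete rows is given below. The function name delta_map is hypothetical, and the truncated ASDNet row is left out because its final mAP is cut off in the quote.

```python
# Rows transcribed from the quoted table:
# (method, stage-1 mAP, final mAP, reported delta mAP).
# The ASDNet row is omitted: its final mAP is truncated in the quote.
rows = [
    ("MAAS-LAN", 79.5, 85.1, 5.6),
    ("ASC", 79.5, 87.1, 7.6),
    ("MAAS-TAN", 80.2, 88.8, 8.6),
    ("Unicon", 84.0, 92.0, 8.0),
]


def delta_map(stage1: float, final: float) -> float:
    """Assumed definition: mAP gain from the first stage to the final output.

    Rounded to one decimal place to match the precision of the table and to
    avoid floating-point noise in the subtraction.
    """
    return round(final - stage1, 1)


for method, stage1, final, reported in rows:
    computed = delta_map(stage1, final)
    status = "matches" if computed == reported else "differs from"
    print(f"{method}: computed {computed} {status} reported {reported}")
```

Under this assumed definition, the computed differences agree with the reported ∆mAP values for all four complete rows.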