2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.00624

Transferring Cross-Domain Knowledge for Video Sign Language Recognition

Cited by 103 publications (49 citation statements)
References 31 publications
“…Alternative approaches investigate the use of Multiple Instance Learning [6,28,48]. Other recent contributions leverage words from audio-aligned subtitles with keyword spotting methods based on mouthing cues [1], dictionaries [45] and attention maps generated by transformers [61] to annotate large numbers of signs, as well as to learn domain invariant features for improved sign recognition through joint training [36].…”
Section: Related Work
confidence: 99%
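
One of the approaches quoted above, [36], learns domain-invariant features for sign recognition through joint training. A generic sketch of joint training with a shared encoder follows; it is not the specific architecture or loss of [36], and the encoder design, optimizer, and vocabulary size are illustrative assumptions.

import tensorflow as tf
from tensorflow.keras import layers, models

# Shared encoder applied to clips from both domains (e.g. isolated-sign
# videos and news-broadcast signs); design and sizes are assumptions.
encoder = models.Sequential([
    layers.Conv3D(32, 3, padding="same", activation="relu"),
    layers.GlobalAveragePooling3D(),
    layers.Dense(256, activation="relu"),
])
classifier = layers.Dense(100)  # shared sign vocabulary; size assumed

optimizer = tf.keras.optimizers.Adam(1e-4)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

def joint_step(clips_a, labels_a, clips_b, labels_b):
    # One optimization step on a batch from each domain through the shared
    # encoder; minimizing both losses jointly pushes the encoder toward
    # features that transfer across domains.
    with tf.GradientTape() as tape:
        loss = (loss_fn(labels_a, classifier(encoder(clips_a)))
                + loss_fn(labels_b, classifier(encoder(clips_b))))
    variables = encoder.trainable_variables + classifier.trainable_variables
    optimizer.apply_gradients(zip(tape.gradient(loss, variables), variables))
    return loss
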
“…On the other hand, Zhang et al. in [70] proposed the Multiple Extraction and Multiple Prediction (MEMP) network, which consists of alternating 3D-CNN networks and convolutional LSTM layers that extract spatio-temporal features from video sequences multiple times, enabling the network to achieve 99.06% and 78.85% accuracy on the LSA64 and IsoGD datasets, respectively. Li et al. in [71] proposed an SLR method based on transferring cross-domain knowledge of news signs to a base model and improving its performance using domain-invariant features.…”
Section: Sign Language Recognition
confidence: 99%
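
The MEMP design quoted above alternates 3D convolutions with convolutional LSTM layers. A minimal sketch of such an alternating extractor in TensorFlow/Keras follows; the layer counts, filter sizes, and pooling head are illustrative assumptions, not the published MEMP configuration.

from tensorflow.keras import layers, models

def build_memp_style_model(num_classes, frames=16, height=112, width=112):
    # Input: a clip of `frames` RGB frames.
    inputs = layers.Input(shape=(frames, height, width, 3))
    x = inputs
    # Alternate a 3D convolution (local spatio-temporal features) with a
    # ConvLSTM block (longer-range temporal modeling), so features are
    # extracted from the sequence multiple times.
    for filters in (32, 64):
        x = layers.Conv3D(filters, kernel_size=3, padding="same",
                          activation="relu")(x)
        x = layers.MaxPooling3D(pool_size=(1, 2, 2))(x)
        x = layers.ConvLSTM2D(filters, kernel_size=3, padding="same",
                              return_sequences=True)(x)
    x = layers.GlobalAveragePooling3D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)

model = build_memp_style_model(num_classes=64)  # e.g. the 64 classes of LSA64
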
“…Much of the prior work has been limited to data collected in a controlled environment. There has been a growing interest in sign language recognition "in the wild" (naturally occurring sign language media), which includes challenging visual conditions such as lighting variation, visual clutter, and motion blur, and often also more natural signing styles [15,16,3,24,32,33,47,46]. Two recently released datasets of fingerspelling in the wild [47,46] include data from 168 signers and tens of thousands of fingerspelling segments; these are the testbeds used in our experiments.…”
Section: Related Work
confidence: 99%