2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022
DOI: 10.1109/cvpr52688.2022.00458

Knowledge Mining with Scene Text for Fine-Grained Recognition

Cited by 9 publications (4 citation statements)
References 30 publications
“…developed a deep learning method based on an end‐to‐end trainable network that mines implicit contextual knowledge behind scene text image and enhance the semantics and correlation to fine‐tune the image representation, which outperformed the SOTA by 3.72% and 5.39%. [ 29 ] Q. Song et al.…”
Section: Discussion (mentioning)
confidence: 99%
“…[28] For example, H. Wang et al developed a deep learning method based on an end-to-end trainable network that mines implicit contextual knowledge behind scene text image and enhance the semantics and correlation to fine-tune the image representation, which outperformed the SOTA by 3.72% and 5.39%. [29] Q. Song et al proposed a deep learning method with multimodal sparse transformer network (MMST) and achieved a better performance (≈5% lower word error rate compared to SOTA) for different types of noise (−5 to 10 dB).…”
Section: Discussion (mentioning)
confidence: 99%
“…Since their proposal, subsequent contribution have addressed limitations of their approach and adapted it to other challenges. Recent publications extend the ability of NeRFs to support dynamic scenes [PSB*21; PSH*21; PCPM21; TTG*21; LNSW21], accelerating inference time [RPLG21; MESK22; YLT*21; FYT*22; CXG*22; LSS*21; WZL*22; CFHT23], making them robust against the challenges of in‐the‐wild image capture [MRS*21; TCY*22; MHM*22; RLS*22], reducing the required image count [YYTK21; DLZR22; NBM*22; YPW23; RMY*22] and enabling dynamic relighting [ZSD*21; SDZ*21; MHS*22].…”
Section: Related Work (mentioning)
confidence: 99%