“…First, we notice that using the spatial encoder to implicitly extract intra-frame contexts yields only a small benefit. Since global scene context is useful for vision-related tasks (Wang et al., 2019; Zhang et al., 2021; Ji et al., 2022), we extend the spatial encoder to explicitly generate a global feature vector for each frame. Inspired by the Vision Transformer (ViT) (Dosovitskiy et al., 2021), we prepend a learnable class token to the spatial encoder input, which captures the global relationship among all human-object pairs at a particular moment.…”
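The class-token mechanism described above can be sketched as follows. This is a minimal NumPy illustration with hypothetical dimensions and a toy single-head self-attention, not the paper's actual spatial encoder: a learnable token is prepended to the per-pair tokens of one frame, attends over all of them, and its output serves as the frame-level global feature.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8            # token dimension (hypothetical)
num_pairs = 5    # human-object pair tokens in one frame (hypothetical)

def self_attention(x, Wq, Wk, Wv):
    # single-head scaled dot-product self-attention over all tokens
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ v

# learnable class token (randomly initialized here; trained in practice)
cls_token = rng.normal(size=(1, d))
pair_tokens = rng.normal(size=(num_pairs, d))  # per-pair representations

# prepend the class token, then attend over all tokens jointly
x = np.concatenate([cls_token, pair_tokens], axis=0)  # (num_pairs + 1, d)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)

# the class-token output summarizes the whole frame
global_feature = out[0]
print(x.shape, global_feature.shape)  # (6, 8) (8,)
```

Because the class token participates in attention alongside every human-object pair token, its output aggregates information from all pairs in the frame, which is what makes it usable as an explicit global feature vector.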