H2FA R-CNN: Holistic and Hierarchical Feature Alignment for Cross-domain Weakly Supervised Object Detection

Xu, Yunqiu; Yang, Zongxin; Miao, Jiaxu; Yang, Yi

doi:10.1109/cvpr52688.2022.01393

Cited by 20 publications

(8 citation statements)

References 53 publications


“…A frequency division operation can extract feature maps at different frequencies to achieve the goal of preserving detail and compressing noise. Recently, data-driven approaches based on generative adversarial networks (GANs) or convolution neural networks (CNNs) have shown strong feature representation capability, which was widely applied in image enhancement, image super-resolution, object recognition, and so on [42, 43, 44, 45, 63]. Unfortunately, although these LLIE methods significantly promote contrast, saturation, and brightness, remove the color deviation, and highlight the structural details, they heavily depend on computer resources owing to the depth or width of the network.…”

Section: Methods (mentioning)

confidence: 99%

“…Multiscale learning structure: Generally, the image exhibits different characteristics at various scales, and a multiscale representation can effectively extract its information at different scales and promote the performance of learning-based methods [15, 56]. As a result, the multiscale learning strategy has broadly been conducted on object identification, pose recognition, face detection, and other computer vision tasks [42, 43, 44, 45]. However, this strategy is rarely considered in most state-of-the-art LLIE models.…”

Section: Methods (mentioning)

confidence: 99%

“…In recent years, learning-based methods containing supervised and unsupervised learning strategies have outperformed traditional approaches in feature representation and extraction and have been applied in object detection, image processing, and other computer vision assignments [42, 43, 44, 45]. LLNet [27], a groundbreaking work for LLIE, stacked sparse denoising autoencoders for light improvement and denoising at once.…”

Section: Related Work (mentioning)

confidence: 99%

FDMLNet: A Frequency-Division and Multiscale Learning Network for Enhancing Low-Light Image

Gong, Liu et al. 2022, Sensors

Low-illumination images exhibit low brightness, blurry details, and color casts, which present us an unnatural visual experience and further have a negative effect on other visual applications. Data-driven approaches show tremendous potential for lighting up the image brightness while preserving its visual naturalness. However, these methods introduce hand-crafted holes and noise enlargement or over/under enhancement and color deviation. For mitigating these challenging issues, this paper presents a frequency division and multiscale learning network named FDMLNet, including two subnets, DetNet and StruNet. This design first applies the guided filter to separate the high and low frequencies of authentic images, then DetNet and StruNet are, respectively, developed to process them, to fully explore their information at different frequencies. In StruNet, a feasible feature extraction module (FFEM), grouped by multiscale learning block (MSL) and a dual-branch channel attention mechanism (DCAM), is injected to promote its multiscale representation ability. In addition, three FFEMs are connected in a new dense connectivity meant to utilize multilevel features. Extensive quantitative and qualitative experiments on public benchmarks demonstrate that our FDMLNet outperforms state-of-the-art approaches benefiting from its stronger multiscale feature expression and extraction ability.

“…They focus on object classification or object detection tasks. Related methods try adapting classifiers or detectors from natural to artificial images [74,80]. However, as demonstrated in Table 5, several limitations of these datasets make them hard to bridge natural and artificial human-centric tasks.…”

Section: Datasets For Multi-scenario Generalization (mentioning)

confidence: 99%

“…[69] fine-tune Faster R-CNN on People-Art to detect humans in artworks. H2FA R-CNN [74] proposes a Holistic and Hierarchical Feature Alignment R-CNN to enforce image-level alignment for object detection. [15] use image-level domain transfer and pseudo-labels from the source domain to train object detector SSD300 [35].…”

Section: Datasets For Multi-scenario Generalization (mentioning)

confidence: 99%

Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes

Ju, Zeng, Wang et al. 2023, Preprint

Humans have long been recorded in a variety of forms since antiquity. For example, sculptures and paintings were the primary media for depicting human beings before the invention of cameras. However, most current human-centric computer vision tasks like human pose estimation and human image generation focus exclusively on natural images in the real world. Artificial humans, such as those in sculptures, paintings, and cartoons, are commonly neglected, making existing models fail in these scenarios. As an abstraction of life, art incorporates humans in both natural and artificial scenes. We take advantage of it and introduce the Human-Art dataset to bridge related tasks in natural and artificial scenarios. Specifically, Human-Art contains 50k high-quality images with over 123k person instances from 5 natural and 15 artificial scenarios, which are annotated with bounding boxes, keypoints, self-contact points, and text information for humans represented in both 2D and 3D. It is, therefore, comprehensive and versatile for various downstream tasks. We also provide a rich set of baseline results and detailed analyses for related tasks, including human detection, 2D and 3D human pose estimation, image generation, and motion transfer. As a challenging dataset, we hope Human-Art can provide insights for relevant research and open up new research questions.

Low-light DEtection TRansformer (LDETR): object detection in low-light and adverse weather conditions

Tiwari, Pattanaik, Sharma 2024, Multimed Tools Appl
