Salient Object Detection by Spatiotemporal and Semantic Features in Real-Time Video Processing Systems

Fang, Yuming; Ding, Guanqun; Wen, Wenying; Yuan, Feiniu; Yang, Yong; Fang, Zhijun; Wang, Lin

doi:10.1109/tie.2019.2956418

Cited by 12 publications

(6 citation statements)

References 64 publications

“…In context embedding object detection networks, backbone features are attached to three parallel branches with dilation sizes of 3, 6 and 12 to form the context embedding module and to incorporate surrounding information. Fang [21] fused the semantic object feature extraction module (Conv2dNet), the spatiotemporal feature extraction module (Conv3DNet) and the saliency feature-sharing module to generate the final saliency map for real-time video processing. Wang [22] combined dual-branch feature extraction and gradually refined the cross-fusion module in the network for camouflaged object detection.…”

Section: Related Work (mentioning)

confidence: 99%

A Novel Electronic Chip Detection Method Using Deep Neural Networks

2022

Electronic chip detection is widely used in electronic industries. However, most existing detection methods cannot handle chip images with multiple classes of chips or complex backgrounds, which are common in real applications. To address these problems, a novel chip detection method that combines attentional feature fusion (AFF) and cosine nonlocal attention (CNLA), is proposed, and it consists of three parts: a feature extraction module, a region proposal module, and a detection module. The feature extraction module combines an AFF-embedded CNLA module and a pyramid feature module to extract features from chip images. The detection module enhances feature maps with a region intermediate feature map by spatial attentional block, fuses multiple feature maps with a multiscale region of the fusion block of interest, and classifies and regresses objects in images with two branches of fully connected layers. Experimental results on a medium-scale dataset comprising 367 images show that our proposed method achieved mAP0.5=0.98745 and outperformed the benchmark method.

“…High-efficiency video coding (HEVC) [1] is the latest video coding standard that was published by ISO/IEC MPEG, and ITU-T VCEG formed the Joint Collaborative Team on Video Coding (JCT-VC) in 2013, which has a high efficiency to compress video. HEVC is adapted to the transmission and storage from small-scale multimedia networks to large scale TV distributors and thus has been widely used in daily life [2][3][4][5]. Video contains an enormous amount of information including private, sensitive and copyright items [6][7][8], which would be easily leaked in an unreliable public channel and the insecurity of the cloud service.…”

Section: Introduction (mentioning)

confidence: 99%

A multi-level approach with visual information for encrypted H.265/HEVC videos（China MM）

Wen, Zhang, et al. 2022

Preprint

Self Cite

High-efficiency video coding (HEVC) encryption has been proposed to encrypt syntax elements for the purpose of video encryption. To achieve high video security, to the best of our knowledge, almost all of the existing HEVC encryption algorithms mainly encrypt the whole video, such that the user without permissions cannot obtain any viewable information. However, these encryption algorithms cannot meet the needs of customers who need part of the information but not the full information in the video. In many cases, such as professional paid videos or video meetings, users would like to observe some visible information in the encrypted video of the original video to satisfy their requirements in daily life. Aiming at this demand, this paper proposes a multi-level encryption scheme that is composed of lightweight encryption, medium encryption and heavyweight encryption, where each encryption level can obtain a different amount of visual information. First, we employ AES-CTR to generate a pseudo-random number sequence. Then, the main syntax elements in the H.265/HEVC encoding process are encrypted by a pseudorandom sequence. In the lightweight encryption level, the syntax element of the luma intraprediction model is chosen for encryption. In the medium encryption level, the syntax element of the discrete cosine transform (DCT) coefficient sign is employed for scrambling encryption. In the heavyweight encryption level, syntax elements of both the luma intraprediction model and the DCT coefficient sign are encrypted simultaneously by the pseudorandom sequence. It is found that both encrypting the luma intraprediction model (IPM) and scrambling the syntax element of the DCT coefficient sign can achieve the performance of a distorted video in which there is still residual visual information, while encrypting both of them can implement the intensity of encryption and one cannot gain any visual information. 
The experimental results meet our expectations appropriately, indicating that there is a different amount of visual information in each encryption level. Meanwhile, users can flexibly choose the encryption level according to their various requirements.
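
The three encryption levels described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: it operates on plain Python lists rather than an HEVC bitstream, and it substitutes a SHA-256 counter-mode generator for the AES-CTR keystream named in the abstract, because the Python standard library has no AES. The function names (`keystream`, `process_level`), the level labels, the modular-addition rule for the intra prediction modes, and the bit-driven sign flip are all assumptions made for illustration. The only value taken from the standard is that HEVC defines 35 luma intra prediction modes.

```python
import hashlib

NUM_INTRA_MODES = 35  # HEVC luma intra prediction modes are numbered 0..34


def keystream(key: bytes, n: int) -> bytes:
    """Return n pseudorandom bytes.

    Stand-in only: SHA-256 in counter mode, NOT the AES-CTR generator
    named in the abstract.
    """
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def process_level(modes, coeffs, key, level, decrypt=False):
    """Encrypt or decrypt toy syntax elements at one level.

    level = "light"  -> only intra prediction modes are changed
    level = "medium" -> only DCT coefficient signs are changed
    level = "heavy"  -> both are changed
    """
    if level not in ("light", "medium", "heavy"):
        raise ValueError("unknown level: " + level)
    ks = keystream(key, len(modes) + len(coeffs))
    mode_ks, sign_ks = ks[:len(modes)], ks[len(modes):]
    out_modes, out_coeffs = list(modes), list(coeffs)
    if level in ("light", "heavy"):
        # Shift each mode by a keystream byte, modulo the mode count, so the
        # result is still a valid mode index; decryption subtracts the shift.
        step = -1 if decrypt else 1
        out_modes = [(m + step * k) % NUM_INTRA_MODES
                     for m, k in zip(modes, mode_ks)]
    if level in ("medium", "heavy"):
        # Flip a coefficient's sign when the keystream byte is odd.
        # Flipping twice restores the value, so decryption is the same step.
        out_coeffs = [-c if k & 1 else c for c, k in zip(coeffs, sign_ks)]
    return out_modes, out_coeffs
```

Calling `process_level(modes, coeffs, key, "heavy")` and then calling it again on the result with `decrypt=True` returns the original lists. With `"light"` the coefficients pass through unchanged, and with `"medium"` the modes pass through unchanged, which mirrors the idea that the lighter levels leave some visual information intact.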

“…individual RGB/color images [25]-[30] or sequences [31]-[35]. As depth cameras, such as Kinect and RealSense, become more and more popular, SOD from RGB-D inputs ("D" refers to depth) is emerging as an attractive research topic.…”

Section: Introduction (mentioning)

confidence: 99%

Siamese Network for RGB-D Salient Object Detection and Beyond

Fan et al. 2020

Preprint
