Existing learning-based unsupervised hashing methods typically use a pre-trained network to extract features and then build a similarity matrix from the extracted feature vectors to guide hash-code generation via gradient descent. Prior research shows that gradient-descent-based algorithms cause the hash codes of paired images to be updated toward each other's positions during training. In unsupervised training, this causes large fluctuations in the hash codes and limits the efficiency of hash-code learning. In this paper, we propose Deep Unsupervised Hashing with Gradient Attention (UHGA) to address this problem. UHGA consists of three main components: (1) a pre-trained network model extracts image features; (2) the cosine distance between the features of each image pair is computed, and a similarity matrix built from these distances guides hash-code generation; (3) a gradient attention mechanism added during hash-code training attends to the gradients. Experiments on two existing public datasets show that the proposed method obtains more discriminative hash codes.
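Step (2) above can be sketched in a few lines. This is a minimal, hypothetical illustration of building a pairwise cosine-similarity matrix from pre-trained features; the abstract does not specify the exact construction (e.g. any thresholding into a binary similarity target), so only the generic cosine computation is shown.

```python
import numpy as np

def cosine_similarity_matrix(features):
    """Build an N x N pairwise cosine-similarity matrix from N feature vectors.

    Each row of `features` is one image's feature vector from a
    pre-trained network. Rows are L2-normalized so that the dot
    product of any two rows equals their cosine similarity.
    """
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    return normed @ normed.T

# Hypothetical example: 4 images with 8-dimensional pre-trained features.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8))
S = cosine_similarity_matrix(feats)
```

A matrix like `S` would then serve as the target that supervises hash-code learning, with cosine distance obtainable as `1 - S` if the method uses distances rather than similarities.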
RGB-D salient object detection (SOD) is usually formulated as a classification or regression problem over two modalities, RGB and depth. Existing RGB-D SOD methods use depth cues to improve detection performance but pay little attention to depth-map quality. In practical applications, interference during the acquisition process degrades depth-map quality, which dramatically reduces detection accuracy. In this paper, to minimize interference from the depth maps and emphasize salient objects in RGB images, we propose a layered interactive attention network (LIANet). The network consists of three essential parts: feature encoding, a layered fusion mechanism, and feature decoding. In the feature-encoding stage, a three-dimensional weight is applied to the features of each layer without adding network parameters, making it a lightweight module. The layered fusion mechanism is the most critical part of this paper: RGB and depth features are used alternately for layered interaction and fusion to enhance RGB feature information and gradually integrate global context information at a single scale. In addition, a mixed loss is used to further optimize and train the model. Finally, extensive experiments on six standard datasets demonstrate the effectiveness of the method, which achieves a real-time detection speed of 30 fps on every dataset.
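The abstract does not spell out the parameter-free three-dimensional weighting it uses, so as a hypothetical stand-in, the sketch below implements a common parameter-free 3-D attention of this kind (the SimAM-style inverse-energy formulation): each position of a C x H x W feature map is re-weighted by a sigmoid of its per-channel normalized deviation, adding no learnable parameters.

```python
import numpy as np

def parameter_free_3d_attention(x, eps=1e-4):
    """Re-weight every position of a C x H x W feature map with no learnable parameters.

    SimAM-style inverse energy: positions that deviate more from their
    channel's mean receive larger weights. This is an illustrative
    stand-in, not LIANet's exact module.
    """
    c, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)                     # per-channel mean
    var = ((x - mu) ** 2).sum(axis=(1, 2), keepdims=True) / n   # per-channel variance
    energy = (x - mu) ** 2 / (4 * (var + eps)) + 0.5            # inverse energy per position
    weight = 1.0 / (1.0 + np.exp(-energy))                      # sigmoid -> 3-D weight map
    return x * weight

# Hypothetical usage on a 3-channel 5x5 feature map.
feat = np.random.default_rng(1).standard_normal((3, 5, 5))
out = parameter_free_3d_attention(feat)
```

Because every quantity is derived from the feature map itself, the module adds zero parameters, which matches the lightweight property the abstract claims for the encoding stage.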
Currently, deep learning is the mainstream approach to person re-identification. With the rapid development of neural networks in recent years, many network frameworks have emerged for this task, making it increasingly important to explore a simple and efficient baseline algorithm. In practice, the same module can perform very differently depending on its position in the network architecture. After exploring how modules can contribute most within the network and studying and summarizing existing algorithms, we designed an adaptive multiple loss baseline (AML) with a simple structure but strong performance. In this network, we use an adaptive mining sample loss (AMS) together with other modules, which mines more information from the input samples. Built on triplet loss, AMS loss optimizes the distances between an input sample and its positive and negative samples while preserving structural information within the sample. In our experiments, several groups of tests confirmed the high performance of the AML baseline, which performs strongly on three commonly used datasets; its two evaluation metrics on CUHK-03 are 25.7% and 26.8% higher than BagTricks.
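The abstract states that AMS loss builds on triplet loss but does not give its formulation, so the sketch below shows only the standard triplet loss it extends: pull each anchor's positive sample closer than its negative by at least a margin. The margin value and distance metric are illustrative assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet loss for one (anchor, positive, negative) triplet.

    Penalizes the case where the positive is not at least `margin`
    closer to the anchor than the negative, using Euclidean distance.
    AMS loss (per the abstract) extends this idea with adaptive sample
    mining; that extension is not reproduced here.
    """
    d_pos = np.linalg.norm(anchor - positive)  # anchor-to-positive distance
    d_neg = np.linalg.norm(anchor - negative)  # anchor-to-negative distance
    return max(d_pos - d_neg + margin, 0.0)    # hinge on the margin violation

# Hypothetical 2-D embeddings for illustration.
a = np.array([0.0, 0.0])
p = np.array([0.0, 1.0])
n = np.array([3.0, 0.0])
loss = triplet_loss(a, p, n)  # negative is already far: loss is 0
```

In a full baseline, such a loss is typically computed over mined triplets within each mini-batch, which is where an adaptive mining scheme like AMS would come in.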