Yaoxin Zhuo scite author profile

Yaoxin Zhuo

5 Publications

27 Citation Statements Received

117 Citation Statements Given

How they've been cited

How they cite others

109

117

Affiliations

Arizona State University, Hangzhou Dianzi University

Publications


Weakly Supervised Deep Image Hashing Through Tag Embeddings

Gattupalli

Zhuo

2019


Many approaches to semantic image hashing have been formulated as supervised learning problems that utilize images and label information to learn the binary hash codes. However, large-scale labelled image data is expensive to obtain, which restricts the use of such algorithms. On the other hand, unlabelled image data is abundant due to the existence of many Web image repositories. Such Web images often come with image tags that contain useful information, although raw tags in general do not readily lead to semantic labels. Motivated by this scenario, we formulate the problem of semantic image hashing as a weakly supervised learning problem. We utilize the information contained in the user-generated tags associated with the images to learn the hash codes. More specifically, we extract the word2vec semantic embeddings of the tags and use the information contained in them to constrain the learning. Accordingly, we name our model Weakly Supervised Deep Hashing using Tag Embeddings (WDHT). WDHT is tested on the task of semantic image retrieval and is compared against several state-of-the-art models. Results show that our approach sets a new state of the art in the area of weakly supervised image hashing.


CLIP4Hashing: Unsupervised Deep Hashing for Cross-Modal Video-Text Retrieval

Zhuo

Li

Hsiao

et al. 2022


With the ever-increasing multimedia data on the Web, cross-modal video-text retrieval has received a lot of attention in recent years. Deep cross-modal hashing approaches utilize the Hamming space for achieving fast retrieval. However, most existing algorithms have difficulties in seeking or constructing a well-defined joint semantic space. In this paper, an unsupervised deep cross-modal video-text hashing approach (CLIP4Hashing) is proposed, which mitigates the difficulties in bridging between different modalities in the Hamming space through building a single hashing net by employing the pre-trained CLIP model [24]. The approach is enhanced by two novel techniques, the dynamic weighting strategy and the design of the min-max hashing layer, which are found to be the main sources of the performance gain. Compared with conventional deep cross-modal hashing algorithms, CLIP4Hashing does not require data-specific hyper-parameters. With evaluation using three challenging video-text benchmark datasets, we demonstrate that CLIP4Hashing is able to significantly outperform existing state-of-the-art hashing algorithms. Additionally, with larger bit sizes (e.g., 2048 bits), CLIP4Hashing can even deliver competitive performance compared with the results based on non-hashing features.

CCS Concepts: Information systems → Multimedia and multimodal retrieval; Video search.
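The abstract's point that hashing approaches use the Hamming space for fast retrieval can be illustrated with a minimal sketch. It assumes real-valued features are binarized by a plain sign threshold, which is a generic stand-in and not the min-max hashing layer or dynamic weighting strategy of CLIP4Hashing. The function names `binarize`, `hamming_distance`, and `rank_by_hamming` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack the sign bits of a real-valued vector into one integer code.

    Assumption: a simple sign threshold (value > 0 -> bit 1). The paper's
    own hashing layer is not reproduced here.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: Sequence[int]) -> List[int]:
    """Return database indices ordered from nearest to farthest code."""
    return sorted(
        range(len(database)),
        key=lambda i: hamming_distance(query, database[i]),
    )


if __name__ == "__main__":
    # Hypothetical 4-bit example: one text query against three video codes.
    query_code = binarize([0.3, -1.2, 0.8, -0.1])
    video_codes = [0b0101, 0b1010, 0b1000]
    print(rank_by_hamming(query_code, video_codes))
```

Because comparing two codes reduces to an XOR and a bit count, ranking a database of binary codes is far cheaper than comparing floating-point features, which is the speed advantage the abstract refers to.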


FedNS: Improving Federated Learning for Collaborative Image Classification on Mobile Clients

Zhuo

2021


Federated Learning (FL) is a paradigm that aims to support loosely connected clients in learning a global model collaboratively with the help of a centralized server. The most popular FL algorithm is Federated Averaging (FedAvg), which takes a weighted average of the client models, with the weights determined largely by the dataset sizes at the clients. In this paper, we propose a new approach, termed Federated Node Selection (FedNS), for the server's global model aggregation in the FL setting. FedNS filters and reweights the clients' models at the node/kernel level, hence leading to a potentially better global model by fusing the best components of the clients. Using collaborative image classification as an example, we show with experiments on multiple datasets and networks that FedNS can consistently achieve improved performance over FedAvg.
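The FedAvg baseline described in the abstract, a weighted average of client models with weights proportional to client dataset sizes, can be sketched as follows. This is a minimal illustration under stated assumptions: each model is represented as a dictionary of flat parameter lists, and the names `fedavg`, `client_weights`, and `client_sizes` are hypothetical. It does not implement the node-level filtering and reweighting of FedNS, whose details are not given in the abstract.

```python
from typing import Dict, List, Sequence

# Assumption: a model is a mapping from parameter name to a flat list of floats.
Weights = Dict[str, List[float]]


def fedavg(client_weights: Sequence[Weights], client_sizes: Sequence[int]) -> Weights:
    """Average client models, weighting each client by its dataset size."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total dataset size must be positive")

    global_weights: Weights = {}
    for name, reference in client_weights[0].items():
        aggregated = [0.0] * len(reference)
        for weights, size in zip(client_weights, client_sizes):
            coefficient = size / total  # this client's share of all samples
            for i, value in enumerate(weights[name]):
                aggregated[i] += coefficient * value
        global_weights[name] = aggregated
    return global_weights


if __name__ == "__main__":
    # Hypothetical example: two clients holding 1 and 3 samples respectively.
    clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
    print(fedavg(clients, [1, 3]))
```

In this scheme every parameter of a given client receives the same coefficient. By the abstract's description, FedNS differs by deciding at the node/kernel level which client components to keep and how to weight them.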


Spatial-temporal analysis on bird habitat discovery in China

Zhan

Zhuo

et al. 2017


Weakly Supervised Deep Image Hashing through Tag Embeddings

Gattupalli

Zhuo

Li

2018

Preprint


