Zijia Lin scite author profile

Despite its great success, matrix factorization based cross-modality hashing suffers from two problems: 1) there is no engagement between feature learning and binarization; and 2) most existing methods impose the relaxation strategy by discarding the discrete constraints when learning the hash function, which usually yields suboptimal solutions. In this paper, we propose a novel multimodal hashing framework, referred as Unsupervised Deep Cross-Modal Hashing (UDCMH), for multimodal data search in a self-taught manner via integrating deep learning and matrix factorization with binary latent factor models. On one hand, our unsupervised deep learning framework enables the feature learning to be jointly optimized with the binarization. On the other hand, the hashing system based on the binary latent factor models can generate unified binary codes by solving a discrete-constrained objective function directly with no need for a relaxation step. Moreover, novel Laplacian constraints are incorporated into the objective function, which allow to preserve not only the nearest neighbors that are commonly considered in the literature but also the farthest neighbors of data, even if the semantic labels are not available. Extensive experiments on multiple datasets highlight the superiority of the proposed framework over several state-of-the-art baselines.

show abstract

Cross-View Retrieval via Probability-Based Semantics-Preserving Hashing

Lin

Ding

Han

et al. 2017

IEEE Trans. Cybern.

161

View full text Add to dashboard Cite

Abstract-For efficiently retrieving nearest neighbours from large-scale multi-view data, recently hashing methods are widely investigated, which can substantially improve query speeds. In this paper, we propose an effective probability-based SemanticsPreserving Hashing method to tackle the problem of cross-view retrieval, termed SePH. Considering the semantic consistency between views, SePH generates one unified hash code for all observed views of any instance. For training, SePH firstly transforms the given semantic affinities of training data into a probability distribution, and aims to approximate it with another one in Hamming space, via minimizing their KullbackLeibler divergence. Specifically, the latter probability distribution is derived from all pair-wise Hamming distances between tobe-learnt hash codes of the training data. Then with learnt hash codes, any kind of predictive models like linear ridge regression, logistic regression or kernel logistic regression, can be learnt as hash functions in each view for projecting the corresponding view-specific features into hash codes. As for outof-sample extension, given any unseen instance, the learnt hash functions in its observed views can predict view-specific hash codes. Then by deriving or estimating the corresponding output probabilities w.r.t the predicted view-specific hash codes, a novel probabilistic approach is further proposed to utilize them for determining a unified hash code. To evaluate the proposed SePH, we conduct extensive experiments on diverse benchmark datasets, and the experimental results demonstrate that SePH is reasonable and effective.

show abstract

Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language

Lin

Karlsson

et al. 2020

View full text Add to dashboard Cite

To better tackle the named entity recognition (NER) problem on languages with little/no labeled data, cross-lingual NER must effectively leverage knowledge learned from source languages with rich labeled data. Previous works on cross-lingual NER are mostly based on label projection with pairwise texts or direct model transfer. However, such methods either are not applicable if the labeled data in the source languages is unavailable, or do not leverage information contained in unlabeled data in the target language. In this paper, we propose a teacher-student learning method to address such limitations, where NER models in the source languages are used as teachers to train a student model on unlabeled data in the target language. The proposed method works for both single-source and multi-source crosslingual NER. For the latter, we further propose a similarity measuring method to better weight the supervision from different teacher models. Extensive experiments for 3 target languages on benchmark datasets well demonstrate that our method outperforms existing state-of-theart methods for both single-source and multisource cross-lingual NER.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Zijia Lin

Semantics-preserving hashing for cross-view retrieval

IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image-Text Retrieval

Unsupervised Deep Hashing via Binary Latent Factor Models for Large-scale Cross-modal Retrieval

Cross-View Retrieval via Probability-Based Semantics-Preserving Hashing

Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language

Contact Info

Product

Resources

About