Large-scale search methods are increasingly critical for many content-based visual analysis applications, among which hashing-based approximate nearest neighbor search techniques have attracted broad interests due to their high efficiency in storage and retrieval. However, existing hashing works are commonly designed for measuring data similarity by the Euclidean distances. In this paper, we focus on the problem of learning compact binary codes using the cosine similarity. Specifically, we proposed novel angular reconstructive embeddings (ARE) method, which aims at learning binary codes by minimizing the reconstruction error between the cosine similarities computed by original features and the resulting binary embeddings. Furthermore, we devise two efficient algorithms for optimizing our ARE in continuous and discrete manners, respectively. We extensively evaluate the proposed ARE on several large-scale image benchmarks. The results demonstrate that ARE outperforms several state-of-the-art methods.
Driven by the rapid development of Internet and digital technologies, we have witnessed the explosive growth of Web images in recent years. Seeing that labels can reflect the semantic contents of the images, automatic image annotation, which can further facilitate the procedure of image semantic indexing, retrieval, and other image management tasks, has become one of the most crucial research directions in multimedia. Most of the existing annotation methods, heavily rely on well-labeled training data (expensive to collect) and/or single view of visual features (insufficient representative power). In this paper, inspired by the promising advance of feature engineering (e.g., CNN feature and scale-invariant feature transform feature) and inexhaustible image data (associated with noisy and incomplete labels) on the Web, we propose an effective and robust scheme, termed robust multi-view semi-supervised learning (RMSL), for facilitating image annotation task. Specifically, we exploit both labeled images and unlabeled images to uncover the intrinsic data structural information. Meanwhile, to comprehensively describe an individual datum, we take advantage of the correlated and complemental information derived from multiple facets of image data (i.e., multiple views or features). We devise a robust pairwise constraint on outcomes of different views to achieve annotation consistency. Furthermore, we integrate a robust classifier learning component via l loss, which can provide effective noise identification power during the learning process. Finally, we devise an efficient iterative algorithm to solve the optimization problem in RMSL. We conduct comprehensive experiments on three different data sets, and the results illustrate that our proposed approach is promising for automatic image annotation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.