Current Zero-Shot Learning (ZSL) approaches are restricted to recognizing a single dominant unseen object category in a test image. We hypothesize that this setting is ill-suited for real-world applications, where unseen objects appear only as part of a complex scene, warranting both the 'recognition' and 'localization' of an unseen category. To address this limitation, we introduce a new 'Zero-Shot Detection' (ZSD) problem setting, which aims to simultaneously recognize and locate object instances belonging to novel categories, without any training examples. We also propose a new experimental protocol for ZSD based on the highly challenging ILSVRC dataset, which accounts for practical concerns such as the rarity of unseen objects. To the best of our knowledge, we propose the first end-to-end deep network for ZSD, one that jointly models the interplay between visual and semantic domain information. To overcome the noise in automatically derived semantic descriptions, we utilize the concept of meta-classes to design an original loss function that achieves synergy between max-margin class separation and semantic-space clustering. Furthermore, we present a baseline approach extended from the recognition to the detection setting. Our extensive experiments show a significant performance boost over this baseline on the imperative yet difficult ZSD problem.
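As a rough illustration of the loss design described in this abstract, the sketch below combines a max-margin separation term with a meta-class clustering term in semantic space. All names (`zsd_loss`, `meta_of`, the cosine-similarity scoring head, the `margin` and `lam` weights) are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch only (hypothetical names, not the authors' code):
# combine max-margin class separation with meta-class clustering in
# semantic space, as described in the abstract above.
import torch
import torch.nn.functional as F

def zsd_loss(visual_feats, word_vecs, labels, meta_of, margin=0.1, lam=0.5):
    """visual_feats: (B, D) region features; word_vecs: (C, D) class embeddings;
    labels: (B,) ground-truth class ids; meta_of: (C,) meta-class id per class."""
    scores = F.normalize(visual_feats, dim=1) @ F.normalize(word_vecs, dim=1).T  # (B, C)
    true_score = scores.gather(1, labels[:, None])                               # (B, 1)
    # Max-margin term: the true class must beat every other class by `margin`.
    margin_term = F.relu(margin + scores - true_score)
    margin_term.scatter_(1, labels[:, None], 0.0)  # zero out the true-class slot
    # Clustering term: pull each (noisy) class embedding toward the centroid
    # of its meta-class, regularizing automatically derived word vectors.
    n_meta = int(meta_of.max()) + 1
    sums = torch.zeros(n_meta, word_vecs.size(1)).index_add_(0, meta_of, word_vecs)
    counts = torch.bincount(meta_of, minlength=n_meta).clamp(min=1).float()
    centroids = sums / counts[:, None]
    cluster_term = ((word_vecs - centroids[meta_of]) ** 2).sum(dim=1).mean()
    return margin_term.sum(dim=1).mean() + lam * cluster_term
```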
We exhibit a simple and explicit formula for the metric of an arbitrary static spherically-symmetric perfect-fluid spacetime. This class of metrics depends on one freely specifiable monotonic non-increasing generating function. We also investigate various regularity conditions and the constraints they impose. Because we never make any assumptions as to the nature (or even the existence) of an equation of state, this technique is useful in situations where the equation of state is, for whatever reason, uncertain or unknown. To illustrate the power of the method, we exhibit a new form of the 'Goldman-I' exact solution. This is a three-parameter closed-form exact solution given in terms of algebraic combinations of quadratics. It interpolates between (and thereby unifies) at least six other reasonably well-known exact solutions.
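The abstract does not reproduce the formula itself; as background for the construction it summarizes, the snippet below states the standard static, spherically symmetric ansatz and the field equations it reduces to for a perfect fluid, in geometric units G = c = 1. The authors' specific generating-function form is not reproduced here.

```latex
% Background sketch only (G = c = 1): the standard static, spherically
% symmetric ansatz and the perfect-fluid field equations. The paper's
% specific generating-function formula is not reproduced here.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
\begin{equation}
  ds^{2} = -e^{2\Phi(r)}\,dt^{2}
           + \frac{dr^{2}}{1 - 2m(r)/r}
           + r^{2}\bigl(d\theta^{2} + \sin^{2}\theta\, d\varphi^{2}\bigr),
\qquad
  T^{\mu}{}_{\nu} = \operatorname{diag}\bigl(-\rho,\, p,\, p,\, p\bigr).
\end{equation}
Einstein's equations give $m'(r) = 4\pi r^{2}\rho$,
$\Phi'(r) = \dfrac{m + 4\pi r^{3} p}{r\,(r - 2m)}$, and the
Tolman--Oppenheimer--Volkoff equation
\begin{equation}
  \frac{dp}{dr} = -\bigl(\rho + p\bigr)\,\frac{m + 4\pi r^{3} p}{r\,(r - 2m)},
\end{equation}
so once one suitable function is freely specified (the role played by the
monotone generating function above), the remaining metric and fluid
quantities follow without ever invoking an equation of state.
\end{document}
```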
Prevalent techniques in zero-shot learning do not generalize well to other related problem scenarios. Here, we present a unified approach to the conventional zero-shot, generalized zero-shot, and few-shot learning problems. Our approach is based on a novel Class Adapting Principal Directions (CAPD) concept that allows multiple embeddings of image features into a semantic space. Given an image, our method produces one principal direction for each seen class. It then learns how to combine these directions to obtain the principal direction for each unseen class, such that the CAPD of the test image is aligned with the semantic embedding of the true class and opposite to the other classes. This allows efficient and class-adaptive information transfer from seen to unseen classes. In addition, we propose an automatic process for selecting the most useful seen classes for each unseen class, to achieve robustness in zero-shot learning. Our method can update the unseen CAPDs by taking advantage of a few unseen images, enabling the few-shot learning scenario. Furthermore, our method can generalize the seen CAPDs by estimating seen-unseen diversity, which significantly improves the performance of generalized zero-shot learning. Our extensive evaluations demonstrate that the proposed approach consistently achieves superior performance in zero-shot, generalized zero-shot, and few/one-shot learning problems.
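A minimal sketch of the pipeline this abstract describes, under assumed names and shapes: `W_seen` holds one projection per seen class, `alpha` holds mixing weights for building unseen-class directions, and classification aligns the resulting CAPD with unseen semantic embeddings. This is an illustration, not the authors' code.

```python
# Illustrative sketch (hypothetical names, not the authors' code): one
# principal direction per seen class, combined into unseen-class directions.
import numpy as np

def seen_capds(x, W_seen):
    """x: (D,) image feature; W_seen: (S, K, D), one projection per seen
    class. Returns (S, K): one CAPD per seen class."""
    return W_seen @ x

def unseen_capds(p_seen, alpha):
    """alpha: (U, S) mixing weights, e.g. learned so each unseen class is
    reconstructed from its most useful seen classes."""
    return alpha @ p_seen  # (U, K)

def zsl_predict(x, W_seen, alpha, sem_unseen):
    """sem_unseen: (U, K) unseen-class semantic embeddings. Predict the
    unseen class whose embedding best aligns with the image's CAPD."""
    p_u = unseen_capds(seen_capds(x, W_seen), alpha)
    cos = np.sum(p_u * sem_unseen, axis=1) / (
        np.linalg.norm(p_u, axis=1) * np.linalg.norm(sem_unseen, axis=1) + 1e-8)
    return int(np.argmax(cos))
```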
Zero-shot object detection is an emerging research topic that aims to recognize and localize previously 'unseen' objects. This setting gives rise to several unique challenges, e.g., a highly imbalanced positive-to-negative instance ratio, proper alignment between visual and semantic concepts, and ambiguity between background and unseen classes. Here, we propose an end-to-end deep learning framework underpinned by a novel loss function that handles class imbalance and seeks to properly align the visual and semantic cues for improved zero-shot learning. We call our objective the 'Polarity loss' because it explicitly maximizes the gap between positive and negative predictions. Such a margin-maximizing formulation is not only important for visual-semantic alignment but also resolves the ambiguity between background and unseen objects. Further, the semantic representations of objects are noisy, complicating the alignment between the visual and semantic domains. To this end, we perform metric learning using a 'Semantic vocabulary' of related concepts that refines the noisy semantic embeddings and establishes a better synergy between the visual and semantic domains. Our approach is inspired by embodiment theories in cognitive science, which claim that human semantic understanding is grounded in past experiences (seen objects), related linguistic concepts (word vocabulary), and visual perception (seen/unseen object images). Our extensive results on the MS-COCO and Pascal VOC datasets show significant improvements over the state of the art.
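One way to realize a margin-maximizing weighting of this kind on top of a focal-style loss is sketched below. The sigmoid weighting and the `beta`/`gamma` parametrization are assumptions for illustration, not the paper's exact formulation.

```python
# Illustrative sketch only: a margin-maximizing ("polarity"-style) weight
# on top of focal loss. The sigmoid/beta parametrization is an assumption,
# not the paper's exact form.
import torch

def polarity_style_loss(logits, targets, gamma=2.0, beta=5.0):
    """logits: (B, C) per-anchor class scores; targets: (B, C) one-hot.
    The focal term handles positive/negative imbalance; the polarity
    weight up-weights negatives whose score approaches the positive's."""
    p = torch.sigmoid(logits)
    eps = 1e-8
    # Standard (binary) focal loss per anchor-class pair.
    focal = -((1 - p) ** gamma * targets * torch.log(p + eps)
              + p ** gamma * (1 - targets) * torch.log(1 - p + eps))
    # Prediction of the ground-truth class for each anchor.
    p_pos = (p * targets).sum(dim=1, keepdim=True)
    # Polarity weight: monotone in (p - p_pos), so a negative class scoring
    # near or above the positive is penalized more, widening the margin.
    w = torch.sigmoid(beta * (p - p_pos))
    return (w * focal).sum() / targets.sum().clamp_min(1.0)
```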
In the past decade, a large number of computational models of visual saliency have been proposed. Recently, a number of comprehensive benchmark studies have been presented with the goal of assessing the performance landscape of saliency models under varying conditions; this has been accomplished by considering fixation data, annotated image regions, and stimulus patterns inspired by psychophysics. In this paper, we present a high-level examination of challenges in the computational modeling of visual saliency, with a heavy emphasis on human vision and neural computation. This includes a careful assessment of different metrics for evaluating visual saliency models and identification of remaining difficulties in assessing model performance. We also consider the importance of a number of issues relevant to all saliency models, including scale space, the impact of border effects, and spatial or central bias. Additionally, we consider the biological plausibility of models, stepping away from exemplar input patterns towards a set of more general theoretical principles consistent with behavioral experiments. As a whole, this presentation establishes important obstacles that remain in visual saliency modeling and identifies a number of important avenues for further investigation.
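To make the metric and central-bias issues concrete, the sketch below shows one widely used fixation-based metric, Normalized Scanpath Saliency (NSS), together with a trivially center-biased Gaussian baseline. The function and array names are illustrative; the abstract does not commit to any specific metric.

```python
# Sketch of one widely used fixation-based saliency metric, Normalized
# Scanpath Saliency (NSS), plus the center-bias baseline the text alludes
# to. Array names are illustrative.
import numpy as np

def nss(saliency_map, fixation_map):
    """saliency_map: (H, W) real-valued model output; fixation_map: (H, W)
    binary, 1 at fixated pixels. NSS is the mean z-scored saliency value
    at fixations; 0 is chance level, higher is better."""
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-8)
    return float(s[fixation_map.astype(bool)].mean())

def center_prior(h, w, sigma_frac=0.25):
    """A Gaussian centered on the image: a trivially center-biased 'model'
    that often scores well, which is why central bias must be controlled."""
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sigma = sigma_frac * min(h, w)
    return np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
```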
Recent deep learning architectures can recognize instances of 3D point cloud objects from previously seen classes quite well. At the same time, current 3D depth camera technology allows generating/segmenting a large number of 3D point cloud objects from an arbitrary scene, for which there is no previously seen training data. A challenge for a 3D point cloud recognition system, then, is to classify objects from new, unseen classes. This issue can be resolved by adopting a zero-shot learning (ZSL) approach for 3D data, analogous to the 2D image version of the same problem. ZSL attempts to classify unseen objects by comparing the semantic information (attribute/word vector) of seen and unseen classes. Here, we adapt several recent 3D point cloud recognition systems to the ZSL setting with some changes to their architectures. To the best of our knowledge, this is the first attempt to classify unseen 3D point cloud objects in the ZSL setting. We also propose a standard protocol (which includes the choice of datasets and the seen/unseen split) for evaluating such systems. Baseline performances on the investigated models are reported using the new protocol. This investigation poses a new challenge to the 3D point cloud recognition community, one that may instigate numerous future works.
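A minimal sketch of the ZSL recipe this abstract describes, under assumptions: `seen_feats` come from any pretrained 3D point cloud backbone (e.g., a PointNet-style encoder), and a simple ridge-regression projection maps features to word-vector space. Names, shapes, and the choice of projection are illustrative, not the evaluated systems themselves.

```python
# Sketch of the ZSL recipe described above (illustrative assumptions):
# project point-cloud features into word-vector space using seen classes,
# then classify unseen objects by nearest semantic embedding.
import numpy as np

def fit_projection(seen_feats, seen_word_vecs, lam=1.0):
    """Ridge regression from point-cloud features (N, D) to the word
    vectors (N, E) of their seen-class labels; returns W of shape (D, E)."""
    D = seen_feats.shape[1]
    A = seen_feats.T @ seen_feats + lam * np.eye(D)
    return np.linalg.solve(A, seen_feats.T @ seen_word_vecs)

def zsl_classify(feat, W, unseen_word_vecs):
    """Assign the unseen class whose word vector has the highest cosine
    similarity to the projected point-cloud feature."""
    z = feat @ W  # (E,)
    cos = unseen_word_vecs @ z / (
        np.linalg.norm(unseen_word_vecs, axis=1) * np.linalg.norm(z) + 1e-8)
    return int(np.argmax(cos))
```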