2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.01022
Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition

Abstract: Generalized zero-shot action recognition is a challenging problem, where the task is to recognize new action categories that are unavailable during the training stage, in addition to the seen action categories. Existing approaches suffer from the inherent bias of the learned classifier towards the seen action categories. As a consequence, unseen category samples are incorrectly classified as belonging to one of the seen action categories. In this paper, we set out to tackle this issue by arguing for a separate…
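The abstract's core idea, passing each test sample through a separate out-of-distribution (OOD) check before deciding which classifier should label it, can be sketched as follows. This is an illustrative approximation, not the paper's exact method: the OOD score here is simply the entropy of the seen-classifier softmax, and `seen_clf`, `unseen_clf`, and the threshold are hypothetical placeholders.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def gated_predict(x, seen_clf, unseen_clf, threshold):
    """Route a sample to the seen or unseen classifier based on an
    OOD score (here: entropy of the seen-class softmax)."""
    probs = softmax(seen_clf(x))
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    if entropy > threshold:  # high uncertainty -> likely an unseen category
        return "unseen", int(unseen_clf(x).argmax())
    return "seen", int(probs.argmax())

# Toy linear classifiers over 4 seen and 3 unseen classes (random weights,
# for illustration only).
rng = np.random.default_rng(0)
W_seen, W_unseen = rng.normal(size=(4, 8)), rng.normal(size=(3, 8))
seen_clf = lambda x: W_seen @ x
unseen_clf = lambda x: W_unseen @ x

branch, label = gated_predict(rng.normal(size=8), seen_clf, unseen_clf,
                              threshold=1.0)
print(branch, label)
```

The gating threshold would in practice be tuned on held-out data; the point of the sketch is only the two-stage routing structure.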

Cited by 116 publications (101 citation statements). References 27 publications.
“…We use 2048-dimensional ResNet101 [8] features for images in AWA1 and CUB, with manually defined attributes of dimension 85 and 312 for AWA1 and CUB, respectively. For HMDB51 and UCF101, we use 8196-dimensional I3D [2] video features and 300-dimensional word2vec semantic prototypes provided by [16]. For AWA1 and CUB we use the new seen/unseen split proposed by [34], while for HMDB51 and UCF101 we use the seen/unseen splits provided by [16], where we experiment on 30 splits and report the average performance.…”
Section: Methods
confidence: 99%
“…For HMDB51 and UCF101, we use 8196-dimensional I3D [2] video features and 300-dimensional word2vec semantic prototypes provided by [16]. For AWA1 and CUB we use the new seen/unseen split proposed by [34], while for HMDB51 and UCF101 we use the seen/unseen splits provided by [16], where we experiment on 30 splits and report the average performance. We use the evaluation criterion proposed by [34], reporting the average top-1 accuracy on the seen and unseen classes and the harmonic mean of the two.…”
Section: Methods
confidence: 99%
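The evaluation criterion described above (average top-1 accuracy on seen and unseen classes, combined by their harmonic mean) is simple to compute; a minimal sketch, with illustrative accuracy values:

```python
def harmonic_mean_accuracy(acc_seen, acc_unseen):
    """Standard GZSL metric: harmonic mean of seen and unseen per-class
    top-1 accuracies. It is high only when BOTH accuracies are high, so a
    classifier biased toward seen classes scores poorly."""
    if acc_seen + acc_unseen == 0:
        return 0.0
    return 2 * acc_seen * acc_unseen / (acc_seen + acc_unseen)

# A model with 80% seen but only 20% unseen accuracy is heavily penalized:
h = round(harmonic_mean_accuracy(0.80, 0.20), 4)
print(h)  # 0.32
```

On HMDB51 and UCF101 this metric would be averaged over the 30 seen/unseen splits mentioned in the quoted passage.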
“…Recently, a popular approach to zero-shot classification is to generate synthesized features for unseen categories. For example, the method in [44] first generated features from word embeddings and random vectors, and was further improved by later works [7,22,28,40,45]. These zero-shot classification methods generate image features without involving contextual information.…”
Section: Related Work
confidence: 99%
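The generate-then-classify idea quoted above, synthesizing features for unseen categories from word embeddings plus random vectors, can be sketched minimally. This is not the model of [44]: the linear generator, its (untrained, random) weights, and all dimensions except the 300-dim word2vec embedding are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
embed_dim, noise_dim, feat_dim = 300, 100, 2048  # word2vec + noise -> feature

# Hypothetical generator weights; a real model would learn these from
# seen-class (feature, embedding) pairs.
W = rng.normal(scale=0.02, size=(feat_dim, embed_dim + noise_dim))

def synthesize_features(class_embedding, n_samples):
    """Generate n_samples synthetic visual features for an unseen class by
    conditioning a (linear, illustrative) generator on its word embedding."""
    z = rng.normal(size=(n_samples, noise_dim))      # random vectors
    cond = np.tile(class_embedding, (n_samples, 1))  # class semantics
    return np.concatenate([cond, z], axis=1) @ W.T   # synthetic features

fake = synthesize_features(rng.normal(size=embed_dim), n_samples=5)
print(fake.shape)  # (5, 2048)
```

Once such features exist for unseen classes, an ordinary supervised classifier can be trained over all classes, which is the step the cited works refine.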