2017
DOI: 10.1016/j.cviu.2017.05.017

Vision-language integration using constrained local semantic features

Cited by 4 publications (5 citation statements)
References 6 publications
“…Three kinds of categories are added: specific [3], [4], [37], [59] (e.g. rottweiler), generic [28], [31], [49] (e.g. dog) and noisy [25], [51].…”
Section: Learning One Network On a Modified SP
confidence: 99%
“…We propose a new method that takes advantage of the principle of re-training neural networks on the same problem, and thus needs neither more data [3], [31], [49] nor increased network capacity [1], [44], [47], [54]. Our approach relates to the work of [58], which proposes an extensive study of the effect of different self-training methods (i.e., re-training a neural network on the same problem it was originally trained on).…”
Section: Focused Self Fine-tuning
confidence: 99%
“…Multimodal corpora. Many corpora provide images with associated textual content, in particular for the tasks of automatic image annotation (Young et al., 2014; Ginsca et al., 2015), cross-media retrieval (Karpathy and Fei-Fei, 2015; Tran et al., 2016a), image-sentence matching (Hodosh et al., 2013; Ordonez et al., 2011), text illustration (Feng and Lapata, 2010; Chami et al., 2017) and cross-media classification (Tran et al., 2016b; Tamaazousti et al., 2017). Most corpora used in this context consist of images with captions from Flickr (Ordonez et al., 2011; Hodosh et al., 2013; Young et al., 2014) or collected using Amazon's Mechanical Turk (Rashtchian et al., 2010; Lin et al., 2014).…”
Section: Related Work
confidence: 99%