2016
DOI: 10.1145/2885252
Learning to name objects

Abstract: We have seen remarkable recent progress in computational visual recognition, producing systems that can classify objects into thousands of different categories with increasing accuracy. However, one question that has received relatively less attention is "what labels should recognition systems output?" This paper looks at the problem of predicting category labels that mimic how human observers would name objects. This goal is related to the concept of entry-level categories first introduced by psychologists in…

Cited by 7 publications (8 citation statements)
References 13 publications
“…Therefore, in the following, we report average precision for individual words, namely for those cases where similarity-based regression has the strongest positive or negative effect as compared to binary classification (see Tables 3 and 4 showing average precision scores, number of positive instances of the word in the train and test set, and their semantic neighbours in the vocabulary, according to the vector space). We also look at hypernyms (Table 2) which are not easy to learn in realistic referring expression data as more specific nouns are usually more common or natural (Ordonez et al., 2016).…”
Section: Results (mentioning)
confidence: 99%
“…The reason lies in the fact that many concepts can be similar in one modality but very different in the other, and thus capitalizing on both sources of information turns out to be very effective in many tasks. Evidence supporting this intuition has been provided by several works (Lazaridou et al., 2015; Johnson et al., 2015; Xiong et al., 2016; Ordonez et al., 2016) that developed multimodal models for representing concepts that outperformed both language-based and vision-based models in different tasks. Multimodal representations have also been used for exploring compositionality in visual objects (Vendrov et al., 2015), but compositionality was intended as combining two or more objects in a visual scene (e.g., an apple and a banana) and not as obtaining the representation of a new concept based on two or more existing concepts.…”
Section: Related Work (mentioning)
confidence: 87%
“…Whereas Zarrieß and Schlangen (2016) propose a strategy to avoid object names when the system's confidence is low, we focus on improving the generation of object names, using distributional knowledge as an additional source. Similarly, Ordonez et al. (2016) have studied the problem of deriving appropriate object names, or so-called entry-level categories, from the output of an object recognizer. Their approach focuses on linking abstract object categories in ImageNet to actual words via various translation procedures.…”
Section: Related Work (mentioning)
confidence: 99%
“…This set corresponds to the majority of object names in the corpus: out of the 99.5K available image regions, we use 80K for training and testing. Thus, our experiments are on a smaller scale than those of Ordonez et al. (2016). Nevertheless, the data is challenging, as the corpus contains references to objects that fall outside of the object labeling scheme that available object recognition systems are typically optimized for, cf.…”
Section: Task and Data (mentioning)
confidence: 99%