Second Grand-Challenge and Workshop on Multimodal Language (Challenge-HML) 2020
DOI: 10.18653/v1/2020.challengehml-1.5

Unsupervised Online Grounding of Natural Language during Human-Robot Interactions

Abstract: Allowing humans to communicate through natural language with robots requires connections between words and percepts. The process of creating these connections is called symbol grounding and has been studied for nearly three decades. Although many studies have been conducted, not many considered grounding of synonyms and the employed algorithms either work only offline or in a supervised manner. In this paper, a cross-situational learning based grounding framework is proposed that allows grounding of words and …

Citations: Cited by 2 publications (10 citation statements)
References: References 24 publications
“…Therefore, this study proposes and evaluates a hybrid grounding framework that combines both paradigms. More specifically, this study extends a recently proposed unsupervised cross-situational learning based grounding framework (Roesler, 2020b), which has been shown to achieve state-of-the-art grounding results, with a novel interactive learning based mechanism to learn from feedback provided by a tutor. The hypothesis is that the hybrid framework is more sample efficient and produces more accurate groundings than frameworks that use only unsupervised learning, while at the same time being able to work in the absence of supervision, which is not the case for purely supervised frameworks.…”
Section: Introduction (mentioning)
confidence: 85%
“…While these findings, i.e., that feedback improves the accuracy and sample efficiency, seem reasonable and intuitive, the employed cross-situational learning algorithm was very limited, thus, it is not clear whether feedback would have provided the same benefit, if a more sophisticated unsupervised grounding mechanism would have been employed. A different study by Roesler (2020a) extended an unsupervised cross-situational learning based grounding framework, which has achieved state-of-the-art grounding performance (Roesler, 2020b), with a mechanism to learn from explicit teaching and showed that explicit teaching increases the convergence speed towards the correct groundings. The main disadvantage of the employed supervised learning mechanism is that it requires the tutor to artificially create a special teaching situation, which is a simplified version of the environment specifically designed to ensure that the agent will correctly learn a specific mapping.…”
Section: Hybrid Grounding (mentioning)
confidence: 99%
“…However, most proposed models only work offline, i.e., perceptual data and words need to be collected in advance, and the employed scenarios only contained unambiguous words, i.e., no two words were grounded through the same percept. In contrast, the grounding framework used in this study, which is based on the framework proposed in [38], is able to learn online and in an open-ended manner, i.e., no separate training phase is required, and it is also able to ground synonyms, i.e., words that refer to the same concrete representations in specific context, e.g., "happy" and "cheerful".…”
Section: Related Work (mentioning)
confidence: 99%
“…The grounding algorithm described in this section requires percepts to be represented through concrete representations. In previous work, concrete representations have been obtained through clustering [38]; however, the clustering algorithms employed in previous studies are not able to achieve accurate clusters for the extracted speech features. Thus, deep neural networks are used instead to obtain the same label for all concrete representations of the same emotion type, emotion intensity or gender (Section 4.2).…”
Section: Language Grounding (mentioning)
confidence: 99%