GLU 2017 International Workshop on Grounding Language Understanding 2017
DOI: 10.21437/glu.2017-15

Finding Regions of Interest from Multimodal Human-Robot Interactions

Abstract: Learning new concepts, such as object models, from human-robot interactions requires several recognition capabilities on a robotic platform. This work proposes a hierarchical approach that exploits multimodal data to address the additional challenges of natural interaction scenarios. First, a speech-guided recognition of the type of interaction taking place is presented. This first step facilitates the subsequent segmentation of the relevant visual information used to learn the target object model. Our approach includes three…

Citations: cited by 2 publications (3 citation statements)
References: 10 publications
“…• Speak: the user describes where a certain object is in relation to other objects. This paper builds on our previous works in [1] and [2]. The new contributions here are:…”
Section: Introduction (mentioning)
confidence: 90%
“…[59], show that the combination of language and vision leads to substantial improvements. In our previous work [60] we demonstrated that including Speak interactions to train the models obtains better accuracy than using only Point and Show ones.…”
Section: Multimodal Incremental Interaction Recognition Module (mentioning)
confidence: 99%
“…Therefore, in this study, we design a virtual interface for a digital cash register at Café Lentera Coffee & Eatery using Computer Vision and Convolutional Neural Network (CNN) technology, with the aim of improving customer service through artificial intelligence [6]. In addition, the main goal of this study is to develop artificial intelligence technology that will be implemented in the virtual interface to recognize customers' faces, recognize the orders placed, and process customer requests more quickly and efficiently [7]. The system will serve as a virtual interface between customers and the cash register, allowing customers to order in real time and to pay without having to interact directly with waiters and crew [8].…”
Section: Introduction (unclassified)