“…Using human descriptions of objects together with deictic hand gestures, researchers can train a grounding system for identifying referent objects (Perera & Allen, 2013; Matuszek, Bo, Zettlemoyer, & Fox, 2014; Whitney, Eldon, Oberlin, & Tellex, 2016; Whitney, Rosen, MacGlashan, Wong, & Tellex, 2017; Pizzuto, Hospedales, Capirci, & Cangelosi, 2019). Other researchers have focused on learning categorical properties of objects (red) together with relational (taller) and differentiating (differ by weight) properties of objects by exploring them with a robotic arm (Sinapov, Schenck, & Stoytchev, 2014b).…”