2022 IEEE International Conference on Development and Learning (ICDL)
DOI: 10.1109/icdl53763.2022.9962190
Toddler-inspired embodied vision for learning object representations

Abstract: Recent time-contrastive learning approaches manage to learn invariant object representations without supervision. This is achieved by mapping successive views of an object onto close-by internal representations. When considering this learning approach as a model of the development of human object recognition, it is important to consider what visual input a toddler would typically observe while interacting with objects. First, human vision is highly foveated, with high resolution only available in the central r…

Cited by 2 publications (1 citation statement)
References 18 publications
“…In the past, computational models of cognitive development have often been restricted to isolated cognitive phenomena. Some examples are works on binocular vision [63], [64], visual object and category learning [65]-[67], gaze following [68], [69], learning to grasp objects [20], perseverative reaching [70], word learning [71]-[73], and countless others. While such models have produced many important insights, they often work with simplified sensory inputs and it is not clear how to scale them to the rich multimodal sensory input provided by our sense organs.…”
Section: Discussion (mentioning)
confidence: 99%