Abstract: While modern deep neural architectures generalise well when test data is sampled from the same distribution as training data, they fail badly for cases when the test data distribution differs from the training distribution even along a few dimensions. This lack of out-of-distribution generalisation is increasingly manifested when the tasks become more abstract and complex, such as in relational reasoning. In this paper we propose a neuroscience-inspired inductive-biased module that can be readily amalgamated w…
“…However, these models typically require very large training sets (on the order of 10⁶ training examples), and generally fail to generalize outside of the very specific conditions under which they are trained. For example, state-of-the-art performance on the extrapolation regime of the PGM dataset, in which test problems contain feature values outside the range of those observed in the training set, is currently 25.9% [38], and state-of-the-art performance on other out-of-distribution generalization regimes (held-out shape-color, held-out line-type, etc.) is comparably poor [2,37].…”
Section: Raven's Progressive Matrices and Deep Learning
Human intelligence is characterized by a remarkable ability to infer abstract rules from experience and apply these rules to novel domains. As such, designing neural network algorithms with this capacity is an important step toward the development of deep learning systems with more human-like intelligence. However, doing so is a major outstanding challenge, one that some argue will require neural networks to use explicit symbol-processing mechanisms. In this work, we focus on neural networks' capacity for arbitrary role-filler binding, the ability to associate abstract "roles" with context-specific "fillers," which many have argued is an important mechanism underlying the ability to learn and apply rules abstractly. Using a simplified version of Raven's Progressive Matrices, a hallmark test of human intelligence, we introduce a sequential formulation of a visual problem-solving task that requires this form of binding. Further, we introduce the Emergent Symbol Binding Network (ESBN), a recurrent neural network model that learns to use an external memory as a binding mechanism. This mechanism enables symbol-like variable representations to emerge through the ESBN's training process without the need for explicit symbol-processing machinery. We empirically demonstrate that the ESBN successfully learns the underlying abstract rule structure of our task and perfectly generalizes this rule structure to novel fillers.
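The binding mechanism described above can be illustrated with a minimal sketch. This is not the ESBN implementation itself; it is a hypothetical, simplified key-value memory showing the general idea: symbol-like "keys" are stored alongside perceptual "value" (filler) embeddings, and retrieval matches a query against the values but returns the associated key, so downstream processing operates over keys rather than raw fillers. All names, dimensions, and the softmax retrieval rule here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D_KEY, D_VAL = 4, 8  # illustrative sizes, not from the paper

class KeyValueMemory:
    """Toy external memory binding symbol-like keys to filler embeddings."""

    def __init__(self):
        self.keys = []    # abstract, symbol-like vectors
        self.values = []  # context-specific filler embeddings

    def write(self, key, value):
        # Bind a (key, value) pair by appending both to memory.
        self.keys.append(key)
        self.values.append(value)

    def read(self, query):
        # Match the query against stored VALUES, but return a
        # similarity-weighted combination of the stored KEYS.
        sims = np.array([v @ query for v in self.values])
        w = np.exp(sims - sims.max())
        w /= w.sum()
        return sum(wi * k for wi, k in zip(w, self.keys))

mem = KeyValueMemory()
symbol_a = rng.normal(size=D_KEY)  # stands in for a learned key
symbol_b = rng.normal(size=D_KEY)
filler_x = rng.normal(size=D_VAL)  # stands in for an image embedding
filler_y = rng.normal(size=D_VAL)

mem.write(symbol_a, filler_x)
mem.write(symbol_b, filler_y)

# Querying with a previously seen filler retrieves its bound symbol,
# independently of what the filler itself looks like -- the crux of
# generalizing a rule to novel fillers.
retrieved = mem.read(filler_x)
closest = min([("a", symbol_a), ("b", symbol_b)],
              key=lambda kv: np.linalg.norm(retrieved - kv[1]))
print(closest[0])
```

Because reasoning is carried out entirely over the keys, a model organized this way can in principle apply a learned rule to fillers it has never seen, which is the behavior the abstract reports.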