Early in development, infants learn to solve visual problems that are highly challenging for current computational methods. We present a model that deals with two fundamental problems in which the gap between computational difficulty and infant learning is particularly striking: learning to recognize hands and learning to recognize gaze direction. The model is shown a stream of natural videos and learns without any supervision to detect human hands by appearance and by context, as well as direction of gaze, in complex natural scenes. The algorithm is guided by an empirically motivated innate mechanism: the detection of "mover" events in dynamic images, in which a moving image region causes a stationary region to move or change after contact. Mover events provide an internal teaching signal, which is shown to be more effective than alternative cues and sufficient for the efficient acquisition of hand and gaze representations. The implications go beyond the specific tasks, by showing how domain-specific "proto concepts" can guide the system to acquire meaningful concepts, which are significant to the observer but statistically inconspicuous in the sensory input.

A basic question in cognitive development is how we learn to understand the world on the basis of sensory perception and active exploration. Already in their first months of life, infants rapidly learn to recognize complex objects and events in their visual input (1-3). Probabilistic learning models, as well as connectionist and dynamical models, have been developed in recent years as powerful tools for extracting the unobserved causes of sensory signals (4-6). Some of these models can efficiently discover significant statistical regularities in the observed signals, which may be subtle and of high order, and use them to construct world models and guide behavior (7-10).
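The "mover" cue described above can be caricatured in a few lines of code. This is a toy sketch under stated assumptions, not the paper's implementation: a scene is reduced to tracked region centroids with per-frame motion flags, and a mover event is flagged whenever a moving region touches a stationary region that starts moving in the next frame. All names here (`detect_mover_events`, `contact_dist`, the track and motion dictionaries) are illustrative.

```python
import numpy as np

def detect_mover_events(tracks, motion, contact_dist=1.0):
    """Toy 'mover event' detector (illustrative, not the paper's method).

    tracks: dict region_id -> (T, 2) array of centroid positions per frame
    motion: dict region_id -> length-T sequence of booleans, True where
            the region is moving in that frame
    Returns a list of (t, mover_id, moved_id) tuples for frames where a
    moving region contacts a stationary region that then starts to move.
    """
    events = []
    ids = list(tracks)
    T = len(next(iter(motion.values())))
    for t in range(1, T - 1):
        for a in ids:
            if not motion[a][t]:
                continue  # candidate mover must itself be moving
            for b in ids:
                if b == a or motion[b][t]:
                    continue  # candidate "movee" must be stationary at contact
                close = np.linalg.norm(tracks[a][t] - tracks[b][t]) <= contact_dist
                if close and motion[b][t + 1]:
                    events.append((t, a, b))
    return events
```

In a real video pipeline the tracks would come from motion segmentation, but the logic above captures the causal template: motion, contact, then induced motion.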
However, even powerful statistical models have inherent difficulties with natural cognitive concepts, which depend not only on statistical regularities in the sensory input but also on their significance and meaning to the observer. For example, in learning to understand actions and goals, an important part is identifying the agents' hands, their configuration, and their interactions with objects (1-3). This is an example in which significant and meaningful features can be nonsalient and highly variable, and therefore difficult to learn. Our testing shows that current computational methods for general object detection (11-13), even when applied to large training data, do not by themselves learn about hands. In contrast, detecting hands (14), paying attention to what they are doing (15, 16), and using them to make inferences and predictions (1-3, 17) are natural for humans and appear early in development. How is it possible for infants to acquire such concepts in early development?

A large body of developmental studies has suggested that the human cognitive system is equipped through evolution with basic innate structures that facilitate the acquisition of meaningf...
We measured long-term memory for a narrative film. During the study session, participants watched a 27-min movie episode, without instructions to remember it. During the test session, administered at a delay ranging from 3 h to 9 mo after the study session, long-term memory for the movie was probed using a computerized questionnaire that assessed cued recall, recognition, and metamemory of movie events sampled ∼20 sec apart. The performance of each group of participants was measured at a single time point only. The participants remembered many events in the movie even months after watching it. Analysis of performance, using multiple measures, indicates differences between recent (weeks) and remote (months) memory. While high-confidence recognition performance was a reliable index of memory throughout the measured time span, cued recall accuracy was higher for relatively recent information. Analysis of different content elements in the movie revealed differential memory performance profiles according to time since encoding. We also used the data to propose lower limits on the capacity of long-term memory. This experimental paradigm is useful not only for the analysis of behavioral performance that results from encoding episodes in a continuous real-life-like situation, but is also suitable for studying brain substrates and processes of real-life memory using functional brain imaging.

Experimental protocols that probe brain correlates of episodic memory formation commonly use paradigms in which memoranda are presented as individual items devoid of continuous context outside of the laboratory setting (Winocur and Weiskrantz 1976; Buckner et al. 2000). In contrast, real-life episodic memory is the result of ongoing encoding within a highly contextualized and dynamically changing perceptual, cognitive, and affective framework (Tulving 1983, 2002; Suddendorf and Busby 2005).
Though the importance of real-life conditions in memory research has long been recognized (Neisser 1978; Cohen 1996), it is rather difficult to harness its naturalistic attributes in controlled, reproducible laboratory settings (Dudai 2002). Using movies as stimulus material can remedy some of these difficulties.

Movies are capable of simulating aspects of real-life experiences by fusing multimodal perception with emotional and cognitive overtones (Eisenstein 1969; Morin 2005). They also permit controlled, reproducible presentation of continuous, contextualized, and dynamic sets of to-be-remembered stimuli, and selection of cognitive and affective types of content. The use of cinematic material to probe memory can be traced to the early days of cinema (Boring 1916), but it did not catch on, a few exceptions notwithstanding (Beckner et al. 2006). Realizing the potential advantage of movies as dynamic multimodal stimuli, Hasson et al. (2004) used a fiction movie to analyze brain circuits that process perceptual and affective information while attending to the ongoing cinematic narrative, and unveiled correlated spatiotemporal brain activation pa...
Recent reports have revitalized the debate on whether, for each item in memory, consolidation occurs just once, or whether items in memory undergo reconsolidation each time they are activated in retrieval. Furthermore, it has recently been reported that following retrieval in the absence of a reinforcer, the activated memory can either reconsolidate or extinguish, depending on the training history. This raises the question of whether consolidation, extinction, and reconsolidation share neuronal mechanisms and, moreover, whether reconsolidation recapitulates consolidation. In conditioned taste aversion (CTA), consolidation depends on protein synthesis in the central nucleus of the amygdala, whereas extinction depends on protein synthesis in the basolateral nuclei of the amygdala. Here we show that inhibition of protein synthesis in either of these nuclei has no effect on CTA memory under conditions that initiate reconsolidation. This implies that reconsolidation does not recapitulate consolidation, and that consolidation, reconsolidation, and extinction are different processes.
The recent adaptation of deep neural network-based methods to reinforcement learning and planning domains has yielded remarkable progress on individual tasks. Nonetheless, progress on task-to-task transfer remains limited. In pursuit of efficient and robust generalization, we introduce the Schema Network, an object-oriented generative physics simulator capable of disentangling multiple causes of events and reasoning backward through causes to achieve goals. The richly structured architecture of the Schema Network can learn the dynamics of an environment directly from data. We compare Schema Networks with Asynchronous Advantage Actor-Critic and Progressive Networks on a suite of Breakout variations, reporting results on training efficiency and zero-shot generalization, consistently demonstrating faster, more robust learning and better transfer. We argue that generalizing from limited data and learning causal relationships are essential abilities on the path toward generally intelligent systems.
Rapid progress in the fields of learning and object recognition has been achieved by developing methods that learn from large numbers of labeled image examples. However, such methods cannot explain infants' learning of new concepts based on their visual experience: in particular, the ability to learn complex concepts without external guidance, as well as the natural order in which related concepts are acquired. A remarkable example of early visual learning is the category of 'containers' and the notion of 'containment'. Surprisingly, this is one of the earliest spatial relations to be learned, starting already around 3 months of age and preceding other common relations (e.g., 'support', 'in-between'). In this work we present a model that explains infants' capacity to learn 'containment' and related concepts by 'just looking', together with their empirical developmental trajectory. In the model, learning is fast and requires no external guidance, relying only on perceptual processes that are present in the first months of life. Instead of labeled training examples, the system provides its own internal supervision to guide the learning process. We show how the detection of so-called 'paradoxical occlusion' provides natural internal supervision, which guides the system to gradually acquire a range of useful containment-related concepts. Similar mechanisms of implicit internal supervision can have broad application in other cognitive domains as well as in artificial intelligence systems, because they alleviate the need for extensive external supervision, and because they can guide the learning process toward concepts that are meaningful to the observer, even if these are not by themselves obvious or salient in the input.
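The internal-supervision scheme these abstracts describe can be caricatured as a loop in which an innate event detector labels image patches, and those self-generated labels train an ordinary classifier. The sketch below is a minimal illustration under loud assumptions: the detector, patch extractor, fixed background-sampling choice, and nearest-centroid classifier are all hypothetical stand-ins, not the papers' architecture.

```python
import numpy as np

def train_from_internal_supervision(frames, event_detector, extract_patch):
    """Fit a nearest-centroid classifier from self-generated labels:
    patches where the innate detector fires serve as positives, and a
    fixed background location serves as the negative sample (toy choice)."""
    pos, neg = [], []
    for frame in frames:
        for loc in event_detector(frame):         # internal teaching signal
            pos.append(extract_patch(frame, loc)) # self-labeled positive
        neg.append(extract_patch(frame, (0, 0)))  # background patch as negative
    mu_pos, mu_neg = np.mean(pos, axis=0), np.mean(neg, axis=0)

    def classify(patch):
        # label +1 if the patch is closer to the positive centroid
        d_pos = np.linalg.norm(np.asarray(patch) - mu_pos)
        d_neg = np.linalg.norm(np.asarray(patch) - mu_neg)
        return 1 if d_pos < d_neg else -1

    return classify
```

The point of the sketch is the division of labor: a simple, domain-specific trigger supplies the labels, and a generic statistical learner does the rest, so no external supervision is needed.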