Tomaso Poggio scite author profile

Hierarchical models of object recognition in cortex

Visual processing in cortex is classically modeled as a hierarchy of increasingly sophisticated representations, naturally extending the model of simple to complex cells of Hubel and Wiesel. Surprisingly, little quantitative modeling has been done to explore the biological feasibility of this class of models to explain aspects of higher-level visual processing such as object recognition. We describe a new hierarchical model consistent with physiological data from inferotemporal cortex that accounts for this complex visual task and makes testable predictions. The model is based on a MAX-like operation applied to inputs to certain cortical neurons that may have a general role in cortical function.

HMDB: A large video database for human motion recognition

Kuehne

et al. 2011

With nearly one billion online videos viewed every day, an emerging new frontier in computer vision research is recognition and search in video. While much effort has been devoted to the collection and annotation of large scalable static image datasets containing thousands of image categories, human action datasets lag far behind. Current action recognition databases contain on the order of ten different action categories collected under fairly controlled conditions. State-of-the-art performance on these datasets is now near ceiling and thus there is a need for the design and creation of new benchmarks. To address this issue we collected the largest action video database to date with 51 action categories, which in total contain around 7,000 manually annotated clips extracted from a variety of sources ranging from digitized movies to YouTube. We use this database to evaluate the performance of two representative computer vision systems for action recognition and explore the robustness of these methods under various conditions such as camera motion, viewpoint, video quality and occlusion.

Robust Object Recognition with Cortex-Like Mechanisms

Serre

Wolf

Bileschi

et al. 2007

IEEE Trans. Pattern Anal. Mach. Intell.

1,461

1,268

We introduce a new general framework for the recognition of complex visual scenes, which is motivated by biology: We describe a hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating between a template matching and a maximum pooling operation. We demonstrate the strength of the approach on a range of recognition tasks: From invariant single object recognition in clutter to multiclass categorization problems and complex scene understanding tasks that rely on the recognition of both shape-based as well as texture-based objects. Given the biological constraints that the system had to satisfy, the approach performs surprisingly well: It has the capability of learning from only a few training examples and competes with state-of-the-art systems. We also discuss the existence of a universal, redundant dictionary of features that could handle the recognition of most object categories. In addition to its relevance for computer vision, the success of this approach suggests a plausibility proof for a class of feedforward models of object recognition in cortex.
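
The alternation between template matching and maximum pooling described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the one-dimensional input, the Gaussian tuning function, and the names `s_layer`, `c_layer`, `sigma`, and `pool_size` are all assumptions chosen for illustration. It only shows how pooling with a maximum can make a response tolerant to small shifts of the input.

```python
import math

def s_layer(signal, template, sigma=1.0):
    """Template matching (hypothetical helper): Gaussian-tuned response
    of the template at every window position of a 1-D signal."""
    width = len(template)
    responses = []
    for start in range(len(signal) - width + 1):
        window = signal[start:start + width]
        dist_sq = sum((w - t) ** 2 for w, t in zip(window, template))
        responses.append(math.exp(-dist_sq / (2 * sigma ** 2)))
    return responses

def c_layer(responses, pool_size):
    """Maximum pooling (hypothetical helper): max over non-overlapping
    neighborhoods of adjacent positions."""
    return [max(responses[i:i + pool_size])
            for i in range(0, len(responses), pool_size)]

template = [0.0, 1.0, 0.0]
signal_a = [0, 0, 1, 0, 0, 0, 0, 0]
signal_b = [0, 0, 0, 1, 0, 0, 0, 0]  # same pattern, shifted by one position

s_a = s_layer(signal_a, template)
s_b = s_layer(signal_b, template)
c_a = c_layer(s_a, pool_size=3)
c_b = c_layer(s_b, pool_size=3)

print(s_a == s_b)  # the template-matching responses differ between inputs
print(c_a == c_b)  # the pooled responses agree for this small shift
```

In this toy case the shift moves the peak response within a single pooling neighborhood, so the pooled output is unchanged. A shift that crosses a neighborhood boundary would change it, which is one reason hierarchical models stack several such stages.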

Prediction of central nervous system embryonal tumour outcome based on gene expression

Pomeroy

Tamayo

Gaasenbeek

et al. 2002

Nature

2,102

1,164

Embryonal tumours of the central nervous system (CNS) represent a heterogeneous group of tumours about which little is known biologically, and whose diagnosis, on the basis of morphologic appearance alone, is controversial. Medulloblastomas, for example, are the most common malignant brain tumour of childhood, but their pathogenesis is unknown, their relationship to other embryonal CNS tumours is debated, and patients' response to therapy is difficult to predict. We approached these problems by developing a classification system based on DNA microarray gene expression data derived from 99 patient samples. Here we demonstrate that medulloblastomas are molecularly distinct from other brain tumours including primitive neuroectodermal tumours (PNETs), atypical teratoid/rhabdoid tumours (AT/RTs) and malignant gliomas. Previously unrecognized evidence supporting the derivation of medulloblastomas from cerebellar granule cells through activation of the Sonic Hedgehog (SHH) pathway was also revealed. We show further that the clinical outcome of children with medulloblastomas is highly predictable on the basis of the gene expression profiles of their tumours at diagnosis.

Multiclass cancer diagnosis using tumor gene expression signatures

Ramaswamy

Tamayo

Rifkin

et al. 2001

Proc. Natl. Acad. Sci. U.S.A.

1,740

1,160

The optimal treatment of patients with cancer depends on establishing accurate diagnoses by using a complex combination of clinical and histopathological data. In some instances, this task is difficult or impossible because of atypical clinical presentation or histopathology. To determine whether the diagnosis of multiple common adult malignancies could be achieved purely by molecular classification, we subjected 218 tumor samples, spanning 14 common tumor types, and 90 normal tissue samples to oligonucleotide microarray gene expression analysis. The expression levels of 16,063 genes and expressed sequence tags were used to evaluate the accuracy of a multiclass classifier based on a support vector machine algorithm. Overall classification accuracy was 78%, far exceeding the accuracy of random classification (9%). Poorly differentiated cancers resulted in low-confidence predictions and could not be accurately classified according to their tissue of origin, indicating that they are molecularly distinct entities with dramatically different gene expression patterns compared with their well differentiated counterparts. Taken together, these results demonstrate the feasibility of accurate, multiclass molecular cancer classification and suggest a strategy for future clinical implementation of molecular cancer diagnostics.

Networks for approximation and learning

1990

A feedforward architecture accounts for rapid categorization

Serre

Oliva

Poggio

2007

Proc. Natl. Acad. Sci. U.S.A.

800

862

Primates are remarkably good at recognizing objects. The level of performance of their visual system and its robustness to image degradations still surpasses the best computer vision systems despite decades of engineering effort. In particular, the high accuracy of primates in ultra rapid object categorization and rapid serial visual presentation tasks is remarkable. Given the number of processing stages involved and typical neural latencies, such rapid visual processing is likely to be mostly feedforward. Here we show that a specific implementation of a class of feedforward theories of object recognition (that extend the Hubel and Wiesel simple-to-complex cell hierarchy and account for many anatomical and physiological constraints) can predict the level and the pattern of performance achieved by humans on a rapid masked animal vs. non-animal categorization task.

Keywords: object recognition | computational model | visual cortex | natural scenes | preattentive vision

Object recognition in the cortex is mediated by the ventral visual pathway running from the primary visual cortex (V1) (1) through extrastriate visual areas II (V2) and IV (V4), to the inferotemporal cortex (IT) (2-4), and then to the prefrontal cortex (PFC), which is involved in linking perception to memory and action. Over the last decade, a number of physiological studies in nonhuman primates have established several basic facts about the cortical mechanisms of recognition. The accumulated evidence points to several key features of the ventral pathway. From V1 to IT, there is an increase in invariance to position and scale (1, 2, 4-6) and in parallel, an increase in the size of the receptive fields (2, 4) as well as in the complexity of the optimal stimuli for the neurons (2, 3, 7). Finally, plasticity and learning are probably present at all stages and certainly at the level of IT (6) and PFC.

However, an important aspect of the visual architecture, i.e., the role of the anatomical back projections abundantly present between almost all of the areas in the visual cortex, remains a matter of debate. The hypothesis that the basic processing of information is feedforward is supported most directly by the short time spans required for a selective response to appear in IT cells (8). Very recent data (9) show that the activity of small neuronal populations in monkey IT, over very short time intervals (as small as 12.5 ms) and only ≈100 ms after stimulus onset, contains surprisingly accurate and robust information supporting a variety of recognition tasks. Although this finding does not rule out local feedback loops within an area, it does suggest that a core hierarchical feedforward architecture may be a reasonable starting point for a theory of visual cortex aiming to explain immediate recognition, the initial phase of recognition before eye movements and high-level processes can play a role (10-13).

One of the first feedforward models, Fukushima's Neocognitron (14), followed the basic Hubel and Wiesel proposal (1) for building an increasingly complex and invarian...

Face recognition: features versus templates

Brunelli

Poggio

1993

IEEE Trans. Pattern Anal. Machine Intell.

2,035

856
