We derive a new self-organizing learning algorithm that maximizes the information transferred in a network of nonlinear units. The algorithm does not assume any knowledge of the input distributions, and is defined here for the zero-noise limit. Under these conditions, information maximization has extra properties not found in the linear case (Linsker 1989). The nonlinearities in the transfer function are able to pick up higher-order moments of the input distributions and perform something akin to true redundancy reduction between units in the output representation. This enables the network to separate statistically independent components in the inputs: a higher-order generalization of principal components analysis. We apply the network to the source separation (or cocktail party) problem, successfully separating unknown mixtures of up to 10 speakers. We also show that a variant on the network architecture is able to perform blind deconvolution (cancellation of unknown echoes and reverberation in a speech signal). Finally, we derive dependencies of information transfer on time delays. We suggest that information maximization provides a unifying framework for problems in "blind" signal processing.
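To make the algorithm concrete, here is a minimal NumPy sketch of the learning rule in its natural-gradient form with a logistic transfer function; the two-source setup, mixing matrix, learning rate, and batch size are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: two super-gaussian (Laplacian) sources, linearly mixed.
n, T = 2, 10000
S = rng.laplace(size=(n, T))          # unknown independent sources
A = rng.normal(size=(n, n))           # unknown mixing matrix (assumed invertible)
X = A @ S                             # observed mixtures

W = np.eye(n)                         # unmixing matrix to be learned
lr = 0.01                             # learning rate (illustrative)
batch = 100

for epoch in range(50):
    for t in range(0, T, batch):
        x = X[:, t:t + batch]
        u = W @ x                     # unit activities before the nonlinearity
        y = 1.0 / (1.0 + np.exp(-u))  # logistic transfer function
        # Natural-gradient infomax update, averaged over the batch:
        # dW ~ (I + (1 - 2y) u^T) W
        dW = (np.eye(n) + (1.0 - 2.0 * y) @ u.T / batch) @ W
        W += lr * dW

U = W @ X                             # recovered sources, up to scale and permutation
```

Because the logistic nonlinearity matches super-gaussian source distributions, the product W @ A approaches a scaled permutation of the identity as training proceeds, which is exactly the separation criterion.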
It has previously been suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and it has been reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that a new unsupervised learning algorithm based on information maximization, a nonlinear "infomax" network, when applied to an ensemble of natural scenes produces sets of visual filters that are localized and oriented. Some of these filters are Gabor-like and resemble those produced by the sparseness-maximization network. In addition, the outputs of these filters are as independent as possible, since this infomax network performs Independent Components Analysis or ICA, for sparse (super-gaussian) component distributions. We compare the resulting ICA filters and their associated basis functions with other decorrelating filters produced by Principal Components Analysis (PCA) and zero-phase whitening filters (ZCA). The ICA filters have more sparsely distributed (kurtotic) outputs on natural scenes. They also resemble the receptive fields of simple cells in visual cortex, which suggests that these neurons form a natural, information-theoretic coordinate system for natural images.
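The contrast between the decorrelating transforms compared above can be made explicit: PCA and ZCA whitening differ only by a rotation, with ZCA applying the symmetric inverse square root of the covariance (hence "zero-phase"). A minimal sketch, assuming image patches arrive as rows of a data matrix; the patch size and the regularizer eps are illustrative choices:

```python
import numpy as np

def whitening_filters(X, eps=1e-5):
    """Return PCA and ZCA whitening matrices for data X (samples x dims).

    Both outputs decorrelate X; they differ by a rotation. eps is a small
    regularizer for near-zero eigenvalues (an assumption of this sketch).
    """
    Xc = X - X.mean(axis=0)
    C = np.cov(Xc, rowvar=False)            # dims x dims covariance
    evals, E = np.linalg.eigh(C)            # C = E diag(evals) E^T
    D_inv_sqrt = np.diag(1.0 / np.sqrt(evals + eps))
    W_pca = D_inv_sqrt @ E.T                # PCA whitening (rotated basis)
    W_zca = E @ D_inv_sqrt @ E.T            # ZCA: symmetric, zero-phase
    return W_pca, W_zca

# Usage with random stand-in "patches"; a real experiment would use
# patches sampled from natural images.
X = np.random.default_rng(1).normal(size=(5000, 64))   # 8x8 patches, flattened
W_pca, W_zca = whitening_filters(X)
```

Rows of W_zca, reshaped to 8x8, are the zero-phase whitening filters; ICA then finds the additional rotation that makes the filter outputs as sparse (kurtotic) as possible.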
Current analytical techniques applied to functional magnetic resonance imaging (fMRI) data require a priori knowledge or specific assumptions about the time courses of processes contributing to the measured signals. Here we describe a new method for analyzing fMRI data based on the independent component analysis (ICA) algorithm of Bell and Sejnowski ([1995]: Neural Comput 7:1129-1159). We decomposed eight fMRI data sets from four normal subjects performing Stroop color-naming, the Brown and Peterson word/number task, and control tasks into spatially independent components. Each component consisted of voxel values at fixed three-dimensional locations (a component "map"), and a unique associated time course of activation. Given data from 144 time points collected during a 6-min trial, ICA extracted an equal number of spatially independent components. In all eight trials, ICA derived one and only one component with a time course closely matching the time course of 40-sec alternations between experimental and control tasks. The regions of maximum activity in these consistently task-related components generally overlapped active regions detected by standard correlational analysis, but included frontal regions not detected by correlation. Time courses of other ICA components were transiently task-related, quasiperiodic, or slowly varying. By utilizing higher-order statistics to enforce successively stricter criteria for spatial independence between component maps, both the ICA algorithm and a related fourth-order decomposition technique (Comon [1994]: Signal Processing 36:11-20) were superior to principal component analysis (PCA) in determining the spatial and temporal extent of task-related activation. For each subject, the time courses and active regions of the task-related ICA components were consistent across trials and were robust to the addition of simulated noise. Simulated movement artifact and simulated task-related activations added to actual fMRI data were clearly separated by the algorithm. ICA can be used to distinguish between nontask-related signal components, movements, and other artifacts, as well as consistently or transiently task-related fMRI activations, based on only weak assumptions about their spatial distributions and without a priori assumptions about their time courses. ICA appears to be a highly promising method for the analysis of fMRI data from normal and clinical populations, especially for uncovering unpredictable transient patterns of brain activity associated with performance of psychomotor tasks.
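The decomposition described here is spatial ICA: the data matrix (time points x voxels) is factored into spatially independent maps, each paired with a single time course. The sketch below uses scikit-learn's FastICA as a readily available stand-in for the infomax algorithm used in the study, on synthetic data matching the study's 144-scan length; the voxel count, component count, boxcar task signal, and "active region" are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(2)

# Stand-in data: 144 time points x 4000 voxels. The signal is one boxcar
# "task" component (9 alternating blocks of 16 scans) plus noise.
n_time, n_vox = 144, 4000
boxcar = np.kron(np.array([0.0, 1.0] * 4 + [0.0]), np.ones(16))  # length 144
task_map = np.zeros(n_vox)
task_map[:200] = 1.0                    # hypothetical active region
X = np.outer(boxcar, task_map) + 0.5 * rng.normal(size=(n_time, n_vox))

# Spatial ICA: voxels play the role of samples, so decompose X^T.
# (The study extracted as many components as time points; 20 is used
# here only to keep the sketch fast.)
ica = FastICA(n_components=20, random_state=0)
maps = ica.fit_transform(X.T).T         # (components, voxels): independent maps
time_courses = ica.mixing_              # (time, components): one course per map

# The task-related component is the one whose time course best matches
# the experimental/control alternation.
corrs = [abs(np.corrcoef(tc, boxcar)[0, 1]) for tc in time_courses.T]
best = int(np.argmax(corrs))
```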
Averaged event-related potential (ERP) data recorded from the human scalp reveal electroencephalographic (EEG) activity that is reliably time-locked and phase-locked to experimental events. We report here the application of a method based on information theory that decomposes one or more ERPs recorded at multiple scalp sensors into a sum of components with fixed scalp distributions and sparsely activated, maximally independent time courses. Independent component analysis (ICA) decomposes ERP data into a number of components equal to the number of sensors. The derived components have distinct but not necessarily orthogonal scalp projections. Unlike dipole-fitting methods, the algorithm does not model the locations of their generators in the head. Unlike methods that remove second-order correlations, such as principal component analysis (PCA), ICA also minimizes higher-order dependencies. Applied to detected- and undetected-target ERPs from an auditory vigilance experiment, the algorithm derived ten components that decomposed each of the major response peaks into one or more ICA components with relatively simple scalp distributions. Three of these components were active only when the subject detected the targets, three other components only when the target went undetected, and one in both cases. Three additional components accounted for the steady-state brain response to a 39-Hz background click train. Major features of the decomposition proved robust across sessions and changes in sensor number and placement. This method of ERP analysis can be used to compare responses from multiple stimuli, task conditions, and subject states.

Although the locations of the brain areas generating event-related potentials (ERPs) cannot be uniquely determined by scalp recordings from any number of channels (1), several methods have been proposed for decomposing evoked responses into activations of distinct neural sources. Most of these also attempt to locate the active areas, by assuming either that they have a known or simple spatial configuration (2) or that generators are restricted to a small subset of possible locations and orientations (3). Other methods based on rotations of principal components use optimization criteria not directly related to brain anatomy and physiology. These methods may assume that each response component has the same time course of activation in every experimental condition (4). All these methods use second-order spatiotemporal correlations to perform the decomposition.

Here we report a statistical method for decomposing one or more event-related brain responses into a sum of components with spatially fixed scalp distributions and maximally independent (though possibly overlapping) time courses. Independence requires the absence of higher-order as well as second-order correlations between the time courses. Independence, therefore, is a stronger condition than decorrelation and, in particular, is not satisfied by decomposition into principal components by principal component analysis (PCA). Although the ...
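In contrast to the spatial decomposition used for fMRI, this ERP decomposition is temporal ICA: with as many components as sensors, each component gets a fixed scalp map and a maximally independent time course. A minimal sketch on synthetic channel data follows; the channel count, the source model, and the use of FastICA in place of the original infomax algorithm are assumptions of this sketch.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(3)

# Synthetic stand-in for ERP data: 14 scalp channels x 5000 time points.
n_chan, n_time = 14, 5000
S = rng.laplace(size=(n_chan, n_time))   # sparse source activations
A = rng.normal(size=(n_chan, n_chan))    # fixed scalp projections
X = A @ S                                # recorded channel data

# Square temporal ICA: as many components as sensors, so time points
# play the role of samples and we decompose X^T.
ica = FastICA(n_components=n_chan, random_state=0)
activations = ica.fit_transform(X.T).T   # (components, time): independent courses
scalp_maps = ica.mixing_                 # (channels, components): fixed maps

# Sanity check: the channel data are reconstructed as the sum of the
# components' scalp projections, up to ICA's mean-centering.
X_hat = scalp_maps @ activations + ica.mean_[:, None]
```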
The analysis of electroencephalographic (EEG) and magnetoencephalographic (MEG) ...
A system responding to a stochastic driving signal can be interpreted as computing, by means of its dynamics, an implicit model of the environmental variables. The system's state retains information about past environmental fluctuations, and a fraction of this information is predictive of future ones. The remaining nonpredictive information reflects model complexity that does not improve predictive power, and thus represents the ineffectiveness of the model. We expose the fundamental equivalence between this model inefficiency and thermodynamic inefficiency, measured by the energy dissipated during the interaction between system and environment. Our results hold arbitrarily far from thermodynamic equilibrium and are applicable to a wide range of systems, including biomolecular machines. They highlight a profound connection between the effective use of information and efficient thermodynamic operation: any system constructed to keep memory about its environment and to operate with maximal energetic efficiency has to be predictive.
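The quantities in this argument can be stated compactly. The block below fixes illustrative notation (s_t for the system state, x_t for the driving signal, beta = 1/k_B T); it paraphrases the relation between nonpredictive information and dissipation as a bound, rather than reproducing the paper's detailed derivation.

```latex
% Memory the state s_t keeps about the current signal x_t, and the
% portion of that memory that is predictive of the next signal x_{t+1}:
I_{\mathrm{mem}} = I[s_t;\, x_t], \qquad I_{\mathrm{pred}} = I[s_t;\, x_{t+1}].

% The nonpredictive remainder is the model inefficiency; it bounds the
% work dissipated during the step x_t \to x_{t+1} from below:
\beta\,\big\langle W_{\mathrm{diss}}[x_t \to x_{t+1}] \big\rangle \;\ge\; I_{\mathrm{mem}} - I_{\mathrm{pred}}.
```

In words: any memory the system carries that does not help predict the upcoming signal must be paid for in dissipated work, so a maximally efficient system is necessarily predictive.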