We recently proposed frequent itemset mining (FIM) as a method to perform an optimized search for patterns of synchronous spikes (item sets) in massively parallel spike trains. This search outputs the occurrence count (support) of individual patterns that are not trivially explained by the counts of any superset (closed frequent item sets). The number of patterns found by FIM makes direct statistical tests infeasible due to severe multiple testing. To overcome this issue, we proposed to test the significance not of individual patterns, but instead of their signatures, defined as the pairs of pattern size z and support c. Here, we derive in detail a statistical test for the significance of the signatures under the null hypothesis of full independence (pattern spectrum filtering, PSF) by means of surrogate data. As a result, injected spike patterns that mimic assembly activity are well detected, yielding a low false negative rate. However, this approach is prone to additionally classify as significant those patterns that result from chance overlap of real assembly activity and background spiking. These patterns represent false positives with respect to the null hypothesis of having one assembly of given signature embedded in otherwise independent spiking activity. We propose the additional method of pattern set reduction (PSR) to remove these false positives by conditional filtering. By employing stochastic simulations of parallel spike trains with correlated activity in the form of injected spike synchrony in subsets of the neurons, we demonstrate for a range of parameter settings that the analysis scheme composed of FIM, PSF and PSR reliably detects active assemblies in massively parallel spike trains.
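The notions of closed frequent item set and signature (z, c) used above can be illustrated with a small sketch. It is not the optimized FIM algorithm referred to in the abstract: it is a naive enumeration that relies on the fact that closed item sets are exactly the intersections of one or more transactions. The function name `closed_frequent_itemsets`, the thresholds `min_size` and `min_support`, and the toy data are assumptions made for illustration only.

```python
from collections import Counter

def closed_frequent_itemsets(transactions, min_size=2, min_support=2):
    """Naive closed item set mining (illustrative, not optimized).

    Each transaction is the set of neurons that spike in one time bin.
    Closed item sets are the non-empty intersections of transactions.
    """
    closed = {frozenset(t) for t in transactions if t}
    frontier = set(closed)
    # Intersect newly found sets with all known sets until nothing new appears.
    while frontier:
        new = set()
        for a in frontier:
            for b in closed:
                inter = a & b
                if inter and inter not in closed:
                    new.add(inter)
        closed |= new
        frontier = new
    result = {}
    for itemset in closed:
        # Support = number of time bins that contain the whole pattern.
        support = sum(1 for t in transactions if itemset <= set(t))
        if len(itemset) >= min_size and support >= min_support:
            result[itemset] = support
    return result

# Toy data (hypothetical): neuron ids active in six time bins.
bins = [{1, 2, 3}, {1, 2, 3, 4}, {1, 2}, {5}, {1, 2, 3, 5}, {4, 5}]
patterns = closed_frequent_itemsets(bins)

# Pattern spectrum: number of patterns per signature (size z, support c).
spectrum = Counter((len(p), c) for p, c in patterns.items())
```

In this toy example the pattern {1, 2} is kept alongside {1, 2, 3} because it occurs in one more bin than its superset, so its count is not explained by the superset. A significance test in the spirit of PSF would compare such a spectrum against spectra obtained from surrogate data; that step is not shown here.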
We present a regression technique for data-driven problems based on polynomial chaos expansion (PCE). PCE is a popular technique in the field of uncertainty quantification (UQ), where it is typically used to replace a runnable but expensive computational model subject to random inputs with an inexpensive-to-evaluate polynomial function. The metamodel obtained enables a reliable estimation of the statistics of the output, provided that a suitable probabilistic model of the input is available. Machine learning (ML) regression is a research field that focuses on providing purely data-driven input-output maps, with the focus on pointwise prediction accuracy. We show that a PCE metamodel purely trained on data can yield pointwise predictions whose accuracy is comparable to that of other ML regression models, such as neural networks and support vector machines. The comparisons are performed on benchmark datasets available from the literature. The methodology also enables the quantification of the output uncertainties, and is robust to noise. Furthermore, it enjoys additional desirable properties, such as good performance for small training sets and simplicity of construction, with only little parameter tuning required.
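A minimal sketch of the idea of training a PCE purely on data, under strong simplifying assumptions that are not taken from the abstract: a single input, assumed uniformly distributed over the observed range, a Legendre basis, and ordinary least squares for the coefficients. The function name `fit_pce_1d` and the toy data are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def fit_pce_1d(x, y, degree=5):
    """Least-squares Legendre expansion of y as a function of x.

    Assumption: x is treated as uniform on [min(x), max(x)], so that
    Legendre polynomials form the matching orthogonal basis.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lo, hi = x.min(), x.max()
    u = 2.0 * (x - lo) / (hi - lo) - 1.0      # map inputs to [-1, 1]
    design = legendre.legvander(u, degree)    # columns P_0(u) .. P_degree(u)
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)

    def predict(x_new):
        u_new = 2.0 * (np.asarray(x_new, dtype=float) - lo) / (hi - lo) - 1.0
        return legendre.legval(u_new, coeffs)

    # Output moments follow from the coefficients, because
    # E[P_k(U)^2] = 1 / (2k + 1) for U uniform on [-1, 1].
    k = np.arange(1, degree + 1)
    mean = coeffs[0]
    var = np.sum(coeffs[1:] ** 2 / (2 * k + 1))
    return predict, mean, var

# Toy data (hypothetical): noisy observations of y = x^2 on [0, 2].
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 2.0, size=200)
y_train = x_train**2 + rng.normal(0.0, 0.01, size=200)
predict, mean, var = fit_pce_1d(x_train, y_train, degree=3)
```

The same fitted coefficients give both pointwise predictions (`predict`) and output statistics (`mean`, `var`), which is the property the abstract highlights. A realistic application would need a multivariate basis and an input distribution inferred from the data.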
The computational role of spike time synchronization at millisecond precision among neurons in the cerebral cortex is hotly debated. Studies performed on data of limited size provided experimental evidence that low-order correlations occur in relation to behavior. Advances in electrophysiological technology to record from hundreds of neurons simultaneously provide the opportunity to observe coordinated spiking activity of larger populations of cells. We recently published a method that combines data mining and statistical evaluation to search for significant patterns of synchronous spikes in massively parallel spike trains (Torre et al., 2013). The method solves the computational and multiple testing problems raised by the high dimensionality of the data. In the current study, we used our method on simultaneous recordings from two macaque monkeys engaged in an instructed-delay reach-to-grasp task to determine the emergence of spike synchronization in relation to behavior. We found a multitude of synchronous spike patterns aligned in both monkeys along a preferential mediolateral orientation in brain space. The occurrence of the patterns is highly specific to behavior, indicating that different behaviors are associated with the synchronization of different groups of neurons ("cell assemblies"). However, pooled patterns that overlap in neuronal composition exhibit no specificity, suggesting that exclusive cell assemblies become active during different behaviors, but can recruit partly identical neurons. These findings are consistent across multiple recording sessions analyzed across the two monkeys.
Systems subject to uncertain inputs produce uncertain responses. Uncertainty quantification (UQ) deals with the estimation of statistics of the system response, given a computational model of the system and a probabilistic model of its inputs. In engineering applications it is common to assume that the inputs are mutually independent or coupled by a Gaussian or elliptical dependence structure (copula). In this paper we overcome such limitations by modelling the dependence structure of multivariate inputs as vine copulas. Vine copulas are models of multivariate dependence built from simpler pair-copulas. The vine representation is flexible enough to capture complex dependencies. This paper formalises the framework needed to build vine copula models of multivariate inputs and to combine them with virtually any UQ method. The framework allows for a fully automated, data-driven inference of the probabilistic input model on available input data. The procedure is exemplified on two finite element models of truss structures, both subject to inputs with non-Gaussian dependence structures. For each case, we analyse the moments of the model response (using polynomial chaos expansions), and perform a structural reliability analysis to calculate the probability of failure of the system (using the first order reliability method and importance sampling). Reference solutions are obtained by Monte Carlo simulation. The results show that, while the Gaussian assumption yields biased statistics, the vine copula representation achieves significantly more precise estimates, even when its structure needs to be fully inferred from a limited amount of observations.
Uncertainty Quantification (UQ) estimates statistics of the response of a system subject to stochastic inputs. The system is usually described by a deterministic computational model $\mathcal{M}$ (e.g., a finite element code). The input consists of $M$ possibly coupled parameters, modelled by a random vector $\mathbf{X}$ with joint cumulative distribution function (CDF) $F_{\mathbf{X}}$ and probability density (PDF) $f_{\mathbf{X}}$. The computational model transforms $\mathbf{X}$ into an uncertain output $Y = \mathcal{M}(\mathbf{X})$, which here we take to be a univariate random variable. The extension to multivariate outputs is straightforward. Of interest in UQ problems are various statistics of $Y$, such as its CDF $F_Y$, its moments, the probability of extreme events (i.e., of small or large quantiles), the sensitivity of $Y$ to the different components $X_i$ of $\mathbf{X}$, and others. Because $\mathcal{M}$ is typically a complex model which is not known explicitly, analytical solutions are in general not available. The model behavior can only be known point-wise in correspondence with inputs $\mathbf{x}^{(j)}$ sampled from $F_{\mathbf{X}}$, where it produces responses $y^{(j)} = \mathcal{M}(\mathbf{x}^{(j)})$ (non-intrusive, or black-box approach). The classical and most general strategy to solve this class of problems is by Monte Carlo simulation (MCS). MCS draws the $\mathbf{x}^{(j)}$ as i.i.d. samples from $F_{\mathbf{X}}$, which requires the sample size $n$ to be large enough to cover the input probability space sufficiently well. When $\mathcal{M}$ is computation...
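The Monte Carlo strategy described above can be sketched as follows. The toy function `model` is a hypothetical stand-in for the black-box model, and the inputs are assumed to be two independent standard normal variables, which is a simplification and not the vine copula input model of the paper. The names `monte_carlo_uq`, `sample_inputs` and `threshold` are likewise illustrative.

```python
import numpy as np

def model(x):
    """Hypothetical stand-in for the black-box computational model."""
    return x[:, 0] ** 2 + 0.5 * x[:, 1]

def sample_inputs(rng, n):
    """Draw n i.i.d. input vectors (assumed: 2 independent standard normals)."""
    return rng.standard_normal((n, 2))

def monte_carlo_uq(model, sample_inputs, n, threshold, seed=0):
    """Estimate output statistics from n point-wise model evaluations."""
    rng = np.random.default_rng(seed)
    x = sample_inputs(rng, n)          # input samples x^(j)
    y = model(x)                       # responses y^(j)
    return {
        "mean": float(y.mean()),
        "var": float(y.var(ddof=1)),
        # Probability of an extreme event, here exceeding a threshold.
        "p_exceed": float(np.mean(y > threshold)),
    }

results = monte_carlo_uq(model, sample_inputs, n=200_000, threshold=4.0)
```

For this toy model the exact mean is 1 and the exact variance is 2.25, so the estimates can be checked directly. The need for many model evaluations, visible here in the large `n`, is what makes plain MCS costly when each evaluation is expensive.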
With the ability to observe the activity from large numbers of neurons simultaneously using modern recording technologies, the chance to identify sub-networks involved in coordinated processing increases. Sequences of synchronous spike events (SSEs) constitute one type of such coordinated spiking that propagates activity in a temporally precise manner. The synfire chain was proposed as one potential model for such network processing. Previous work introduced a method for visualization of SSEs in massively parallel spike trains, based on an intersection matrix that contains in each entry the degree of overlap of active neurons in two corresponding time bins. Repeated SSEs are reflected in the matrix as diagonal structures of high overlap values. The method as such, however, leaves the task of identifying these diagonal structures to visual inspection rather than to a quantitative analysis. Here we present ASSET (Analysis of Sequences of Synchronous EvenTs), an improved, fully automated method which determines diagonal structures in the intersection matrix by a robust mathematical procedure. The method consists of a sequence of steps that i) assess which entries in the matrix potentially belong to a diagonal structure, ii) cluster these entries into individual diagonal structures and iii) determine the neurons composing the associated SSEs. We employ parallel point processes generated by stochastic simulations as test data to demonstrate the performance of the method under a wide range of realistic scenarios, including different types of non-stationarity of the spiking activity and different correlation structures. Finally, the ability of the method to discover SSEs is demonstrated on complex data from large network simulations with embedded synfire chains. Thus, ASSET represents an effective and efficient tool to analyze massively parallel spike data for temporal sequences of synchronous activity.
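The intersection matrix described above can be sketched in a few lines. This is only the raw overlap count between time bins, assuming binary binned spike data with shape (neurons, bins); any normalization of the overlap and the three automated ASSET steps are not shown. The function name `intersection_matrix` and the toy data are assumptions.

```python
import numpy as np

def intersection_matrix(binned):
    """Entry (i, j) counts the neurons active in both bin i and bin j."""
    b = (np.asarray(binned) > 0).astype(int)   # clip counts to 0/1
    return b.T @ b

# Toy data (hypothetical): 9 neurons, 16 time bins.
n_neurons, n_bins = 9, 16
binned = np.zeros((n_neurons, n_bins), dtype=int)

# A sequence of three synchronous events (one neuron group per bin),
# occurring once from bin 2 and repeated from bin 10.
groups = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
for start in (2, 10):
    for step, group in enumerate(groups):
        binned[list(group), start + step] = 1

imat = intersection_matrix(binned)

# The repeated sequence appears as a diagonal of high overlap values.
diagonal = [int(imat[2 + k, 10 + k]) for k in range(3)]
off_diagonal = int(imat[2, 11])
```

Here the entries (2, 10), (3, 11) and (4, 12) each have overlap 3, while entries next to that diagonal are 0. In noisy data such diagonals are much less clear-cut, which is the detection problem the abstract addresses.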
Repeated, precise sequences of spikes are largely considered a signature of activation of cell assemblies. These repeated sequences are commonly known under the name of spatio-temporal patterns (STPs). STPs are hypothesized to play a role in the communication of information in the computational process operated by the cerebral cortex. A variety of statistical methods for the detection of STPs have been developed and applied to electrophysiological recordings, but such methods scale poorly with the current size of available parallel spike train recordings (more than 100 neurons). In this work, we introduce a novel method capable of overcoming the computational and statistical limits of existing analysis techniques in detecting repeating STPs within massively parallel spike trains (MPST). We employ advanced data mining techniques to efficiently extract repeating sequences of spikes from the data. Then, we introduce and compare two alternative approaches to distinguish statistically significant patterns from chance sequences. The first approach uses a measure known as conceptual stability, of which we investigate a computationally cheap approximation for applications to such large data sets. The second approach is based on the evaluation of pattern statistical significance. In particular, we provide an extension to STPs of a method we recently introduced for the evaluation of statistical significance of synchronous spike patterns. The performance of the two approaches is evaluated in terms of computational load and statistical power on a variety of artificial data sets that replicate specific features of experimental data. Both methods provide an effective and robust procedure for detection of STPs in MPST data. The method based on significance evaluation shows the best overall performance, although at a higher computational cost. We name the novel procedure the spatio-temporal Spike PAttern Detection and Evaluation (SPADE) analysis.
Temporally precise correlations between simultaneously recorded neurons have been interpreted as signatures of cell assemblies, i.e., groups of neurons that form processing units. Evidence for this hypothesis was found on the level of pairwise correlations in simultaneous recordings of few neurons. Increasing the number of simultaneously recorded neurons increases the chances to detect cell assembly activity due to the larger sample size. Recent technological advances have enabled the recording of 100 or more neurons in parallel. However, these massively parallel spike train data require novel statistical tools to be analyzed for correlations, because they raise considerable combinatorial and multiple testing issues. Recently, various such methods have been developed. First approaches were based on population or pairwise measures of synchronization, and later led to methods for the detection of various types of higher-order synchronization and of spatio-temporal patterns. The latest techniques combine data mining with analysis of statistical significance. Here, we give a comparative overview of these methods, of their assumptions and of the types of correlations they can detect. Electronic supplementary material: The online version of this article (10.1007/s00422-018-0755-0) contains supplementary material, which is available to authorized users.