The natural habitat of most Bayesian methods is data represented by exchangeable sequences of observations, for which de Finetti's theorem provides the theoretical foundation. Dirichlet process clustering, Gaussian process regression, and many other parametric and nonparametric Bayesian models fall within the remit of this framework; many problems arising in modern data analysis do not. This article provides an introduction to Bayesian models of graphs, matrices, and other data that can be modeled by random structures. We describe results in probability theory that generalize de Finetti's theorem to such data and discuss their relevance to nonparametric Bayesian modeling. With the basic ideas in place, we survey example models available in the literature; applications of such models include collaborative filtering, link prediction, and graph and network analysis. We also highlight connections to recent developments in graph theory and probability, and sketch the more general mathematical foundation of Bayesian methods for other types of data beyond sequences and arrays.
Sparse exchangeable graphs on R + , and the associated graphex framework for sparse graphs, generalize exchangeable graphs on N, and the associated graphon framework for dense graphs. We develop the graphex framework as a tool for statistical network analysis by identifying the sampling scheme that is naturally associated with the models of the framework, formalizing two natural notions of consistent estimation of the parameter (the graphex) underlying these models, and identifying general consistent estimators in each case. The sampling scheme is a modification of independent vertex sampling that throws away vertices that are isolated in the sampled subgraph. The estimators are variants of the empirical graphon estimator, which is known to be a consistent estimator for the distribution of dense exchangeable graphs; both can be understood as graph analogues to the empirical distribution in the i.i.d. sequence setting. Our results may be viewed as a generalization of consistent estimation via the empirical graphon from the dense graph regime to also include sparse graphs.
Abstract. The standard approach to program transformation involves the use of discrete logical reasoning to prove that the transformation does not change the observable semantics of the program. We propose a new approach that, in contrast, uses probabilistic reasoning to justify the application of transformations that may change, within probabilistic accuracy bounds, the result that the program produces. Our new approach produces probabilistic guarantees of the form P(|D| ≥ B) ≤ , ∈ (0, 1), where D is the difference between the results that the transformed and original programs produce, B is an acceptability bound on the absolute value of D, and is the maximum acceptable probability of observing large |D|. We show how to use our approach to justify the application of loop perforation (which transforms loops to execute fewer iterations) to a set of computational patterns.
Understanding generalization in deep learning is arguably one of the most important questions in deep learning. Deep learning has been successfully adopted to a large number of problems ranging from pattern recognition to complex decision making, but many recent researchers have raised many concerns about deep learning, among which the most important is generalization. Despite numerous attempts, conventional statistical learning approaches have yet been able to provide a satisfactory explanation on why deep learning works. A recent line of works aims to address the problem by trying to predict the generalization performance through complexity measures. In this competition, we invite the community to propose complexity measures that can accurately predict generalization of models. A robust and general complexity measure would potentially lead to a better understanding of deep learning's underlying mechanism and behavior of deep models on unseen data, or shed light on better generalization bounds. All these outcomes will be important for making deep learning more robust and reliable. * Lead organizer: Yiding Jiang; Scott Yak and Pierre Foret help implement large portion of the infrastructure and the remaining organizers' order is randomized.
Buffer overflow vulnerabilities are caused by programming errors that allow an attacker to cause the program to write beyond the bounds of an allocated memory block to corrupt other data structures. The standard way to exploit a buffer overflow vulnerability involves a request that is too large for the buffer intended to hold it. The buffer overflow error causes the program to write part of the request beyond the bounds of the buffer, corrupting the address space of the program and causing the program to execute injected code contained in the request.We have implemented a compiler that inserts dynamic checks into the generated code to detect all out of bounds memory accesses. When it detects an out of bounds write, it stores the value away in a hash table to return as the value for corresponding out of bounds reads. The net effect is to (conceptually) give each allocated memory block unbounded size and to eliminate out of bounds accesses as a programming error.We have acquired several widely used open source servers (Apache, Sendmail, Pine, Mutt, and Midnight Commander). With standard compilers, all of these servers are vulnerable to buffer overflow attacks as documented at security tracking web sites. Our compiler eliminates these security vulnerabilities (as well as other memory errors). Our results show that our compiler enables the servers to execute successfully through buffer overflow attacks to continue to correctly service user requests without security vulnerabilities.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.