A useful definition of 'big data' is data that is too big to process comfortably on a single machine, either because of processor, memory, or disk bottlenecks. Graphics processing units can alleviate the processor bottleneck, but memory or disk bottlenecks can only be eliminated by splitting data across multiple machines. Communication between large numbers of machines is expensive (regardless of the amount of data being communicated), so there is a need for algorithms that perform distributed approximate Bayesian analyses with minimal communication. Consensus Monte Carlo operates by running a separate Monte Carlo algorithm on each machine, and then averaging individual Monte Carlo draws across machines. Depending on the model, the resulting draws can be nearly indistinguishable from the draws that would have been obtained by running a single-machine algorithm for a very long time. Examples of consensus Monte Carlo are shown for simple models where single-machine solutions are available, for large single-layer hierarchical models, and for Bayesian additive regression trees (BART).
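The averaging step can be illustrated with a toy model. The following is a minimal sketch, not the authors' implementation: it assumes a Gaussian mean with known unit variance and a flat prior, so that each shard's posterior can be sampled directly, and it assumes the draws are combined with weights equal to the inverse sample variance of each shard's draws. The function name `consensus_draws` and all settings are hypothetical.

```python
import numpy as np

def consensus_draws(shard_draws):
    """Combine per-shard draws by a precision-weighted average.

    shard_draws has shape (S, G): G scalar draws from each of S shards.
    The inverse-variance weights are an assumption of this sketch.
    """
    shard_draws = np.asarray(shard_draws, dtype=float)
    weights = 1.0 / shard_draws.var(axis=1, ddof=1)  # one weight per shard
    # The g-th consensus draw is the weighted average of the g-th draw of every shard.
    return (weights[:, None] * shard_draws).sum(axis=0) / weights.sum()

rng = np.random.default_rng(0)
y = rng.normal(loc=3.0, scale=1.0, size=10_000)  # simulated data set
shards = np.array_split(y, 10)                   # pretend each shard lives on its own machine

# With a flat prior and unit variance, the shard posterior of the mean is
# N(mean(y_s), 1 / n_s); each "machine" samples it independently.
G = 5_000
shard_draws = np.stack(
    [rng.normal(s.mean(), 1.0 / np.sqrt(len(s)), size=G) for s in shards]
)
combined = consensus_draws(shard_draws)

# The single-machine posterior is N(mean(y), 1 / n); compare moments.
print(combined.mean(), y.mean())
print(combined.std(), 1.0 / np.sqrt(len(y)))
```

In this Gaussian case the combined draws should match the single-machine posterior closely, because a weighted average of independent Gaussian draws is again Gaussian. For non-Gaussian posteriors the match is only approximate, consistent with the "depending on the model" caveat above.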
Methods of approximate Bayesian computation (ABC) are increasingly used for analysis of complex models. A major challenge for ABC is overcoming the often inherent problem of high rejection rates in the accept/reject methods based on prior:predictive sampling. A number of recent developments aim to address this with extensions based on sequential Monte Carlo (SMC) strategies. We build on this here, introducing an ABC SMC method that uses data-based adaptive weights. This easily implemented and computationally trivial extension of ABC SMC can very substantially improve acceptance rates, as is demonstrated in a series of examples with simulated and real data sets, including a currently topical example from dynamic modelling in systems biology applications.
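The rejection problem that motivates SMC extensions can be seen in the basic accept/reject sampler. The sketch below is plain ABC rejection, not the adaptive-weight ABC SMC method described above. The prior, the toy simulator, the summary statistic, and the tolerance are all assumptions chosen for illustration.

```python
import numpy as np

def abc_rejection(observed_summary, n_proposals, tolerance, rng):
    """Basic ABC accept/reject.

    Draw theta from the prior, simulate a data set for each theta, and keep
    theta when the simulated summary lies within `tolerance` of the observed one.
    Returns the accepted draws and the acceptance rate.
    """
    theta = rng.normal(0.0, 10.0, size=n_proposals)  # assumed prior: N(0, 10^2)
    # Assumed toy simulator: 20 observations from N(theta, 1) per proposal.
    simulated = rng.normal(theta[:, None], 1.0, size=(n_proposals, 20))
    summary = simulated.mean(axis=1)                 # summary statistic: sample mean
    keep = np.abs(summary - observed_summary) <= tolerance
    return theta[keep], keep.mean()

rng = np.random.default_rng(1)
accepted, rate = abc_rejection(
    observed_summary=2.0, n_proposals=200_000, tolerance=0.1, rng=rng
)
print(f"accepted {accepted.size} of 200000 proposals (rate {rate:.4f})")
```

Because proposals come from a diffuse prior, most simulated data sets fall far from the observed summary and are discarded. Sequential schemes aim to reduce this waste by proposing from successively better approximations to the posterior.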
In studies of dynamic molecular networks in systems biology, experiments are increasingly exploiting technologies such as flow cytometry to generate data on marginal distributions of a few network nodes at snapshots in time. For example, levels of intracellular expression of a few genes, or cell surface protein markers, can be assayed at a series of interim time points and assumed steady states under experimentally stimulated growth conditions in small cellular systems. Such marginal data on a small number of cellular markers will typically carry very limited information on the parameters and structure of dynamic network models, though experiments will typically be designed to expose variation in cellular phenotypes that are inherently related to some aspects of model parametrization and structure. Our work addresses statistical questions of how to integrate such data with dynamic stochastic models in order to properly quantify the information (or lack of information) it carries relative to models assumed. We present a Bayesian computational strategy coupled with a novel approach to summarizing and numerically characterizing biological phenotypes that are represented in terms of the resulting sample distributions of cellular markers. We build on Bayesian simulation methods and mixture modeling to define the approach to linking mechanistic mathematical models of network dynamics to snapshot data, using a toggle switch example integrating simulated and real data as context.
Abstract. In research situations usually approached through Decision Theory, only a single researcher is considered, who collects a sample and makes a decision based on it. It can be shown that randomization of the sample does not improve the utility of the obtained results. Nevertheless, we present situations in which this approach is not satisfactory. First, we present a case in which randomization can be an important tool for achieving agreement between people with different opinions. Next, we present another situation in which there are two agents: the researcher, who collects the sample, and the decision-maker, who makes decisions based on the collected sample. We show that problems emerge when the decision-maker allows the researcher to choose the sample arbitrarily. We also show that the decision-maker maximizes his expected utility by requiring that the sample be collected randomly.
First, I thank my parents, Edna and Flavio, for their constant and unconditional support in everything I have achieved. I thank my family as a whole for forming a solid foundation that I proudly take as a reference in my personal and professional life. Besides my parents, this group includes my grandparents, uncles and aunts, cousins, my goddaughter Isabela, my sister Camila, and my godmother Yone. I sincerely thank the friends who stayed close and gave me the strength and inspiration to overcome the challenges of this period: my "jacobiano" brothers (Davi, Denis, Emerson, Hommenig, Luis, and Paulo), my great friend Bia (TFPOTW), and my friends at IME-USP (undergraduate and Master's classmates, the staff of the Seção de Alunos, CPG, Sec-Mae, Sandrão, Sylvia from CEA, ...). I am grateful to the IME-USP professors who were most closely involved in my undergraduate education: Beti Kira, Fabio Prates, Denise Botter, Mônica Sandoval, and Luis Gustavo. For my mathematical education in elementary school, I thank Professora Fátima. I thank the professors on my committee, Dani Gamerman and Claudia Peixoto, for the corrections they suggested for this work. The same thanks go to Professors (and uncles) Marcelo and Márcio. I must also thank three people who were fundamental to my path through the Master's program: Professors Carlinhos and Julio Stern, and my colleague Rafael Bassi. I thank the professors for their guidance, friendship, and support. I thank Rafael for his great friendship and his brilliant contribution to the work connected with the Master's degree. Finally, I give special thanks to my mentor, Professor Sergio Wechsler, who, besides being an excellent advisor for the project, was a great friend who always offered me valuable advice, teaching, and inspiration.
ABSTRACT. The law of maturity is the belief that less-observed events are becoming mature and, therefore, more likely to occur in the future. Previous studies have shown that the assumption of infinite exchangeability contradicts the law of maturity. In particular, it has been shown that infinite exchangeability contradicts probabilistic descriptions of the law of maturity such as the gambler's belief and the belief in maturity. We show that the weaker assumption of finite exchangeability is compatible with both the gambler's belief and belief in maturity. We provide sufficient conditions under which these beliefs hold under finite exchangeability. These conditions are illustrated with commonly used parametric models.
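A standard finitely exchangeable model gives a feel for why such beliefs are possible. The sketch below is an illustration under assumptions, not one of the paper's examples or its formal definitions: it uses draws without replacement from an urn of known composition, and the function name `prob_next_red` and the urn sizes are hypothetical.

```python
from fractions import Fraction

def prob_next_red(total, red, n_drawn, k_red_seen):
    """P(next ball is red | k_red_seen reds among the first n_drawn draws),
    drawing without replacement from an urn with `red` red balls out of `total`."""
    return Fraction(red - k_red_seen, total - n_drawn)

# Urn with 10 balls, 5 of them red; 4 balls have been drawn so far.
for k in range(5):
    print(k, prob_next_red(total=10, red=5, n_drawn=4, k_red_seen=k))
```

The fewer reds observed so far, the higher the probability that the next ball is red, which is the kind of maturity-style reasoning described above. Under infinite exchangeability, by contrast, the earlier results cited in the abstract rule this behavior out.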