2021
DOI: 10.1101/2021.05.30.446360
Preprint

A generative nonparametric Bayesian model for whole genomes

Abstract: Generative probabilistic modeling of biological sequences has widespread existing and potential use across biology and biomedicine, particularly given advances in high-throughput sequencing, synthesis and editing. However, we still lack methods with nucleotide resolution that are tractable at the scale of whole genomes and that can achieve high predictive accuracy either in theory or practice. In this article we propose a new generative sequence model, the Bayesian embedded autoregressive (BEAR) model, which u…

Cited by 3 publications (12 citation statements) · References 52 publications (88 reference statements)
“…We embedded the AR model (a convolutional neural network) into a BEAR model, and fit the BEAR model with empirical Bayes. We found evidence that the AR model was misspecified on every dataset, following the methodology of Amin, Weinstein and Marks [2]: the optimal h selected by empirical Bayes was on the order of 1–10 in each dataset. Now, in the limit as the hyperparameter h → 0, the BEAR model collapses to its embedded AR model; so by scanning h from low to high values we can interpolate between the parametric and nonparametric regimes.…”
Section: Results
Confidence: 99%
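The interpolation this citation describes can be sketched with a conjugate Dirichlet–multinomial update, where the prior concentration is the AR model's prediction scaled by 1/h. This is a minimal illustrative sketch, not code from the paper; the function name, toy counts, and uniform AR prediction below are assumptions.

```python
import numpy as np

def bear_posterior_mean(counts, ar_probs, h):
    """Posterior mean next-nucleotide distribution for one k-mer context.

    Assumes a Dirichlet prior with concentration ar_probs / h; by
    conjugacy with the multinomial counts, the posterior mean is the
    normalized sum of prior concentration and observed counts.
    """
    alpha = ar_probs / h + counts
    return alpha / alpha.sum()

# Toy data: observed A/C/G/T counts following some k-mer context,
# and a (hypothetical) uniform prediction from the embedded AR model.
counts = np.array([8.0, 1.0, 1.0, 0.0])
ar_probs = np.array([0.25, 0.25, 0.25, 0.25])

# h -> 0: the posterior collapses to the AR model's prediction.
small_h = bear_posterior_mean(counts, ar_probs, 1e-8)
# h -> infinity: the posterior approaches the empirical frequencies.
large_h = bear_posterior_mean(counts, ar_probs, 1e8)
```

Scanning h between these extremes traces out the parametric-to-nonparametric interpolation described in the quote: small h trusts the neural AR model, large h trusts the raw k-mer counts.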
“…To accomplish this, we compute a posterior over p0 using a Bayesian nonparametric sequence model. In particular, we apply the Bayesian embedded autoregressive (BEAR) model, which can be scaled to terabytes of data and satisfies posterior consistency ([2], Thm. 35):…”
Section: Diagnostic Methods
Confidence: 99%
“…Moreover, VI is also used in building deep generative models [15,19]. However, contrary to Bayesian phylogenetic inference frameworks, most evolutionary-oriented deep generative models do not explicitly consider the underlying evolutionary dynamics of the biological sequences [1,22,26].…”
Section: Introduction
Confidence: 99%