2021
DOI: 10.3389/fgene.2021.766405

tascCODA: Bayesian Tree-Aggregated Analysis of Compositional Amplicon and Single-Cell Data

Abstract: Accurate generative statistical modeling of count data is of critical relevance for the analysis of biological datasets from high-throughput sequencing technologies. Important instances include the modeling of microbiome compositions from amplicon sequencing surveys and the analysis of cell type compositions derived from single-cell RNA sequencing. Microbial and cell type abundance data share remarkably similar statistical features, including their inherent compositionality and a natural hierarchical ordering …

Citations: cited by 5 publications (3 citation statements)
References: 70 publications (103 reference statements)
“…Univariate statistical models, which analyse change in abundance for each cell type individually, such as Poisson regression or Wilcoxon rank-sum tests, may perceive some cell-type population shifts as statistically sound effects, although they are purely a statistical artefact caused by the compositionality of the data 108, leading to an elevated FDR. Tests specifically designed for single-cell data that make use of cell-type counts include scDC 109, scCODA 108 and tascCODA, which can incorporate hierarchical cell-type information 110.…”
Section: Transcriptome (mentioning)
confidence: 99%
“…We will explore such modifications in future studies. Moreover, while we chose the Negative Binomial model as base distribution for the most abundant taxa, the variational formulation lends itself to other statistical models for microbial count data, including zero-inflated or hurdle-type extensions of the Negative Binomial model [19] or the Dirichlet-Multinomial model [30, 53]. Finally, in its current state, VI-MIDAS is built on Stan [12] with tailored Python code for optimization, model selection, and analysis.…”
Section: Discussion (mentioning)
confidence: 99%
“…There exists another class of methods that fit regression models to compositional data for the covariates using a tree-guided regularization in a maximum likelihood (trac - [Bien et al, 2021]) or Bayesian setting (tascCODA - [Ostner et al, 2021]). For trac, the parsimony of feature selection (where more nodes at greater heights result in a lesser number of total nodes) depends on the weight assigned to regularization during model fitting.…”
Section: Introduction (mentioning)
confidence: 99%