2019
DOI: 10.1146/annurev-biodatasci-072018-021339

Molecular Heterogeneity in Large-Scale Biological Data: Techniques and Applications

Abstract: High-throughput sequencing technologies have evolved at a stellar pace for almost a decade and have greatly advanced our understanding of genome biology. In these sampling-based technologies, there is an important detail that is often overlooked in the analysis of the data and the design of the experiments, specifically that the sampled observations often do not give a representative picture of the underlying population. This has long been recognized as a problem in statistical ecology and in the broader stati…

Cited by 5 publications (5 citation statements)
References 148 publications (131 reference statements)
“…In this paper, we present an overview of BNP inference for SSPs under the popular PYP prior. Motivated by the work of Deng et al [2019], we focus on SSPs corresponding to the aforementioned questions Q1, Q2 and Q3, respectively, which have been proved to be of practical interest also beyond biological sciences. Regarding Q1, we consider the estimation of coverage probabilities, which include the missing mass and the coverage probability of order r ≥ 1, namely the probability mass of species observed with frequency r in the sample.…”
Section: Our Contributions (mentioning)
confidence: 99%
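
The coverage quantities named in the statement above can be made concrete with a small sketch. It does not reproduce the BNP/PYP approach of the citing paper; it swaps in the classical Good-Turing estimator, which estimates the probability mass of species seen exactly r times as (r + 1) · n_{r+1} / n, where n_{r+1} is the number of species observed r + 1 times and n is the sample size (r = 0 gives the missing mass). The function name `good_turing_coverage` and the toy sample are hypothetical, chosen only for illustration.

```python
from collections import Counter


def good_turing_coverage(sample, r=0):
    """Good-Turing estimate of the total probability mass of species
    observed exactly r times in the sample (r = 0 is the missing mass).

    Illustrative substitute for the Bayesian nonparametric estimators
    discussed in the citing paper, not a reimplementation of them.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    species_counts = Counter(sample)                   # species -> frequency
    freq_of_freqs = Counter(species_counts.values())   # frequency -> number of species
    return (r + 1) * freq_of_freqs.get(r + 1, 0) / n


# Toy sample (hypothetical): species "a" seen 3 times, "b" twice, "c" and "d" once.
sample = ["a", "a", "a", "b", "b", "c", "d"]
missing_mass = good_turing_coverage(sample, r=0)   # uses the two singletons
order_1 = good_turing_coverage(sample, r=1)        # uses the one doubleton
print(missing_mass, order_1)
```

Summed over all r ≥ 0, these estimates add up to one, so they can be read as a redistribution of the sample's total mass across frequency classes.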
“…In genomics data, they appear in relation to coverage depth, i.e. the average number of reads that are aligned to known reference bases [Deng et al, 2019]. Depending on the specific application, different levels of coverage might be required.…”
Section: Coverages Of Prevalences (mentioning)
confidence: 99%
“…missing mass, discovery probabilities, unseen species with prevalences and coverages of prevalence. We refer to Deng et al (2019) and Balocchi et al (2022) for up-to-date reviews of SSPs, both in methods and applications, mostly in the field of biological sciences but also in machine learning, electrical engineering, theoretical computer science and information theory.…”
Section: Introduction (mentioning)
confidence: 99%