2016
DOI: 10.1098/rsos.160275

Size distribution of function-based human gene sets and the split–merge model

Abstract: The sizes of paralogues, gene families produced by ancestral duplication, are known to follow a power-law distribution. We examine the size distribution of gene sets or gene families where genes are grouped by a similar function or share a common property. The size distribution of Human Gene Nomenclature Committee (HGNC) gene sets deviates from the power-law, and can be fitted much better by a beta rank function. We propose a simple mechanism to break a power-law size distribution by a combination of splitting an…

Cited by 8 publications (3 citation statements)
References 95 publications (124 reference statements)
“…Since the introduction of the two-parameter Discrete Generalized Beta Distribution (DGBD) (or Beta-like Rank Function or Cocho Rank Function) [1,2], a wide range of real-life data have been successfully fitted by this function [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]. Two questions naturally arise: first, what's the corresponding probability density function (pdf) of the DGBD?…”
Section: Introduction (mentioning)
confidence: 99%
“…[34–36]) which have parameters that influence which sizes of modules they can detect (as discussed, for example, in [37]). Finally, as discussed in [38], knowing the module size distribution can improve the null models used for gene set enrichment analyses. We believe these advantages to hold also in the particular case of modules being studied in their capacity as building blocks.…”
Section: Introduction (mentioning)
confidence: 99%
“…To model the formation of SAUs, we implement a version of the split–merge process, a computational mechanism in which administrative units are created and destroyed by joining and dividing them, thereby emulating the role of governments delineating internal boundaries. This process was originally proposed in [45].…”
Section: Introduction (mentioning)
confidence: 99%