2014
DOI: 10.1186/1471-2105-15-149

FIGG: Simulating populations of whole genome sequences for heterogeneous data analyses

Abstract: Background: High-throughput sequencing has become one of the primary tools for investigation of the molecular basis of disease. The increasing use of sequencing in investigations that aim to understand both individuals and populations is challenging our ability to develop analysis tools that scale with the data. This issue is of particular concern in studies that exhibit a wide degree of heterogeneity or deviation from the standard reference genome. The advent of population scale sequencing studies requires anal…

Cited by 9 publications
(8 citation statements)
References 30 publications
“…Hundreds of genome simulation methods and tools have been developed (Peng et al, 2018), which can be divided into three broad groups: (1) coalescent simulators for population genomes evolving under particular evolutionary models (Carvajal-Rodríguez, 2008), such as GENOME (Liang et al, 2007), GeneEvolve (Tahmasbi and Keller, 2017), and SFS_CODE (Uricchio et al, 2015); (2) simulation tools for case-control GWAS data, such as simGWA (Yang and Gu, 2013), simGWAS (Fortune and Wallace, 2018), GWAsimulator (Li and Li, 2007), and TraidSim (Shi et al, 2018); and (3) simulators for various types of genome variants and sequences, such as FIGG, simuG, VST, VarSim, Xome-Blender, and SVEngine. FIGG generates large numbers of whole genomes with known sequence characteristics based on the direct sampling of experimentally known or theorized variations (Killcoyne and del Sol, 2014). simuG simulates SNPs, Indels, CNVs, Inversions, and Translocations for different organisms (Yue and Liti, 2019).…”
Section: Introduction (mentioning)
confidence: 99%
“…A third option is simulating the experiment in silico (7,8). For example, a computer text file representing a mutated genome can be created and WGS reads can be simulated and analyzed in the same manner as real reads to predict, for instance, whether a certain read depth (RD) will be suitable for a particular purpose.…”
Section: Introduction (mentioning)
confidence: 99%
“…The rapid development of novel software tools has increasingly facilitated the simulation of Next Generation Sequencing (NGS) experiments (reviewed in 17). Simulated experiments performed to date include those simulating DNA structural variation (18), RNA-sequencing (RNA-seq) differential-expression studies (19,20), bisulfite sequencing (21), studies based on tumor sequencing data (22) and data from Quantitative Trait Loci (QTL) analysis (23), sequencing of heterogeneous populations (7), and de novo genome assembly (8). A combination of simulations and pilot experiments can also be performed.…”
Section: Introduction (mentioning)
confidence: 99%
“…Most of these articles are also in the Hadoop search results discussed above. We found four HBase articles; of which two [40,42] describe software that used HBase as a storage backend for sequencing data. The remaining two mention HBase in related work.…”
Section: Articles About Specific Data-intensive Computing Systems (mentioning)
confidence: 99%
“…We examined 15 articles published between November 2013 and November 2014. Of these, eight described software run on a single server or a desktop computer, six used a cluster for either file storage or as a computational resource, and one system [40] used data-intensive computing techniques. These results indicate that data-intensive computing systems are not widely used for biological data analysis infrastructures.…”
Section: Data-intensive Computing Articles In BMC Bioinformatics (mentioning)
confidence: 99%