2021
DOI: 10.1371/journal.pgen.1009303

Creating artificial human genomes using generative neural networks

Abstract: Generative models have shown breakthroughs in a wide spectrum of domains due to recent advancements in machine learning algorithms and increased computational power. Despite these impressive achievements, the ability of generative models to create realistic synthetic data is still under-exploited in genetics and absent from population genetics. Yet a known limitation in the field is the reduced access to many genetic databases due to concerns about violations of individual privacy, although they would provide …

Cited by 77 publications (142 citation statements)
References 50 publications
“…The use of GANs in population genetics is just beginning. Recently, Yelmen et al (2021) created a GAN that generates artificial genomes that mirror the properties of real genomes. Their approach does not include an evolutionary model, so the resulting artificial genomes are ‘unlabelled’.…”
Section: Introduction (mentioning)
confidence: 99%
“…For instance, when playing the Atari game, the computer or agent playing this game was positively rewarded when the outcome of the game was positive based on the actions performed. The algorithm was able to learn some of the games to a level where it performed better than humans (Patterson & Gibson, 2017 … (Nielsen & Voigt, 2018); sequences (Linder et al, 2019; Liu et al, 2020; Yelmen et al, 2021); single-cell RNA sequencing data (Grønbech et al, 2020; Liu et al, 2020; Marouf et al, 2020); protein sequences (Repecka et al, 2021; Sinai et al, 2017); promoter sequences (Y, ; high-resolution Hi-C data (Hong et al, 2020; Liu et al, 2019b; Liu et al, 2020); among others. Also, VAE in GS has been applied to visualize population structure (Battey et al, 2021).…”
Section: Why Can Research In DL Be Applied To GS? (mentioning)
confidence: 99%
“…These methods are very efficient at creating fake images (or text) that, for humans, are identical to the real ones. In biology, these methods are being used to generate artificial genomes; fake DNA, such as microbial genomes (Nielsen & Voigt, 2018); sequences (Linder et al, 2019; Liu et al, 2020; Yelmen et al, 2021); single-cell RNA sequencing data (Grønbech et al, 2020; Liu et al, 2020; Marouf et al, 2020); protein sequences (Repecka et al, 2021; Sinai et al, 2017); promoter sequences (Y, ; high-resolution Hi-C data (Hong et al, 2020; Liu et al, 2019b; Liu et al, 2020); among others. Also, VAE in GS has been applied to visualize population structure (Battey et al, 2021).…”
Section: Why Can Research In DL Be Applied To GS? (mentioning)
confidence: 99%
“…1)). In this regard, researchers have utilized the concept of generative modeling, applying for example Restricted Boltzmann Machines (RBMs) (Hinton et al, 2006 [43]) or Generative Adversarial Networks (GANs) (Goodfellow et al, 2014 [35]; Yelmen et al, 2021 [81]). Though not with the goal to create entire synthetic datasets, variational autoencoders (VAEs) (Kingma and Welling, 2013 [46]) have been used to impute missing counts in single-cell expression data (Eraslan et al, 2019 [25]; Qiu et al, 2020 [61]) as well as in bulk-RNA sequencing and methylome data (Qiu et al, 2020 [61]).…”
Section: Current Research and Future Vision (mentioning)
confidence: 99%