Ramón Viñas scite author profile

Motivation: High-throughput gene expression data can be used to address a wide range of fundamental biological problems, but datasets of an appropriate size are often unavailable. Moreover, existing transcriptomics simulators have been criticised because they fail to emulate key properties of gene expression data. In this paper, we develop a method based on a conditional generative adversarial network to generate realistic transcriptomics data for E. coli and humans. We assess the performance of our approach across several tissues and cancer types.

Results: We show that our model preserves several gene expression properties significantly better than widely used simulators such as SynTReN or GeneNetWeaver. The synthetic data preserves tissue- and cancer-specific properties of transcriptomics data. Moreover, it exhibits real gene clusters and ontologies at both local and global scales, suggesting that the model learns to approximate the gene expression manifold in a biologically meaningful way.

Availability: Code is available at https://github.com/rvinas/adversarial-gene-expression

Supplementary information: Supplementary data are available at Bioinformatics online.

Deep Learning Enables Fast and Accurate Imputation of Gene Expression

Viñas, Azevedo, Gamazon et al. 2021, Front. Genet.

A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we propose two novel deep learning methods, PMI and GAIN-GTEx, for gene expression imputation. To increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We show that our approaches compare favorably to several standard and state-of-the-art imputation methods in terms of predictive performance and runtime in two case studies and two imputation scenarios. In a comparison conducted on protein-coding genes, PMI attains the highest performance in inductive imputation, whereas GAIN-GTEx outperforms the other methods in in-place imputation. Furthermore, our results indicate strong generalization on RNA-Seq data from three cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.

Adversarial generation of gene expression data

Viñas, Andrés-Terré, Lió et al. 2019, Preprint

The problem of reverse engineering gene regulatory networks from high-throughput expression data is one of the biggest challenges in bioinformatics. To benchmark network inference algorithms, simulators of well-characterized expression datasets are often required. However, existing simulators have been criticized because they fail to emulate key properties of gene expression data. In this study we address two problems. First, we propose mechanisms to faithfully assess the realism of a synthetic gene expression dataset. Second, we design an adversarial simulator of expression data, gGAN, based on a Generative Adversarial Network. We show that our model outperforms existing simulators by a large margin, achieving realism scores that are up to 17 times higher than those of GeneNetWeaver and SynTReN. More importantly, our results show that gGAN is, to the best of our knowledge, the first simulator that passes the Turing test for gene expression data proposed by Maier et al. (2013).

Single Nucleotide Polymorphism relevance learning with Random Forests for Type 2 diabetes risk prediction
