2022
DOI: 10.1186/s12859-022-04895-5

Transforming L1000 profiles to RNA-seq-like profiles with deep learning

Abstract: The L1000 technology, a cost-effective high-throughput transcriptomics technology, has been applied to profile a collection of human cell lines for their gene expression response to > 30,000 chemical and genetic perturbations. In total, there are currently over 3 million available L1000 profiles. Such a dataset is invaluable for the discovery of drug and target candidates and for inferring mechanisms of action for small molecules. The L1000 assay only measures the mRNA expression of 978 landmark genes while…

Citations: cited by 12 publications (10 citation statements)
References: 42 publications
“…Subsequently, Chen et al. (2016) and Jeon et al. (2022) proposed neural learning approaches using the dimensionality reduction permitted by the L1000 genes to achieve gene expression inference via multilayer perceptrons (MLPs) architectures.…”
Section: Diffusion-based Generation Methods (mentioning)
confidence: 97%
“…Building upon the methodology used to deploy and assess data augmentation with VAEs and GANs (Lacan et al, 2023), the proposed contribution is a gene expression generation pipeline leveraging the power of diffusion models. The critical issues related to the high dimensionality of transcriptomics data are handled by considering the so-called 1,000 landmark genes (Subramanian et al, 2017): the data generation is conducted in this reduced 1,000-dimension space, and the generated samples are mapped to the remaining genes space using a trained linear or non-linear approach (Chen et al, 2016; Jeon et al, 2022). To the best of our knowledge, this is the first success in applying the famed diffusion models on bulk RNA data.…”
Section: Introduction (mentioning)
confidence: 99%
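The two-stage idea described in the statement above (work in the landmark-gene space, then map samples to the remaining genes with a trained model) can be illustrated with a minimal sketch. This is not the cited authors' pipeline: it assumes a plain least-squares linear map, uses synthetic data with toy dimensions, and the names `fit_linear_map` and `infer_targets` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumption): real data would have ~1,000 landmark genes
# and many thousands of remaining genes.
n_samples, n_landmark, n_target = 200, 50, 120

# Synthetic training data: target genes are a noisy linear function of landmarks.
true_W = rng.normal(size=(n_landmark, n_target))
X_train = rng.normal(size=(n_samples, n_landmark))
Y_train = X_train @ true_W + 0.01 * rng.normal(size=(n_samples, n_target))


def fit_linear_map(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W (the extra column is an intercept)."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W


def infer_targets(X, W):
    """Map landmark-space samples to the remaining-gene space."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    return X1 @ W


W = fit_linear_map(X_train, Y_train)

# Stand-in for samples produced by a generative model in landmark space.
X_generated = rng.normal(size=(5, n_landmark))
Y_inferred = infer_targets(X_generated, W)

# Full profile = landmark values followed by inferred values.
full_profiles = np.hstack([X_generated, Y_inferred])
```

A non-linear variant would replace `fit_linear_map` with a trained neural network such as an MLP, as the quoted statements mention; the surrounding steps would stay the same.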
“…The top 500 genes show the highest correlation with drug response were selected as features for training the machine learning model. Landmark-1000 (L1000) gene set is known to be reproducible and capable of inferring expression levels of the majority of other genes (Subramanian et al 2017, Jeon et al 2022). This gene set is frequently used for characterizing biological samples (Malta et al 2018, Wan et al 2020) and machine learning-based drug response prediction (Gardiner et al 2020, Lu et al 2021, Uner et al 2023).…”
Section: Methods (mentioning)
confidence: 99%
“…Enrichment analysis for GO biological processes with differentially expressed proteins (FDR < 0.05, logFC > 0.58) was done utilizing the R package clusterProfiler. Drug prediction was done utilizing the LINCS L1000 characteristic direction signatures search engine (https://maayanlab.cloud/L1000CDS2/#/index) with upregulated and downregulated proteins as input (52).…”
Section: Methods (mentioning)
confidence: 99%