2022
DOI: 10.1038/s41467-022-32358-1

Splicing QTL analysis focusing on coding sequences reveals mechanisms for disease susceptibility loci

Abstract: Splicing quantitative trait loci (sQTLs) are one of the major causal mechanisms in genome-wide association study (GWAS) loci, but their role in disease pathogenesis is poorly understood. One reason is the complexity of alternative splicing events producing many unknown isoforms. Here, we propose two approaches, namely integration and selection, for this complexity by focusing on protein-structure of isoforms. First, we integrate isoforms with the same coding sequence (CDS) and identify 369-601 integrated-isofo…

Cited by 29 publications (28 citation statements)
References 55 publications
“…Genetic variants (single nucleotide polymorphisms [SNPs]) could be regulators of the identified ES events. In particular, SNPs within ES exons and their flanking introns can change splicing patterns by remodeling the binding affinities of splicing factors; such variants are termed sQTLs [31]. Through integrative analysis of WGS and RNA-Seq, we revealed 418 significant sQTLs associated with total of 44 ES events (44% of the total set) observed from frontal and temporal regions (Supplementary Table 3).…”
Section: Results; citation type: mentioning
confidence: 99%
“…The obtained fastq files were aligned to the GRCh38 primary assembly using minimap2 v2.17 with reference to the splice junctions in the GENCODE38 annotation. We used the flair pipeline (70) to identify the full-length of the novel transcripts and filtered them using the following criteria: 1) isoforms expressing more than 50 reads in total, 2) isoforms whose 5′ end was located within 100 bp from the FANTOM CAGE peak (TSS peak based on a relaxed 0.14 threshold by TSS classifier), 3) isoforms whose 3′ end is located within 100 bp from the TES of PolyASite2.0, and 4) isoforms evaluated as protein coding isoforms by CPAT v3.0.4 (coding probability ≥ 0.364) (71).…”
Section: Methods; citation type: mentioning
confidence: 99%
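The four filtering criteria quoted in the Methods statement above can be read as a simple per-isoform predicate. The following is only a minimal sketch of that reading, not the citing authors' code: the `Isoform` record, the function names, and the simplification to integer coordinates on a single chromosome and strand are all assumptions made for illustration. Only the thresholds (more than 50 reads, 100 bp, coding probability ≥ 0.364) come from the quoted text.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Isoform:
    # Hypothetical record; field names are illustrative, not from FLAIR or CPAT output.
    name: str
    total_reads: int      # reads supporting the isoform, summed over samples
    five_prime: int       # 5' end coordinate (single chromosome/strand assumed)
    three_prime: int      # 3' end coordinate
    coding_prob: float    # coding probability, e.g. as reported by CPAT


def nearest_distance(pos, sorted_sites):
    """Distance from pos to the closest site in an ascending list; inf if empty."""
    if not sorted_sites:
        return float("inf")
    i = bisect_left(sorted_sites, pos)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_sites[max(i - 1, 0): i + 1]
    return min(abs(pos - s) for s in candidates)


def passes_filters(iso, cage_peaks, tes_sites,
                   min_reads=50, max_dist=100, min_coding_prob=0.364):
    """Apply the four quoted criteria; site lists must be sorted ascending."""
    return (
        iso.total_reads > min_reads                                     # 1) more than 50 reads
        and nearest_distance(iso.five_prime, cage_peaks) <= max_dist    # 2) 5' end near a CAGE peak
        and nearest_distance(iso.three_prime, tes_sites) <= max_dist    # 3) 3' end near a TES
        and iso.coding_prob >= min_coding_prob                          # 4) predicted protein coding
    )


if __name__ == "__main__":
    cage = [1000, 5000]   # made-up peak positions
    tes = [3000, 9000]    # made-up transcript end sites
    isoforms = [
        Isoform("iso_a", 120, 1040, 2950, 0.90),
        Isoform("iso_b", 50, 1000, 3000, 0.90),
    ]
    kept = [iso.name for iso in isoforms if passes_filters(iso, cage, tes)]
    print(kept)
```

A real implementation would additionally have to match chromosome and strand before comparing coordinates, which this sketch leaves out.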
“…Inaccurate isoform quantification is partially caused by incomplete reference datasets we use for isoform quantification [48]. For example, some disease-causing isoforms have incomplete coding sequences in the GENCODE annotation [49]. Furthermore, even if all constituent exons are identified, complete isoform reconstruction from short-read data remains challenging [50].…”
Section: Introduction; citation type: mentioning
confidence: 99%