2022
DOI: 10.1093/bioinformatics/btac561

GTFtools: a software package for analyzing various features of gene models

Abstract: Motivation: Gene-centric bioinformatics studies frequently involve calculation or extraction of various features of genes such as splice sites, promoters, independent introns, and untranslated regions (UTRs) through manipulation of gene models. Gene models are often annotated in gene transfer format (GTF) files. The features are essential for subsequent analysis such as intron retention detection, DNA-binding site identification, and computing splicing strength of splice sites. Some features s…

Citations: cited by 14 publications (9 citation statements)
References: 8 publications
“…2014), and then aligned sequences to GRCz11 as above. We used GTFtools (Li et al. 2022) to analyze Ens92 gene models and used the length of merged exons of isoforms of each gene to calculate TPM (transcripts per kilobase of gene length per million total reads) for all protein-coding genes in each library.…”
Section: Methods (mentioning)
confidence: 99%
“…These libraries did not have molecular barcodes, so we removed the adapters with Cutadapt (Martin 2011), quality trimmed with Trimmomatic (Bolger et al 2014), and then aligned to GRCz11 as above. We used GTFtools (Li et al 2022) to analyze Ens92 gene models and used the length of merged exons of isoforms of each gene to calculate TPM (transcripts per kilobase of gene length per million total reads) for all protein-coding genes in each library. The GRCz11 repeat annotation was obtained from UCSC (Navarro Gonzalez et al 2021).…”
Section: Methods (mentioning)
confidence: 99%
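
The "length of merged exons of isoforms of each gene" in the statement above is the size of the union of all exon intervals across a gene's isoforms, so overlapping exons are counted once. The following is a minimal sketch of that calculation, assuming 1-based inclusive coordinates as used in GTF files. The function name `merged_exon_length` and the example coordinates are hypothetical, and this is not GTFtools' own code.

```python
def merged_exon_length(exons):
    """Return the total length of the union of exon intervals.

    `exons` is an iterable of (start, end) pairs on one chromosome,
    1-based and inclusive (the GTF convention). Overlapping or directly
    adjacent exons from different isoforms are merged before counting.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(exons):
        if cur_end is None or start > cur_end + 1:
            # Gap before this exon: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlaps or touches the current block: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


# Hypothetical gene with two overlapping exons and one separate exon.
example_exons = [(100, 200), (150, 300), (400, 450)]
print(merged_exon_length(example_exons))  # 252 = (300-100+1) + (450-400+1)
```

Summing the raw exon lengths here would give 303 bp, which double-counts the 51 bp shared by the first two exons. That double-counting is what merging avoids.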
“…Given the large number of accessible peaks and overlapping genes (127,602 peaks and 20,248 genes), we devised an approach to reduce and identify the peaks that could regulate their gene targets expression. For gene expression, we use an approach applied previously [85] to calculate transcript per million (TPM) values by first normalizing the transcript count by the gene length, as calculated using GTFtools [86], followed by the library size, as calculated using QualiMap (v 2.2.1) [83], and then carrying out a log2 (x +1) transformation of 1) each genes TPM in each replicate; and 2) the mean TPM for each gene across biological replicates. Each genes mean TPM across the three replicates is used to assess correlations with ATAC signal whereas replicate specific TPM are used to study any differences between replicates.…”
Section: Methods (mentioning)
confidence: 99%
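
The normalization order described in the statement above (gene length first, then library-level scaling, then a log2(x + 1) transform) can be sketched as follows. This uses the standard TPM definition, in which the scaling factor is the sum of the length-normalized counts. The citing paper takes library size from QualiMap, which this sketch does not reproduce. The function names, gene names, counts, and lengths are all hypothetical.

```python
import math


def tpm(counts, lengths_bp):
    """Standard TPM: per-kilobase rates rescaled to sum to one million.

    `counts` maps gene -> read count; `lengths_bp` maps gene -> gene length
    in base pairs (for example, a merged exon length).
    """
    # Step 1: normalize each count by gene length in kilobases.
    rates = {g: counts[g] / (lengths_bp[g] / 1000) for g in counts}
    # Step 2: rescale so the values in one library sum to one million.
    scale = sum(rates.values())
    if scale == 0:
        return {g: 0.0 for g in rates}
    return {g: r / scale * 1e6 for g, r in rates.items()}


def log2_plus_one(values):
    """Apply the log2(x + 1) transform to every value."""
    return {g: math.log2(v + 1) for g, v in values.items()}


# Hypothetical library with two genes.
counts = {"geneA": 10, "geneB": 20}
lengths_bp = {"geneA": 1000, "geneB": 2000}

library_tpm = tpm(counts, lengths_bp)
print(library_tpm)  # both genes get 500000.0: same reads per kilobase
print(log2_plus_one(library_tpm))
```

In this example geneB has twice the reads but is also twice as long, so both genes end up with the same TPM. This is why the gene length supplied to the calculation matters.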