2020
DOI: 10.1186/s13293-020-00312-9

Reference genome and transcriptome informed by the sex chromosome complement of the sample increase ability to detect sex differences in gene expression from RNA-Seq data

Abstract: Background: Human X and Y chromosomes share an evolutionary origin and, as a consequence, sequence similarity. We investigated whether the sequence homology between the X and Y chromosomes affects the alignment of RNA-Seq reads and estimates of differential expression. We tested the effects of using reference genomes and reference transcriptomes informed by the sex chromosome complement of the sample's genome on the measurements of RNA-Seq abundance and sex differences in expression. Results: The default genom…

Cited by 33 publications (29 citation statements)
References 66 publications (113 reference statements)
“…Filtered reads were mapped using HISAT2 (v2.1.0) (84) to the C. purpureus R40 genome (autosomes and V sex chromosome) concatenated with the GG1 U sex chromosome. We hard masked the U chromosome for males, and the V for females, using Bedtools (v2.27.1) (76) maskfasta (85). Genes greater than 300 bp were assembled using StringTie (v1.3.3) (86), gene counts were extracted using StringTie’s prepDE.py script (http://ccb.jhu.edu/software/stringtie/index.shtml?t=manual#deseq), and gene IDs renamed (using mstrg_prep.pl, https://gist.github.com/gpertea/b83f1b32435e166afa92a2d388527f4b).…”
Section: Methods (mentioning)
confidence: 99%
“…This results in sequencing data that contains poor quality mapping of similar regions between the sex chromosomes, and spurious reads mapped to the Y chromosome in samples from XX genomes. A recent study showed that accounting for these artifacts in sequence mapping protocols can improve variant calling (153) and detection of sex differential gene expression (154). XYalign is a sex-informed sequence alignment method that first identifies whether the sequencing reads derive from an XX or XY genome based on read balance, and then aligns the sequencing reads to a sex-appropriate reference genome (153).…”
Section: Incorporating Sex In Genomics Research (mentioning)
confidence: 99%
“…Short sequencing reads from the X and Y chromosomes may mismap due to the high degree of sequence homology. To account for this mismapping, female samples were mapped to the GENCODE GRCh38 reference genome with the Y-chromosome hard-masked, and male samples were mapped to the GRCh38 reference genome with the Y-chromosomal pseudo-autosomal regions hard-masked [19, 20]. Raw read counts were summed across all transcripts of the same gene to give a gene-level expression, and genes were only included if they had a known gene-level annotation.…”
Section: Methods (mentioning)
confidence: 99%
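
The hard-masking step described in the citation statements above can be illustrated with a minimal Python sketch. It is not the code used by any of the cited studies (one of them reports using Bedtools maskfasta); it only shows the idea of replacing a region of a reference sequence with N so that reads cannot align there. The function name `hard_mask`, the toy sequences, and the toy "PAR" interval are all hypothetical; real pseudo-autosomal coordinates would have to come from the reference genome's documentation.

```python
def hard_mask(records, regions):
    """Return a copy of `records` with the given regions replaced by 'N'.

    records: dict mapping contig name -> sequence string
    regions: iterable of (contig, start, end), 0-based, end-exclusive
    """
    masked = dict(records)
    for contig, start, end in regions:
        seq = masked[contig]
        # Clamp the interval to the contig so out-of-range input is harmless.
        start = max(0, start)
        end = min(len(seq), end)
        if start < end:
            masked[contig] = seq[:start] + "N" * (end - start) + seq[end:]
    return masked


# Toy reference (hypothetical sequences, not real chromosome data).
reference = {"chrX": "ACGTACGT", "chrY": "ACGTACGT"}

# XX sample: mask the whole Y chromosome.
xx_reference = hard_mask(reference, [("chrY", 0, len(reference["chrY"]))])

# XY sample: mask only a pseudo-autosomal interval on Y.
# The interval (0, 3) is a made-up placeholder, not a real PAR coordinate.
xy_reference = hard_mask(reference, [("chrY", 0, 3)])

print(xx_reference["chrY"])
print(xy_reference["chrY"])
```

Each sample would then be aligned to the reference that matches its sex chromosome complement, so that reads from an XX sample cannot be placed on Y, and reads from the pseudo-autosomal regions of an XY sample have a single target on X.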