2020
DOI: 10.1016/j.ygeno.2020.01.009

Establishment of an eHAP1 human haploid cell line hybrid reference genome assembled from short and long reads

Abstract: Background: Haploid cell lines are a valuable research tool with broad applicability for genetic assays. As such, the fully haploid human cell line eHAP1 has been used in a wide array of studies. However, the absence of a corresponding reference genome sequence for this cell line has limited the potential for more widespread applications to experiments dependent on available sequence, like capture-clone methodologies. Results: We generated ~15x coverage Nanopore long reads from ten GridION flowcells. We utiliz…

Cited by 2 publications (2 citation statements)
References 46 publications (63 reference statements)
“…To assess the expression of specific NFE2L2 transcripts in A549 cell line, we have used the RNA sequencing data from the project available at NCBI Gene Expression Omnibus (GEO) public database (accession numbers GSM2308412, where A549 cell line was sequenced with paired Illumina protocol. Primary analysis of RNA-seq data included the quality control of sequenced reads with the use of FastQC (Andrews, 2010), reads trimming with the usage of Trimmomatic [ 16 ] and mapping to the reference genome based on NCBI reference human genome (assembly GRCh38.p13) and annotation (release 109) [ 17 ] with the Hisat2 aligner [ 18 ]. Further data preparations was performed with SAMtools software [ 19 ] and R software [ 20 ] together with Bioconductor platform.…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…The second dataset used was a WGS Illumina paired-end sequence reads of Drosophila melanogaster with 4 Gb of size per file 65 and accessible in SRA repository with SRR10735526 accession number. Homo sapiens genome 62 3.1 Gb GCA_000001405.28…”
Section: Assembly Quality Test (citation type: mentioning)
confidence: 99%