2022
DOI: 10.1186/s12859-022-05087-x

EndHiC: assemble large contigs into chromosome-level scaffolds using the Hi-C links from contig ends

Abstract: Background: The application of PacBio HiFi and ultra-long ONT reads has enabled huge progress in contig-level assembly, but it is still challenging to assemble large contigs into chromosomes with available Hi-C scaffolding tools, which count Hi-C links between contigs using the whole or a large part of contig regions. As the Hi-C links of two adjacent contigs concentrate only at the neighboring ends of the contigs, larger contig size will reduce the power to differentiate adjacent (signal) and…
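The idea in the abstract, counting only the Hi-C links that fall near contig ends rather than across whole contigs, can be illustrated with a minimal Python sketch. This is not EndHiC's implementation: the function names `end_of` and `count_end_links`, the link tuple layout, the toy coordinates, and the 1 Mb end size are all assumptions made for illustration.

```python
from collections import Counter


def end_of(pos, length, end_size):
    """Return 'head' or 'tail' if pos lies within end_size of a contig end, else None.

    For contigs shorter than 2 * end_size the head region takes priority.
    """
    if pos < end_size:
        return "head"
    if pos >= length - end_size:
        return "tail"
    return None


def count_end_links(links, lengths, end_size):
    """Count inter-contig links whose two anchors both fall in contig end regions."""
    counts = Counter()
    for ctg_a, pos_a, ctg_b, pos_b in links:
        if ctg_a == ctg_b:
            continue  # intra-contig links carry no scaffolding information
        end_a = end_of(pos_a, lengths[ctg_a], end_size)
        end_b = end_of(pos_b, lengths[ctg_b], end_size)
        if end_a is None or end_b is None:
            continue  # at least one anchor is in a contig interior: ignored
        # Sort so that (A, B) and (B, A) links share one key.
        key = tuple(sorted([(ctg_a, end_a), (ctg_b, end_b)]))
        counts[key] += 1
    return counts


# Hypothetical toy data: contig lengths in bp and Hi-C link anchor positions.
lengths = {"ctgA": 10_000_000, "ctgB": 8_000_000, "ctgC": 12_000_000}
links = [
    ("ctgA", 9_900_000, "ctgB", 50_000),
    ("ctgA", 9_500_000, "ctgB", 400_000),
    ("ctgB", 300_000, "ctgA", 9_200_000),
    ("ctgA", 5_000_000, "ctgC", 6_000_000),  # interior to interior
    ("ctgB", 7_800_000, "ctgC", 100_000),
    ("ctgA", 100, "ctgA", 9_999_000),        # same contig
]
end_size = 1_000_000  # assumed end-region size

counts = count_end_links(links, lengths, end_size)
for pair, n in counts.most_common():
    print(pair, n)
```

In this toy input the tail of `ctgA` and the head of `ctgB` share the most end links, which is the kind of signal that would suggest the two contigs are adjacent in that orientation. The interior-to-interior link is discarded, so it cannot dilute the end signal.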

Cited by 12 publications (4 citation statements)
References: 20 publications
“…The assembly size is similar to the estimated genome size, suggesting high completeness of the genome assembly. Then, 85.2% of these contigs were further anchored into nine pseudochromosomes by EndHiC [24], with scaffold N50 size of 610.7 Mb (Table 1, Supplementary Tables S2 and S3, Supplementary Fig. S3).…”
Section: Results; citation type: mentioning
confidence: 99%
“…To obtain the nine pseudochromosome scaffolds of G. coronaria through proximity ligation data, we mapped the quality-filtered Hi-C sequencing reads to the reference genome to generate the valid Hi-C contact matrixes among contig bins (size 100,000 bp) using HiC-Pro version 3.1.0 [23]. Then, the contigs >1,000,000 bp were assembled into chromosome-level scaffolds based on the Hi-C linkage information among contig ends, using EndHiC version 1.0 [24] in multi-round mode and manual correction of mis-joined scaffolds according to Hi-C heatmap.…”
Section: Methods; citation type: mentioning
confidence: 99%
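The two numeric settings quoted in the statement above (100,000 bp bins and a contig cutoff of more than 1,000,000 bp) can be made concrete with a short sketch. It only shows how a contig length maps to fixed-size bins and which bins would sit at each end. The helper names, the `n_end_bins` parameter, and the contig lengths are hypothetical, and the sketch does not reproduce the file formats or logic of HiC-Pro or EndHiC.

```python
BIN_SIZE = 100_000          # bin size quoted in the citation statement
MIN_CONTIG_LEN = 1_000_000  # contig length cutoff quoted in the citation statement


def n_bins(length, bin_size=BIN_SIZE):
    """Number of fixed-size bins needed to cover a contig (last bin may be partial)."""
    return -(-length // bin_size)  # ceiling division


def end_bins(length, n_end_bins, bin_size=BIN_SIZE):
    """Return (head_bins, tail_bins) as 0-based bin indices for one contig.

    n_end_bins is an assumed parameter: how many bins count as a contig 'end'.
    """
    total = n_bins(length, bin_size)
    k = min(n_end_bins, total)
    return list(range(k)), list(range(total - k, total))


# Hypothetical contig lengths in bp.
contig_lengths = {"ctg1": 2_350_000, "ctg2": 800_000, "ctg3": 1_000_001}

# Keep only contigs strictly longer than the cutoff, as in ">1,000,000 bp".
selected = {
    name: length
    for name, length in contig_lengths.items()
    if length > MIN_CONTIG_LEN
}

for name, length in selected.items():
    head, tail = end_bins(length, n_end_bins=3)
    print(name, n_bins(length), head, tail)
```

Contigs below the cutoff (here `ctg2`) are simply left out of scaffolding in this sketch. How the cited study handled them is not stated in the excerpt.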
“…De novo assembly of two haplotype genomes based on PacBio long reads and Hi‐C data was performed using hifiasm v0.16.1‐r375 (Cheng et al., 2021) with the default parameters. These two haplotypes were first assembled at the chromosome level using ALLHiC v0.9.13 (Zhang, et al., 2019) and EndHiC v1.0 (Wang et al., 2022), but both were found to be poorly assembled. So that we aligned the two haplotypes derived from hifiasm using Minimap2 v2.17‐r941 (Li, 2018), then extracted the corresponding contigs and sorted into the chromosome‐level haplotype genomes.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…The pseudo-chromosomes were constructed using Hi-C sequencing data. We used HiC-Pro v2.11.4 (Servant et al., 2015) pipeline to perform the Hi-C reads mapping, valid ligation pairs detection, and the Hi-C link matrixes generation, which were further used for pseudo-chromosomes construction by EndHiC v1.0 (Wang et al., 2022). The completeness of the genome assembly was estimated using BUSCO v5.2.2 (BUSCO) (Simao et al., 2015) against embryophyta_odb10 database.…”
Section: Methods; citation type: mentioning
confidence: 99%