2021
DOI: 10.1101/2021.05.30.446350
Preprint

The genetic and epigenetic landscape of the Arabidopsis centromeres

Abstract: Centromeres attach chromosomes to spindle microtubules during cell division and, despite this conserved role, show paradoxically rapid evolution and are typified by complex repeats. We used ultra-long-read sequencing to generate the Col-CEN Arabidopsis thaliana genome assembly that resolves all five centromeres. The centromeres consist of megabase-scale tandemly repeated satellite arrays, which support high CENH3 occupancy and are densely DNA methylated, with satellite variants private to each chromosome. CENH…

citations
Cited by 37 publications
(82 citation statements)
references
References 86 publications
“…Arabidopsis thaliana [74,75], Beta vulgaris [76] and Ensete glaucum [26]), but we found no equivalent tandem repeat in ALO. However, often in genomes with centromeric satellite sequences, abundant families of retroelements are also found at the centromeres, such as the Nanica LINE of Musa acuminata [77] and E. glaucum [26], Arabidopsis retroelement domains [67] or the wheat Quinta and other elements [32,78].…”
Section: Discussion; citation type: contrasting
confidence: 60%
“…53 One successful example for ONT DDS application is to profile centromeric DNA sequence and the related epigenetic traits in Arabidopsis. 125 Centromeric DNA comprises satellite DNA with a 100- to 200-bp repeat sequence. Satellite DNA is highly variable in sequence and length among different species; therefore, it is very challenging to assemble centromeric DNA with traditional sequencing technologies.…”
Section: ONT DDS and Its Applications; citation type: mentioning
confidence: 99%
“…Although the Arabidopsis thaliana genome was sequenced in 2000, until recently the centromeric DNA reference was assembled with ONT DDS ultra-long-reads. 125 It clearly shows that there are 66,129 centromeric satellite repeats with an approximately 180-bp sequence in the five chromosomal centromeric regions, and each chromosome possesses largely private satellite variants, higher-order CEN180 repeats are prevalent within centromeres. The investigators also found that ATHILA LTR retrotransposons interrupted centromeric genetic and epigenetic organization by invading the satellite array in centromeres.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…It is very difficult with conventional sequencing technologies to accurately assemble regions with long repeats that maintain high sequence identity among copies. More recent efforts to generate complete Arabidopsis chromosomal assemblies leveraged advances in long-read sequencing technologies (Naish, et al 2021; Wang, et al 2021), including PacBio HiFi, which can produce reads over 15 kb in length with >99% accuracy. These studies were successful in spanning highly repetitive centromere regions, and they both extended the coverage of the Chromosome 2 numt.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…However, these assemblies differed in multiple regions of the genome (Rabanal, et al 2022), including major disagreements in the length and nucleotide sequence of this numt. The Col-CEN (Naish, et al 2021) and Col-XJTU (Wang, et al 2021) assemblies reported lengths of 370 kb and 641 kb, respectively, and their alignable regions differed by 109 single-nucleotide variants (SNVs), 18 indels, and one 4-bp microinversion even though they were both derived from Arabidopsis Col-0 ecotypes.…”
Section: Introduction; citation type: mentioning
confidence: 99%