Meiotic crossovers (COs) ensure proper chromosome segregation and redistribute the genetic variation that is transmitted to the next generation. Existing methods for CO identification are challenged by large populations and the demand for genome-wide and fine-scale resolution. Taking advantage of linked-read sequencing, we developed a highly efficient method for genome-wide identification of COs at kilobase resolution in pooled recombinants. We first tested this method using a pool of Arabidopsis F2 recombinants, and obtained results that recapitulated those identified from the same plants using individual whole-genome sequencing. By applying this method to a pool of pollen DNA from a single F1 plant, we established a highly accurate CO landscape without generating or sequencing a single recombinant plant. The simplicity of this approach now enables the simultaneous generation and analysis of multiple CO landscapes and thereby allows for efficient comparison of genotypic and environmental effects on recombination, accelerating the pace at which the mechanisms for the regulation of recombination can be elucidated.
CO detection using linked-read sequencing
RESULTS
CO breakpoint detection from bulk recombinants
To establish a set of COs for verifying our method, we first performed whole-genome sequencing of 50 individual F2 plants derived from a cross of two of the best-studied inbred lab strains of Arabidopsis, Col-0 and Ler, and used the haplotype reconstruction software TIGER 33 to determine a benchmark set of 400 COs across all 50 genomes (Fig. 1; Materials and Methods; Supplementary Tables 1 and 2).
We then bulked the same 50 F2 plants by pooling individual leaves of comparable size and extracting high molecular weight (HMW) DNA 37 (Fig. 1). After size selection and quality control (Supplementary Fig. 1), we loaded 0.25 ng DNA into a 10X Genomics Chromium Controller. The Chromium Controller encapsulates millions of gel beads as GEMs (Gel bead in EMulsion), each of which is loaded with a small number of long DNA molecules. These long molecules are fragmented and ligated with GEM-specific DNA barcodes to generate a 10X library suitable for Illumina sequencing. This library, which we called P50L25, was whole-genome sequenced with 84 million 151 bp read pairs (Supplementary Table 1).
After aligning the reads against the Col-0 reference sequence 38 using longranger (v2.2.2, 10X Genomics), we used a newly developed computational tool, DrLink (available at https://github.com/schneebergerlab/DrLink), to recover 3.6 million molecules (≥1 kb) comprising 116 million reads (Fig. 2a; Materials and Methods). On average, the molecules identified by DrLink were ~45 kb in size and were covered by ~21 read pairs, corresponding to a molecule base coverage of ~0.16x (Fig. 2b-d; Supplementary Table 1). To avoid chimeras resulting from the accidental co-occurrence of two independent but closely spaced molecules with identical barcodes, we selected molecules which were small...
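The molecule recovery step described above can be illustrated with a minimal Python sketch. It is not DrLink's implementation: it only shows one common way to group barcoded, aligned reads into candidate molecules and to compute a per-molecule base coverage. The `Read` and `Molecule` types, the function `reconstruct_molecules`, and the `max_gap` threshold of 25 kb are hypothetical choices made for illustration; only the 1 kb minimum molecule size is taken from the text.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    barcode: str  # GEM-specific barcode attached to the read
    chrom: str    # reference sequence the read aligned to
    start: int    # 0-based leftmost aligned position
    length: int   # aligned read length in bp


class Molecule(NamedTuple):
    barcode: str
    chrom: str
    start: int
    end: int
    n_reads: int
    read_bases: int

    @property
    def span(self) -> int:
        """Length of the inferred molecule in bp."""
        return self.end - self.start

    @property
    def base_coverage(self) -> float:
        """Sequenced bases divided by the inferred molecule length."""
        return self.read_bases / self.span


def _make_molecule(barcode, chrom, reads):
    start = min(r.start for r in reads)
    end = max(r.start + r.length for r in reads)
    return Molecule(barcode, chrom, start, end,
                    len(reads), sum(r.length for r in reads))


def reconstruct_molecules(reads, max_gap=25_000, min_span=1_000):
    """Group reads that share a barcode and chromosome into molecules.

    A new molecule is started whenever the distance between the end of the
    current molecule and the next read exceeds `max_gap` (an assumed value).
    Molecules shorter than `min_span` are discarded.
    """
    groups = defaultdict(list)
    for read in reads:
        groups[(read.barcode, read.chrom)].append(read)

    molecules = []
    for (barcode, chrom), group in groups.items():
        group.sort(key=lambda r: r.start)
        current = [group[0]]
        current_end = group[0].start + group[0].length
        for read in group[1:]:
            if read.start - current_end > max_gap:
                # Gap too large: close the current molecule, start a new one.
                molecules.append(_make_molecule(barcode, chrom, current))
                current = [read]
                current_end = read.start + read.length
            else:
                current.append(read)
                current_end = max(current_end, read.start + read.length)
        molecules.append(_make_molecule(barcode, chrom, current))

    return [m for m in molecules if m.span >= min_span]
```

Under this definition, a 45 kb molecule covered by 21 pairs of 151 bp reads would have a base coverage of about 21 × 2 × 151 / 45,000 ≈ 0.14x, which is in the same range as the average of ~0.16x reported above. The two figures need not match exactly, since an average of per-molecule ratios generally differs from the ratio of the averages.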