Here we present GRIDSS2, a general purpose structural variant caller optimised for tumour/normal somatic calling. Using cell line and patient sample validation together with cohort-level comparisons, we show that GRIDSS2 outperforms recent state-of-the-art tools. We demonstrate that GRIDSS2 retains high sensitivity and precision even for small events by identifying, in 3,782 metastatic cancers deeply sequenced by the Hartwig Medical Foundation, a small (32-100 bp) duplication signature strongly associated with colorectal cancer. Essential to the high precision achieved by GRIDSS2 is the novel reporting of single breakend variants: structural variants in which only one side can be unambiguously determined. We show that the inclusion of single breakends reduces the false negative rate from 10.4% to 3.4%. Demonstrating the power of single breakend calling in genomic regions traditionally considered inaccessible to short read callers, we find that 47% of somatic centromeric breaks are repaired to non-centromeric sequence, with chromosome 1 exhibiting a unique centromeric rearrangement signature. Finally, we show that somatic structural variants are highly clustered, with GRIDSS2 able to phase 16% of somatic structural variants in the Hartwig cohort from short read sequencing alone.