Background
Copy number variation (CNV) is a common form of genetic variation underlying animal evolution and phenotypic diversity across a wide range of species. In mammalian genomes, CNVs that differ strongly in frequency between breeds may be candidates for population-specific selection. However, CNV differentiation, selection and population genetics have been poorly explored in horses.
Results
We investigated the patterns, population variation and gene annotation of CNVs using the Axiom® Equine Genotyping Array (670,796 SNPs) in a large cohort (N = 1,755) from eight European horse breeds, ranging from draught horses to several warmblood populations. After quality control, we identified 152,640 SNP CNVs (individual markers), 18,800 segment CNVs (consecutive SNP CNVs of the same gain/loss state or both) and 939 CNV regions (CNVRs; segment CNVs overlapping by at least 1 bp), all called relative to the average signal of the reference. Equus caballus chromosome 12 (ECA12) was the most enriched in segment CNV gains and losses (~3% average proportion of the genome covered), whereas the highest numbers of segment CNVs, regardless of size, were detected on ECA1 and ECA20. Friesian horses showed a high percentage of unique SNP CNV gains on ECA1 (>20% of the samples), and Exmoor ponies displayed a high percentage of unique SNP CNV losses on ECA25 (>20% of the samples). CNVR length ranged from 1 kb to 21.3 Mb. A total of 10,612 genes were annotated within the CNVRs. PANTHER annotation of these genes showed significantly under- and overrepresented gene ontology biological terms related to cellular processes and immunity (Bonferroni-corrected P < 0.05). We identified 80 CNVRs overlapping known QTL for fertility, coat colour, conformation and temperament. We also report 67 novel CNVRs, which add to the catalogue of known CNVs in the horse genome.
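The CNVR definition used above (segment CNVs that overlap by at least 1 bp are merged into one region) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the function name, the tuple layout, the 1-based inclusive coordinates, the "gain"/"loss"/"both" labels and the example coordinates are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_segments_into_cnvrs(segments):
    """Merge segment CNVs sharing at least 1 bp into CNV regions (CNVRs).

    `segments` is an iterable of (chrom, start, end, state) tuples.
    Assumptions of this sketch: coordinates are 1-based and inclusive,
    and `state` is either "gain" or "loss".
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, state in segments:
        by_chrom[chrom].append((start, end, state))

    cnvrs = []
    for chrom, segs in by_chrom.items():
        segs.sort()  # order by start position, then end
        cur_start, cur_end, cur_states = segs[0][0], segs[0][1], {segs[0][2]}
        for start, end, state in segs[1:]:
            if start <= cur_end:
                # Shares at least 1 bp with the current region: extend it.
                cur_end = max(cur_end, end)
                cur_states.add(state)
            else:
                # No shared base: close the current region, start a new one.
                cnvrs.append((chrom, cur_start, cur_end, _label(cur_states)))
                cur_start, cur_end, cur_states = start, end, {state}
        cnvrs.append((chrom, cur_start, cur_end, _label(cur_states)))
    return cnvrs


def _label(states):
    """Label a region "both" when it contains gain and loss segments."""
    return "both" if len(states) > 1 else next(iter(states))


# Hypothetical segment CNVs (coordinates invented for illustration only).
example_segments = [
    ("ECA1", 100, 200, "gain"),
    ("ECA1", 200, 300, "loss"),   # shares base 200 with the previous segment
    ("ECA1", 301, 400, "gain"),   # adjacent but not overlapping
    ("ECA12", 50, 80, "loss"),
]
example_cnvrs = merge_segments_into_cnvrs(example_segments)
```

In this toy input, the first two ECA1 segments share a single base and therefore collapse into one region of mixed state, while the third segment is only adjacent and stays separate.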
Conclusions
This work revealed that CNVs in the genomes of some European horse breeds occur in specific genomic regions and are enriched on ECA1, 7, 9 and 25. The results support the hypothesis that high-frequency, breed-specific CNVs residing in genes may be responsible for the diverse phenotypes seen between horse breeds.