The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease.
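The megabase-scale blocks of homozygosity mentioned above can be illustrated with a minimal sketch. It scans one animal's genotypes along a single chromosome and reports runs of consecutive homozygous SNPs spanning at least a chosen length. The function name, the 0/1/2 genotype coding and the per-individual framing are assumptions made for illustration; the study itself scans breeds with several test statistics that are not reproduced here.

```python
def homozygous_blocks(positions, genotypes, min_length=1_000_000):
    """Find runs of consecutive homozygous SNPs on one chromosome.

    positions  -- sorted base-pair coordinates of the SNPs
    genotypes  -- alternate-allele counts (0, 1 or 2); 1 means heterozygous
    min_length -- minimum span in bp for a run to be reported

    Returns a list of (start, end) coordinates. Hypothetical helper,
    not the procedure used in the study.
    """
    blocks = []
    start = prev = None
    for pos, genotype in zip(positions, genotypes):
        if genotype != 1:
            # Homozygous call: open a run if needed, then extend it.
            if start is None:
                start = pos
            prev = pos
        else:
            # Heterozygous call closes the current run.
            if start is not None and prev - start >= min_length:
                blocks.append((start, prev))
            start = None
    # Close a run that reaches the last SNP.
    if start is not None and prev - start >= min_length:
        blocks.append((start, prev))
    return blocks
```

A real scan would also need to handle missing genotypes and uneven SNP density, which this sketch ignores.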
Whole transcriptome sequencing (RNA-seq) has become a standard for cataloguing and monitoring RNA populations. One of the main bottlenecks, however, is to correctly identify the different classes of RNAs among the plethora of reconstructed transcripts, particularly those that will be translated (mRNAs) from the class of long non-coding RNAs (lncRNAs). Here, we present FEELnc (FlExible Extraction of LncRNAs), an alignment-free program that accurately annotates lncRNAs based on a Random Forest model trained with general features such as multi k-mer frequencies and relaxed open reading frames. Benchmarking versus five state-of-the-art tools shows that FEELnc achieves similar or better classification performance on GENCODE and NONCODE data sets. The program also provides specific modules that enable the user to fine-tune classification accuracy, to formalize the annotation of lncRNA classes and to identify lncRNAs even in the absence of a training set of non-coding RNAs. We used FEELnc on a real data set comprising 20 canine RNA-seq samples produced by the European LUPA consortium to substantially expand the canine genome annotation to include 10 374 novel lncRNAs and 58 640 mRNA transcripts. FEELnc moves beyond conventional coding potential classifiers by providing a standardized and complete solution for annotating lncRNAs and is freely available at https://github.com/tderrien/FEELnc.
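The two kinds of features named in the abstract, k-mer frequencies and relaxed open reading frames, can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not FEELnc code: the function names are hypothetical, only the forward strand is scanned, and "relaxed" is taken here to mean that an ORF may run to the end of the transcript without a stop codon.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def kmer_frequencies(seq, k):
    """Relative frequency of every DNA k-mer (A/C/G/T only) in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


def longest_orf_length(seq, require_stop=True):
    """Length in nt of the longest ATG-initiated ORF in the three forward frames.

    With require_stop=False, an ORF that reaches the end of the sequence
    without a stop codon is also accepted (the assumed relaxation).
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # include the stop codon
                start = None
        if start is not None and not require_stop:
            # Count only whole codons up to the end of the sequence.
            best = max(best, (len(seq) - start) // 3 * 3)
    return best
```

Feature vectors built this way for several values of k, together with the ORF length, could be passed to any Random Forest implementation; the training procedure itself is outside this sketch.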
Rett syndrome is caused by mutations in the gene MECP2 in ∼80% of affected individuals. We describe a previously unknown MeCP2 isoform. Mutations unique to this isoform and the absence, until now, of identified mutations specific to the previously recognized protein indicate an important role for the newly discovered molecule in the pathogenesis of Rett syndrome. Rett syndrome (RTT; OMIM 312750) is characterized by onset, in girls, of a gradual slowing of neurodevelopment in the second half of the first year of life that proceeds towards stagnation by age 4 years, followed by regression and loss of acquired fine motor and communication skills. A pseudostationary period follows during which a picture of preserved ambulation, aberrant communication and stereotypic hand wringing approximates early autism. Regression, however, remains insidiously ongoing and ultimately results in profound mental retardation¹. Up to 80% of individuals with RTT have mutations²,³ in exons 3 and 4 of the four-exon gene MECP2 (Fig. 1a)⁴ encoding the transcriptional repressor MeCP2. In the known transcript of the gene, all four exons are used, the translation start site is in exon 2, and exon 1 and most of exon 2 form the 5′ untranslated region (UTR)⁴. For clarity, we refer to this transcript as MECP2A and its encoded protein as MeCP2A. We sought to identify MECP2 splice variants contributing new coding sequence that might contain mutations in the remaining individuals with RTT. Inspection of the 5′ UTR showed that, whereas exon 2 has a number of in-frame stop codons upstream of the ATG start codon, exon 1 contains an open reading frame across its entire length, including an ATG.
Submitting a theoretical construct composed of exons 1, 3 and 4 to the ATGpr program (http://www.hri.co.jp/atgpr/), which predicts the likelihood that an ATG will be an initiation codon based on the significance of its surrounding Kozak nucleotide context, returned a reliability score of 97%, as compared with 64% for MECP2A. A search in EST databases identified eight examples of our theorized transcript, which we named MECP2B (Fig. 1b), as compared with 14 examples of MECP2A. MECP2B is predicted to encode a new isoform, MeCP2B, with an alternative, longer N terminus determined by exon 1 (see Supplementary Table 1 online). To confirm that MECP2B is expressed and not merely an artifact of cDNA library preparation, we amplified cDNA by PCR from a variety of tissues using a 5′ primer in exon 1 and a 3′ primer in exon 3 (Fig. 1a). We obtained two PCR products corresponding in size and sequence to MECP2A and MECP2B in all tissues, including fetal and adult brain and different brain subregions (Fig. 1c). Results in mouse were similar (Fig. 1c). We quantified the expression levels of the two transcripts in adult human brain. Expression of MECP2B was ten times higher than that of MECP2A (Fig. 1d). We studied the subcellular localization of MeCP2B after transfection of 3′ myc-tagged MECP2B into COS-7 cells and found it to be principally in the nucleus (Fig. 1e). To deter...
A second distinct family of anion exchangers, SLC26, in addition to the classical SLC4 (or anion exchanger) family, has recently been delineated. Particular interest in this gene family is stimulated by the fact that the SLC26A2, SLC26A3, and SLC26A4 genes have been recognized as the disease genes mutated in diastrophic dysplasia, congenital chloride diarrhea, and Pendred syndrome, respectively. We report the expansion of the SLC26 gene family by characterizing three novel tissue-specific members, named SLC26A7, SLC26A8, and SLC26A9, on chromosomes 8, 6, and 1, respectively. The SLC26A7-A9 proteins are structurally very similar at the amino acid level to the previous family members and show tissue-specific expression in kidney, testis, and lung, respectively. More detailed characterization by immunohistochemistry and/or in situ hybridization localized SLC26A7 to distal segments of nephrons, SLC26A8 to developing spermatocytes, and SLC26A9 to the lumenal side of the bronchiolar and alveolar epithelium of lung. Expression of SLC26A7-A9 proteins in Xenopus oocytes demonstrated chloride, sulfate, and oxalate transport activity, suggesting that they encode functional anion exchangers. The functional characterization of the novel tissue-specific members may provide new insights into anion transport physiology in different parts of the body. The systematic characterization of gene families using full genome sequences provides a rich source for expanding our physiological understanding of body functions. Recently, a second distinct family of anion exchangers, SLC26, has been delineated. The members of the SLC26 family are structurally well conserved across different species and can mediate the electroneutral exchange of Cl⁻ for HCO₃⁻ across the plasma membrane of mammalian cells like members of the classical SLC4 (anion exchanger) family (1-3).
Specific interest in the SLC26 gene family is stimulated by the fact that the first three human genes are associated with phenotypically distinct recessive diseases. The SLC26A2, SLC26A3, and SLC26A4 genes have been recognized as disease genes mutated in diastrophic dysplasia, congenital chloride diarrhea, and Pendred syndrome, respectively (4-6). Thus, the three closely related but highly tissue-specific human anion transporters play central roles in the etiology of phenotypically very different recessive diseases. In humans, six tissue-specific genes of the SLC26 family have been cloned so far, namely SLC26A1-A4 (previously known as SAT-1, DTDST, CLD or DRA, and PDS, respectively), SLC26A6, and TAT1. The SLC26A2-A4 members have been shown to transport, with different specificities, the chloride, iodide, bicarbonate, oxalate, and hydroxyl anions (7-12). SLC26A5 has been cloned from gerbil and rat and shown to act as a motor protein of cochlear outer hair cells; it is sensitive to intracellular anions but has not been found to act as a transporter (13,14). The SLC26A6 protein is expressed at highest levels in the kidney and the pancreas and suggested SLC26A6 as a candidate for a yet u...
Horses were domesticated from the Eurasian steppes 5,000–6,000 years ago. Since then, the use of horses for transportation, warfare, and agriculture, as well as selection for desired traits and fitness, has resulted in diverse populations distributed across the world, many of which have become or are in the process of becoming formally organized into closed, breeding populations (breeds). This report describes the use of a genome-wide set of autosomal SNPs and 814 horses from 36 breeds to provide the first detailed description of equine breed diversity. FST calculations, parsimony, and distance analysis demonstrated relationships among the breeds that largely reflect geographic origins and known breed histories. Low levels of population divergence were observed between breeds that are relatively early on in the process of breed development, and between those with high levels of within-breed diversity, whether due to large population size, ongoing outcrossing, or large within-breed phenotypic diversity. Populations with low within-breed diversity included those which have experienced population bottlenecks, have been under intense selective pressure, or are closed populations with long breed histories. These results provide new insights into the relationships among and the diversity within breeds of horses. In addition these results will facilitate future genome-wide association studies and investigations into genomic targets of selection.
The origin and evolution of the domestic dog remains a controversial question for the scientific community, with basic aspects such as the place and date of origin, and the number of times dogs were domesticated, open to dispute. Using whole genome sequences from a total of 58 canids (12 gray wolves, 27 primitive dogs from Asia and Africa, and a collection of 19 diverse breeds from across the world), we find that dogs from southern East Asia have significantly higher genetic diversity compared to other populations, and are the most basal group relating to gray wolves, indicating an ancient origin of domestic dogs in southern East Asia 33 000 years ago. Around 15 000 years ago, a subset of ancestral dogs started migrating to the Middle East, Africa and Europe, arriving in Europe about 10 000 years ago. One of the out-of-Asia lineages also migrated back to the east, creating a series of admixed populations with the endemic Asian lineages in northern China before migrating to the New World. For the first time, our study unravels an extraordinary journey that the domestic dog has traveled on earth.
Intense selective pressures applied over short evolutionary time have resulted in homogeneity within, but substantial variation among, horse breeds. Utilizing this population structure, 744 individuals from 33 breeds, and a 54,000 SNP genotyping array, breed-specific targets of selection were identified using an FST-based statistic calculated in 500-kb windows across the genome. A 5.5-Mb region of ECA18, in which the myostatin (MSTN) gene was centered, contained the highest signature of selection in both the Paint and Quarter Horse. Gene sequencing and histological analysis of gluteal muscle biopsies showed a promoter variant and intronic SNP of MSTN were each significantly associated with higher Type 2B and lower Type 1 muscle fiber proportions in the Quarter Horse, demonstrating a functional consequence of selection at this locus. Signatures of selection on ECA23 in all gaited breeds in the sample led to the identification of a shared, 186-kb haplotype including two doublesex related mab transcription factor genes (DMRT2 and 3). The recent identification of a DMRT3 mutation within this haplotype, which appears necessary for the ability to perform alternative gaits, provides further evidence for selection at this locus. Finally, putative loci for the determination of size were identified in the draft breeds and the Miniature horse on ECA11, as well as when signatures of selection surrounding candidate genes at other loci were examined. This work provides further evidence of the importance of MSTN in racing breeds, provides strong evidence for selection upon gait and size, and illustrates the potential for population-based techniques to find genomic regions driving important phenotypes in the modern horse.
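The idea of summarizing differentiation in 500-kb windows can be sketched as below. The abstract does not specify the estimator behind its FST-based statistic, so this example swaps in a basic Wright-style FST, (H_T - H_S) / H_T, for two populations of equal size, combined within each window as a ratio of sums. The function name and the input layout are hypothetical.

```python
from collections import defaultdict


def windowed_fst(snps, window=500_000):
    """Two-population FST per non-overlapping genomic window.

    snps   -- iterable of (chrom, pos, p1, p2), where p1 and p2 are the
              frequencies of the same allele in populations 1 and 2
    window -- window size in bp (500 kb by default)

    Returns {(chrom, window_start): fst}. Equal population weights are
    assumed; this is not the statistic used in the study.
    """
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for chrom, pos, p1, p2 in snps:
        p_bar = (p1 + p2) / 2
        h_total = 2 * p_bar * (1 - p_bar)                     # pooled heterozygosity
        h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
        key = (chrom, pos // window * window)
        numerator[key] += h_total - h_within
        denominator[key] += h_total
    # Windows containing only monomorphic SNPs are skipped.
    return {k: numerator[k] / denominator[k] for k in denominator if denominator[k] > 0}
```

Windows with the highest values would be candidates for follow-up; a breed-specific scan would repeat the comparison for one breed against each of the others, which this sketch leaves out.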
The unusually low 78% amino acid identity between the orthologous human SLC26A6 and mouse slc26a6 polypeptides prompted systematic comparison of their anion transport functions in Xenopus oocytes. Multiple human SLC26A6 variant polypeptides were also functionally compared.