Linking genomic variation to phenotypical traits remains a major challenge in evolutionary genetics. In this study, we use phylogenomic strategies to investigate a distinctive trait among mammals: the development of masculinizing ovotestes in female moles. By combining a chromosome-scale genome assembly of the Iberian mole, Talpa occidentalis, with transcriptomic, epigenetic, and chromatin interaction datasets, we identify rearrangements altering the regulatory landscape of genes with distinct gonadal expression patterns. These include a tandem triplication involving CYP17A1, a gene controlling androgen synthesis, and an intrachromosomal inversion involving the pro-testicular growth factor gene FGF9, which is heterochronically expressed in mole ovotestes. Transgenic mice with a knock-in mole CYP17A1 enhancer or overexpressing FGF9 showed phenotypes recapitulating mole sexual features. Our results highlight how integrative genomic approaches can reveal the phenotypic impact of noncoding sequence changes.
In recent years, hundreds of novel RNA-binding proteins (RBPs) have been identified, leading to the discovery of novel RNA-binding domains. Furthermore, unstructured or disordered low-complexity regions of RBPs have been identified to play an important role in interactions with nucleic acids. However, these advances in understanding RBPs are limited mainly to eukaryotic species and we only have limited tools to faithfully predict RNA-binders in bacteria. Here, we describe a support vector machine-based method, called TriPepSVM, for the prediction of RNA-binding proteins. TriPepSVM applies string kernels to directly handle protein sequences using tri-peptide frequencies. Testing the method in human and bacteria, we find that several RBP-enriched tri-peptides occur more often in structurally disordered regions of RBPs. TriPepSVM outperforms existing applications, which consider classical structural features of RNA-binding or homology, in the task of RBP prediction in both human and bacteria. Finally, we predict 66 novel RBPs in Salmonella Typhimurium and validate the bacterial proteins ClpX, DnaJ and UbiG to associate with RNA in vivo.
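To make the idea of a string kernel over tri-peptide frequencies concrete, the following is a minimal sketch of a k-spectrum kernel with k = 3. It is an illustration only, not the TriPepSVM implementation: the function names, the choice of the plain spectrum kernel, the normalization, and the toy sequences are all assumptions, and the published method may use a different kernel variant and parameters.

```python
from collections import Counter
from math import sqrt


def tripeptide_counts(sequence, k=3):
    """Count every overlapping k-mer (tri-peptides for k=3) in a protein sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(seq_a, seq_b, k=3):
    """Unnormalized k-spectrum kernel: dot product of the two k-mer count vectors."""
    counts_a = tripeptide_counts(seq_a, k)
    counts_b = tripeptide_counts(seq_b, k)
    # A Counter returns 0 for k-mers it has never seen, so only shared k-mers contribute.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


def normalized_spectrum_kernel(seq_a, seq_b, k=3):
    """Cosine-normalized kernel, so that sequence length does not dominate similarity."""
    denom = sqrt(spectrum_kernel(seq_a, seq_a, k) * spectrum_kernel(seq_b, seq_b, k))
    return spectrum_kernel(seq_a, seq_b, k) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical toy sequences, chosen only to show the mechanics.
    print(tripeptide_counts("RGGRGG"))
    print(spectrum_kernel("RGGRGG", "RGG"))
    print(normalized_spectrum_kernel("RGGRGG", "AKKAKK"))
```

A kernel function of this kind can be evaluated on all pairs of training sequences to build a Gram matrix, which a support vector machine that accepts precomputed kernels can then use to separate RNA-binding from non-binding proteins.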
Exome sequencing has introduced a paradigm shift for the identification of germline variations responsible for Mendelian diseases. However, non‐coding regions, which make up 98% of the genome, cannot be captured. The lack of functional annotation for intronic and intergenic variants makes RNA‐seq a powerful companion diagnostic. Here, we illustrate this point by identifying six patients with a recessive Osteogenesis Imperfecta (OI) and neonatal progeria syndrome. By integrating homozygosity mapping and RNA‐seq, we delineated a deep intronic TAPT1 mutation (c.1237‐52 G>A) that segregated with the disease. Using SI‐NET‐seq, we document that TAPT1's nascent transcription was not affected in patients' fibroblasts, indicating instead that this variant leads to an alteration of pre‐mRNA processing. Predicted to serve as an alternative splicing branchpoint, this mutation enhances TAPT1 exon 12 skipping, creating a protein‐null allele. Additionally, our study reveals dysregulation of pathways involved in collagen and extracellular matrix biology in disease‐relevant cells. Overall, our work highlights the power of transcriptomic approaches in deciphering the repercussions of non‐coding variants, as well as in illuminating the molecular mechanisms of human diseases.