Darwin's finches, inhabiting the Galápagos archipelago and Cocos Island, constitute an iconic model for studies of speciation and adaptive evolution. Here we report the results of whole-genome re-sequencing of 120 individuals representing all of the Darwin's finch species and two close relatives. Phylogenetic analysis reveals important discrepancies with the phenotype-based taxonomy. We find extensive evidence for interspecific gene flow throughout the radiation. Hybridization has given rise to species of mixed ancestry. A 240 kilobase haplotype encompassing the ALX1 gene that encodes a transcription factor affecting craniofacial development is strongly associated with beak shape diversity across Darwin's finch species as well as within the medium ground finch (Geospiza fortis), a species that has undergone rapid evolution of beak shape in response to environmental changes. The ALX1 haplotype has contributed to diversification of beak shapes among the Darwin's finches and, thereby, to an expanded utilization of food resources.
The domestic dog, Canis familiaris, is a well-established model system for mapping trait and disease loci. While the original draft sequence was of good quality, gaps were abundant, particularly in promoter regions of the genome, negatively impacting the annotation and study of candidate genes. Here, we present an improved genome build, canFam3.1, which includes 85 Mb of novel sequence and now covers 99.8% of the euchromatic portion of the genome. We also present multiple RNA-Sequencing data sets from 10 different canine tissues to catalog ∼175,000 expressed loci. While about 90% of the coding genes previously annotated by EnsEMBL have measurable expression in at least one sample, the number of transcript isoforms detected by our data expands the EnsEMBL annotations by a factor of four. Syntenic comparison with the human genome revealed an additional ∼3,000 loci that are characterized as protein coding in human and were also expressed in the dog, suggesting that those were previously not annotated in the EnsEMBL canine gene set. In addition to ∼20,700 high-confidence protein coding loci, we found ∼4,600 antisense transcripts overlapping exons of protein coding genes; ∼7,200 intergenic multi-exon transcripts without coding potential, which are likely candidates for long intergenic non-coding RNAs (lincRNAs); and ∼11,000 transcripts that were reported by two different library construction methods but did not fit any of the above categories. Of the lincRNAs, about 6,000 have no annotated orthologs in human or mouse. Functional analysis of two novel transcripts with shRNA in a mouse kidney cell line altered cell morphology and motility. All in all, we provide a much-improved annotation of the canine genome and suggest regulatory functions for several of the novel non-coding transcripts.
Salamanders exhibit an extraordinary ability among vertebrates to regenerate complex body parts. However, scarce genomic resources have limited our understanding of regeneration in adult salamanders. Here, we present the ~20 Gb genome and transcriptome of the Iberian ribbed newt Pleurodeles waltl, a tractable species suitable for laboratory research. We find that embryonic stem cell-specific miRNAs mir-93b and mir-427/430/302, as well as Harbinger DNA transposons carrying the Myb-like proto-oncogene, have expanded dramatically in the Pleurodeles waltl genome and are co-expressed during limb regeneration. Moreover, we find that a family of salamander methyltransferases is expressed specifically in adult appendages. Using CRISPR/Cas9 technology to perturb transcription factors, we demonstrate that, unlike in the axolotl, Pax3 is present and necessary for development and that, in contrast to mammals, muscle regeneration is normal without a functional Pax7 gene. Our data provide a foundation for comparative genomic studies that generate models for the uneven distribution of regenerative capacities among vertebrates.
Significance: We performed de novo, full-genome sequence analysis of two Populus species, North American quaking and Eurasian trembling aspen, that contain striking levels of genetic variation. Our results showed that positive and negative selection broadly affects patterns of genomic variation, but to varying degrees across coding and noncoding regions. The strength of selection and rates of sequence divergence were strongly related to differences in gene expression and coexpression network connectivity. These results highlight the importance of both positive and negative selection in shaping genome-wide levels of genetic variation in an obligately outcrossing, perennial plant. The resources we present establish aspens as a powerful study system enabling future studies for understanding the genomic determinants of adaptive evolution.
Background: Phenomena such as incomplete lineage sorting, horizontal gene transfer, gene duplication and subsequent sub- and neo-functionalisation can result in distinct local phylogenetic relationships that are discordant with species phylogeny. In order to assess the possible biological roles for these subdivisions, they must first be identified and characterised, preferably on a large scale and in an automated fashion. Results: We developed Saguaro, a combination of a Hidden Markov Model (HMM) and a Self Organising Map (SOM), to characterise local phylogenetic relationships among aligned sequences using cacti, matrices of pair-wise distance measures. While the HMM determines the genomic boundaries from aligned sequences, the SOM hypothesises new cacti in an unsupervised and iterative fashion based on the regions that were modelled least well by existing cacti. After testing the software on simulated data, we demonstrate the utility of Saguaro by testing two different data sets: (i) 181 Dengue virus strains, and (ii) 5 primate genomes. Saguaro identifies regions under lineage-specific constraint for the first set, and genomic segments that we attribute to incomplete lineage sorting in the second dataset. Intriguingly for the primate data, Saguaro also classified an additional ~3% of the genome as most incompatible with the expected species phylogeny. A substantial fraction of these regions was found to overlap genes associated with both the innate and adaptive immune systems. Conclusions: Saguaro detects distinct cacti describing local phylogenetic relationships without requiring any a priori hypotheses. We have successfully demonstrated Saguaro’s utility with two contrasting data sets, one containing many members with short sequences (Dengue viral strains: n = 181, genome size = 10,700 nt), and the other with few members but complex genomes (related primate species: n = 5, genome size = 3 Gb), suggesting that the software is applicable to a wide variety of experimental populations. Saguaro is written in C++, runs on the Linux operating system, and can be downloaded from http://saguarogw.sourceforge.net/.
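To make the notion of a matrix of pair-wise distance measures more concrete, here is a minimal Python sketch that computes one such matrix per fixed-size window of a toy alignment. It is an illustration only and not Saguaro's implementation: the function names (`p_distance`, `distance_matrix`, `windowed_matrices`), the use of a simple proportion-of-differences distance, and the fixed windows are all assumptions made for this example. Saguaro itself determines segment boundaries with an HMM and proposes new matrices with an SOM, neither of which is modelled here.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an 'N' are skipped.
    This distance is an assumption chosen for the illustration.
    """
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else 0.0


def distance_matrix(seqs):
    """Symmetric pair-wise distance matrix as a nested dict keyed by name."""
    names = sorted(seqs)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d = p_distance(seqs[a], seqs[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


def windowed_matrices(seqs, window):
    """One distance matrix per non-overlapping window of the alignment.

    Fixed windows stand in for the HMM-derived boundaries described above.
    """
    length = len(next(iter(seqs.values())))
    result = []
    for start in range(0, length, window):
        chunk = {name: s[start:start + window] for name, s in seqs.items()}
        result.append((start, distance_matrix(chunk)))
    return result


# Hypothetical toy alignment: A and B agree in the first window,
# B and C agree in the second, so the local relationships differ.
alignment = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "TCGTACGA",
}
matrices = windowed_matrices(alignment, window=4)
```

In this toy case the two windows yield different matrices, which is the kind of local discordance the abstract describes; deciding where such regions begin and end is what the HMM handles in the actual software.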
The aim of this cross-sectional study was to show the characteristics of breast cancer across a period of 15 years according to pathological records in Tehran, Iran. In the year 1985, a 20-year study was designed and developed in five major hospitals in Tehran to study the burden and characteristics of breast cancer in Iran. This study is based on the data collected from 1986 through 2000. SPSS version 13 was used for statistical analysis. In this study, 1612 female breast cancer records were reviewed. The mean age of patients was 47.95 ± 12.42 years with a median of 47 years. Over the study period, the proportion of tumors diagnosed at T2 increased with a decline in the proportion of T3 cases. Similarly, the percentage of stage II cases at diagnosis increased, whereas stage III decreased. We detected a decrease in tumor size and downstaging of female breast cancer in Tehran, Iran. This can be attributed to the overall improvement in the level of health in Iran and also educational activities that teach women how to perform breast self-exam and when and why to ask for breast examination.
After performing de novo transcript assembly of >1 billion RNA-Sequencing reads obtained from 22 samples of different Norway spruce (Picea abies) tissues that were not surface sterilized, we found that assembled sequences captured a mix of plant, lichen, and fungal transcripts. The latter were likely expressed by endophytic and epiphytic symbionts, indicating that these organisms were present, alive, and metabolically active. Here, we show that these serendipitously sequenced transcripts need not be considered merely as contamination, as is common, but that they provide insight into the plant’s phyllosphere. Notably, we could classify these transcripts as originating predominantly from Dothideomycetes and Leotiomycetes species, with functional annotation of gene families indicating active growth and metabolism, with particular regard to glucose intake and processing, as well as gene regulation.