Plants are the world's most consumed goods. They are of high economic value and bring many health benefits. In most countries in Africa, the supply and quality of food will rise to meet the growing population's increasing demand. Genomics and other biotechnology tools offer the opportunity to improve subsistence crops and medicinal herbs on the continent. Significant advances have been made in plant genomics, which have enhanced our knowledge of the molecular processes underlying both plant quality and yield. The sequencing of complex genomes of African plant species, facilitated by continuously evolving next-generation sequencing technologies and advanced bioinformatics approaches, has provided new opportunities for crop improvement. This review summarizes the achievements of genome sequencing projects of endemic African plants in the last two decades. We also present perspectives and challenges for future plant genomic studies that will accelerate important plant breeding programs for African communities. These challenges include a lack of basic facilities, a lack of sequencing and bioinformatics facilities, and a lack of skills to design genomics studies. However, it is imperative to state that African countries have become key players in the plant genome revolution and genome-derived biotechnology. Therefore, African governments should invest in public plant genomics research and applications, establish bioinformatics platforms and training programs, and stimulate university and industry partnerships to fully deploy plant genomics, particularly in the fields of agriculture and medicine.
Tandem repeats (TRs) represent one of the largest sources of genetic variation in humans and are implicated in a range of phenotypes. Here we present a deep characterization of TR variation based on high-coverage whole-genome sequencing of 3,550 diverse individuals from the 1000 Genomes Project and H3Africa cohorts. We develop a method, EnsembleTR, to integrate genotypes from four separate methods, resulting in high-quality genotypes at more than 1.7 million TR loci. Our catalog reveals novel sequence features influencing TR heterozygosity, identifies population-specific trinucleotide expansions, and finds hundreds of novel eQTL signals. Finally, we generate a phased haplotype panel that can be used to impute most TRs from nearby single nucleotide polymorphisms (SNPs) with high accuracy. Overall, the TR genotypes and reference haplotype panel generated here will serve as valuable resources for future genome-wide and population-wide studies of TRs and their role in human phenotypes.
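As a small illustration of the heterozygosity measure mentioned above, the following sketch computes expected heterozygosity at a single TR locus as one minus the sum of squared allele frequencies. This is the standard population-genetics definition and is an assumption here; the abstract does not state which estimator the study used. The function name and the allele values are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(allele_lengths):
    """Return 1 - sum(p_i^2), where p_i is the frequency of each distinct allele.

    `allele_lengths` is a flat list of repeat copy numbers, one entry per
    observed allele (two per diploid individual). Hypothetical helper, not
    taken from the study described above.
    """
    counts = Counter(allele_lengths)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles provided")
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


# Hypothetical locus: four observed alleles, two distinct repeat lengths.
alleles = [10, 10, 12, 12]
h = expected_heterozygosity(alleles)
print(h)
```

A locus with a single allele gives a value of 0, and the value approaches 1 as the number of equally frequent alleles grows.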
Genome-wide association studies (GWAS) provide a wealth of information on statistically significant single-nucleotide polymorphisms (SNPs) associated with various complex human traits and diseases. Using GWAS, scientists have successfully identified associations of hundreds of thousands to millions of SNPs with a single phenotype. Moreover, the association of some SNPs with rare diseases has been intensively tested. However, classic GWAS have not yet provided solid insight into the functional and biological mechanisms underlying phenotypes or diseases. Therefore, several post-GWAS (pGWAS) methods have been recommended. Currently, there is no simple scientific document that provides a quick guide to performing pGWAS analysis. pGWAS is a crucial step toward a better understanding of the biological machinery beyond the SNPs. Here, we provide an overview of pGWAS analysis and describe the challenges behind each method. Furthermore, we direct readers to key articles for each pGWAS method and present the overall issues in pGWAS analysis. Finally, we include a custom pGWAS pipeline to guide new users when performing their research.
Polygenic Risk Score (PRS) analysis is a method that predicts the genetic risk of an individual for targeted traits. Even when there are no significant markers, it gives evidence of a genetic effect beyond the results of Genome-Wide Association Studies (GWAS). Moreover, it selects single nucleotide polymorphisms (SNPs) that contribute to the disease with low effect size, making it more precise for individual-level risk prediction. PRS analysis addresses the shortfall of GWAS by taking into account the SNPs/alleles that have low effect sizes but play an indispensable role in the observed phenotypic/trait variance. PRS analysis has applications in investigating the genetic basis of several traits, including rare diseases. However, the accuracy of PRS analysis depends on the genomic data of the underlying population. For instance, several studies show that obtaining high prediction power from PRS analysis is challenging for non-Europeans. In this manuscript, we review the conventional PRS methods and their application to sub-Saharan African communities. We conclude that the lack of sufficient GWAS data and tools is the limiting factor in applying PRS analysis to sub-Saharan populations. We recommend developing Africa-specific PRS methods and tools for estimating and analyzing African population data for clinical evaluation of PRSs of interest and predicting rare diseases.
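The idea of combining many small-effect SNPs into one score can be sketched as a weighted sum: each SNP's effect-allele dosage (0, 1, or 2) is multiplied by its GWAS effect size and the products are added. This is the simplest additive form of a PRS and is shown only as a minimal sketch; the review covers more elaborate methods, and the function name, SNP identifiers, and numbers below are hypothetical.

```python
from typing import Mapping


def polygenic_risk_score(
    effect_sizes: Mapping[str, float],
    dosages: Mapping[str, float],
) -> float:
    """Sum effect_size * dosage over SNPs present in both inputs.

    SNPs missing from either mapping are skipped, which is an assumption
    of this sketch rather than a recommendation from the review.
    """
    shared_snps = effect_sizes.keys() & dosages.keys()
    return sum(effect_sizes[snp] * dosages[snp] for snp in shared_snps)


# Hypothetical GWAS effect sizes (per effect allele) and one person's dosages.
weights = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 0.125}
genotype = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_risk_score(weights, genotype)
print(score)
```

Because the weights come from a GWAS in a specific population, the same sum can predict poorly when those weights are applied to individuals from a different population, which is the limitation the abstract raises for sub-Saharan African data.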
Long non-coding RNAs (lncRNAs) play crucial roles in diverse biological processes. For example, they help to regulate gene expression and to shape the 3D organization of the cell nucleus. However, their functions in these processes are not well known. Most ongoing research efforts regarding lncRNAs have focused on predicting their properties and functions. Here, we aimed to extract the biological functions of lncRNAs reported in the biomedical literature. To this end, we performed gene enrichment tests for 417 verified gene names extracted from the scientific literature. We also succeeded in extracting biologically significant functions from 2455 single sentences of 2513 scientific abstracts retrieved from the PubMed digital library. These sentences had to be filtered to eliminate those that did not mention a biological function. They were also split into 2- and 3-grams, and the meaningful function annotations based on Gene Ontology terms were extracted. The results provided by the gene enrichment test and the knowledge gained by automatic extraction of information from the scientific literature showed that lncRNAs play a critical role in all biological and cellular processes. Although simple natural language processing techniques fulfilled our purpose of extracting concepts concerning the biological functions of lncRNAs, in the future we aim to use more sophisticated methods, such as grammatical rules, to extract more functions and to generate a full database of the biological functions related to lncRNAs.
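The n-gram step described above can be sketched as follows: a sentence is split into word 2-grams and 3-grams, and each n-gram is checked against a vocabulary of function terms. This is a minimal sketch under stated assumptions, not the authors' pipeline. The whitespace tokenization, the function names, and the tiny term set standing in for Gene Ontology terms are all hypothetical.

```python
def word_ngrams(sentence, sizes=(2, 3)):
    """Return all word n-grams of the given sizes, lowercased, in order."""
    tokens = sentence.lower().split()
    grams = []
    for n in sizes:
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams


def match_function_terms(sentence, vocabulary):
    """Keep only the n-grams that appear in the supplied term vocabulary."""
    return [gram for gram in word_ngrams(sentence) if gram in vocabulary]


# Hypothetical stand-in for a set of Gene Ontology term names.
function_terms = {"gene expression", "cell cycle"}

sentence = "This lncRNA regulates gene expression"
matches = match_function_terms(sentence, function_terms)
print(matches)
```

A real pipeline would also need punctuation handling and normalization of term variants, which this sketch leaves out.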