As whole-genome sequencing (WGS) becomes the gold standard tool for population genomics and medical applications, data on diverse non-European and admixed individuals remain scarce. Here, we present a high-coverage WGS dataset of 1,171 highly admixed elderly Brazilians from a census-based cohort, providing over 76 million variants, of which ~2 million are absent from large public databases. WGS enables identification of ~2,000 previously undescribed mobile element insertions, nearly 5 Mb of genomic segments absent from the human genome reference, and over 140 alleles from HLA genes absent from public resources. We reclassify and curate pathogenicity assertions for nearly four hundred variants in genes associated with dominantly inherited Mendelian disorders and calculate the incidence of selected recessive disorders, demonstrating the clinical usefulness of the present study. Finally, we observe that whole-genome and HLA imputation could be significantly improved compared to available datasets, since rare variation represents the largest proportion of input from WGS. These results demonstrate that even smaller sample sizes of underrepresented populations bring relevant data for genomic studies, especially for analyses possible only with WGS.
The pandemic caused by SARS-CoV-2 has led to a large number of cases and deaths worldwide, and South America has been one of the hardest-hit continents. The appearance of new variants causes concern because they may evade the protection generated by vaccination campaigns, be transmitted more readily, or be more virulent. We analyzed the circulating variants in Peru after improving our Genomic Surveillance program. The results indicate a steep increase of the Lambda lineage (C.37), which became predominant between January and April 2021 despite the cocirculation of other variants of concern or interest. The Lambda lineage deserves close monitoring and could become a variant of concern in the near future.
The spread of cases of the new SARS-CoV-2 coronavirus represents a serious public health problem for Latin America and Peru. For this reason, it is important to characterize the genomes of the isolates that circulate in Latin America. To characterize the complete genomes of the first samples of the virus circulating in Peru, we amplified seven overlapping segments of the viral genome by RT-PCR and sequenced them on the MiSeq platform. The results indicate that the genomes of the Peruvian SARS-CoV-2 samples belong to the genetic groups G and S. Likewise, a phylogenetic and MST analysis of the isolates confirms the introduction of multiple isolates from Europe and Asia that, after border closing, were transmitted locally in the capital and some regions of the country. Of these Peruvian samples, 56% grouped into two clusters inside the G clade and share the B.1.1.1 lineage. The characterization of these isolates must be considered in the use and design of diagnostic tools, effective treatments, and vaccine formulations.