The analysis of genomic data (~400,000 autosomal SNPs) enabled the reliable estimation of inbreeding levels in a sample of 541 individuals from a highly admixed Brazilian population isolate (an African-derived quilombo in the State of São Paulo). To achieve this, different methods were applied to the joint information of two sets of markers (one complete and another excluding loci in patent linkage disequilibrium). This strategy allowed the detection and exclusion of markers that biased the estimation of the average population inbreeding coefficient (Wright’s fixation index FIS), whose value was eventually estimated at around 1% by every method we applied. Quilombo demographic inferences were made by analyzing the structure of runs of homozygosity (ROH), with the analyses adapted to cope with a highly admixed population with a complex foundation history. Our results suggest that, in admixed populations, the amount of ROH <2 Mb should be in some way proportional to the genetic contribution from each parental population.
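As an illustration of the quantity being estimated, the sketch below computes a heterozygote-deficit estimate of FIS, 1 - Ho/He, pooled over biallelic loci. It is a minimal example under stated assumptions, not the estimators used in the study (which are not detailed here): the function name and the genotype-count input format are hypothetical, and the expected heterozygosity carries no small-sample correction.

```python
from typing import Iterable, Tuple


def fis_from_genotype_counts(loci: Iterable[Tuple[int, int, int]]) -> float:
    """Pooled FIS = 1 - sum(Ho) / sum(He) over biallelic loci.

    Each locus is a tuple (n_AA, n_Aa, n_aa) of genotype counts.
    Hypothetical helper for illustration only.
    """
    total_ho = 0.0  # summed observed heterozygosity
    total_he = 0.0  # summed expected heterozygosity under Hardy-Weinberg
    for n_hom_ref, n_het, n_hom_alt in loci:
        n = n_hom_ref + n_het + n_hom_alt
        if n == 0:
            continue  # skip loci with no genotyped individuals
        p = (2 * n_hom_ref + n_het) / (2 * n)  # frequency of allele A
        total_ho += n_het / n
        total_he += 2 * p * (1 - p)
    if total_he == 0:
        raise ValueError("no polymorphic loci: FIS is undefined")
    return 1.0 - total_ho / total_he
```

A locus in Hardy-Weinberg proportions contributes no deficit, so FIS is zero; a shortage of heterozygotes relative to expectation gives a positive value.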
Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource for association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important prerequisite in studies of admixed populations. We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 classical HLA loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA-A, -B, -C and -DRB1, respectively (two-field resolution). Accuracies improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data are a valuable resource for the construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy.
Several Mendelian disorders follow an autosomal recessive inheritance pattern. Epidemiological information on many inherited disorders may be useful to guide health policies for rare diseases, but it is often inadequate, particularly in developing countries. We aimed to calculate the carrier frequencies of rare autosomal recessive Mendelian diseases in a cohort of Brazilian patients using whole exome sequencing (WES). We reviewed the molecular findings of WES from 320 symptomatic patients who had carrier status for recessive diseases. Using the Hardy-Weinberg equation, we estimated recessive disease frequencies (q²) considering the respective carrier frequencies (2pq) observed in our study. We calculated the sensitivity of carrier screening tests based on lists of genes from five different clinical laboratories that offer them in Brazil. A total of 425 occurrences of 351 rare variants were reported in 278 different genes from 230 patients (71.9%). Almost half (48.8%) were carriers of at least one heterozygous pathogenic/likely pathogenic variant for rare metabolic disorders, while 25.9% were carriers for epilepsy, 18.1% for intellectual disabilities, 15.6% for skeletal disorders, 10.9% for immune disorders, and 9.1% for hearing loss. We estimated that an average of 67% of the variants would not have been detected by carrier screening panels. The combined frequencies of autosomal recessive diseases were estimated to be 26.39/10,000 (or approximately 0.26%). This study shows the potential research utility of WES to determine carrier status, which may be a possible strategy to evaluate the clinical and social burden of recessive diseases at the population level and guide the optimization of carrier screening panels.
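The Hardy-Weinberg step can be sketched as follows: given an observed carrier frequency c = 2pq with p = 1 - q, solve 2q(1 - q) = c for the smaller root q and report q² as the expected disease frequency. This is a minimal sketch, not the study's pipeline. The function names, the per-gene input format, and the choice to sum per-gene q² values into a combined frequency are assumptions made for illustration, and the gene labels in any usage are placeholders.

```python
import math
from typing import Mapping


def disease_frequency_from_carrier(carrier_freq: float) -> float:
    """Return q^2 given the carrier frequency 2pq (with p = 1 - q).

    Solves 2q(1 - q) = carrier_freq and takes the smaller root,
    i.e. the rarer allele. Valid for 0 <= carrier_freq <= 0.5.
    """
    if not 0.0 <= carrier_freq <= 0.5:
        raise ValueError("carrier frequency must lie in [0, 0.5]")
    q = (1.0 - math.sqrt(1.0 - 2.0 * carrier_freq)) / 2.0
    return q * q


def combined_disease_frequency(
    carriers_per_gene: Mapping[str, int], n_individuals: int
) -> float:
    """Sum per-gene q^2 estimates (assumption: genes treated independently)."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    return sum(
        disease_frequency_from_carrier(count / n_individuals)
        for count in carriers_per_gene.values()
    )
```

For rare alleles q is close to c/2, so q² is close to c²/4; the exact root is used here so that the result stays consistent with 2pq at higher carrier frequencies as well.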
Next-generation sequencing (NGS) has altered clinical genetic testing by widening access to the molecular diagnosis of genetically determined rare diseases. However, physicians may face difficulties selecting the best diagnostic approach. Our goal is to estimate the rate of possible molecular diagnoses missed by different targeted gene panels using data from a cohort of patients with rare genetic diseases diagnosed with exome sequencing (ES). For this purpose, we simulated a comparison between different targeted gene panels and ES: the list of genes harboring clinically relevant variants from 158 patients was used to estimate the theoretical rate of diagnoses missed by 53 different NGS panels from eight different laboratories. Panels presented a mean rate of missed diagnoses of 64% (range 14%-100%) compared to ES, representing an average predicted sensitivity of 36%. Metabolic abnormalities represented the group with the highest mean rate of missed diagnoses (86%), while seizures represented the group with the lowest (46%). Focused gene panels are restricted to select sets of genes implicated in specific diseases, and they may miss molecular diagnoses of rare diseases compared to ES. However, their role in genetic diagnosis remains important, especially for well-known genetic diseases with established genetic locus heterogeneity.
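The simulated comparison can be sketched as a set-membership check: a diagnosis counts as missed when the gene carrying the patient's clinically relevant variant is absent from the panel's gene list, and predicted sensitivity is one minus the missed rate. This is a minimal sketch under assumptions. The function names and input format are hypothetical, each patient is reduced to a single diagnostic gene, and gene labels in any usage are placeholders rather than data from the study.

```python
from typing import Iterable, Set


def missed_diagnosis_rate(panel_genes: Set[str], diagnostic_genes: Iterable[str]) -> float:
    """Fraction of diagnoses whose causal gene is not covered by the panel.

    diagnostic_genes holds one gene symbol per diagnosed patient
    (simplifying assumption: a single diagnostic gene per patient).
    """
    genes = list(diagnostic_genes)
    if not genes:
        raise ValueError("at least one diagnosis is required")
    missed = sum(1 for gene in genes if gene not in panel_genes)
    return missed / len(genes)


def predicted_sensitivity(panel_genes: Set[str], diagnostic_genes: Iterable[str]) -> float:
    """Theoretical sensitivity of the panel relative to exome sequencing."""
    return 1.0 - missed_diagnosis_rate(panel_genes, diagnostic_genes)
```

Averaging `missed_diagnosis_rate` over several panels gives a mean missed rate of the kind reported above, where a 64% mean missed rate corresponds to a 36% average predicted sensitivity.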