Background: With the decreasing cost of sequencing and the rapid developments in genomics technologies and protocols, the need for validated bioinformatics software that enables efficient large-scale data processing is growing. Findings: Here we present GenPipes, a flexible Python-based framework that facilitates the development and deployment of multi-step workflows optimized for high-performance computing clusters and the cloud. GenPipes already implements 12 validated and scalable pipelines for various genomics applications, including RNA sequencing, chromatin immunoprecipitation sequencing, DNA sequencing, methylation sequencing, Hi-C, capture Hi-C, metagenomics, and Pacific Biosciences long-read assembly. The software is available under a GPLv3 open source license and is continuously updated to follow recent advances in genomics and bioinformatics. The framework has already been configured on several servers, and a Docker image is also available to facilitate additional installations. Conclusions: GenPipes offers genomics researchers a simple method to analyze different types of data, customizable to their needs and resources, as well as the flexibility to create their own workflows.
Clostridium difficile is a common cause of infectious diarrhea in hospitalized patients. A severe and increased incidence of C. difficile infection (CDI) is associated predominantly with the NAP1 strain; however, the existence of other severe-disease-associated (SDA) strains and the extensive genetic diversity across C. difficile complicate reliable detection and diagnosis. Comparative genome analysis of 14 sequenced genomes, including those of a subset of NAP1 isolates, allowed the assessment of genetic diversity within and between strain types to identify DNA markers that are associated with severe disease. Comparative genome analysis of 14 isolates, including five publicly available strains, revealed that C. difficile has a core genome of 3.4 Mb, comprising ~3,000 genes. Analysis of the core genome identified candidate DNA markers that were subsequently evaluated using a multistrain panel of 177 isolates, representing more than 50 pulsovars and 8 toxinotypes. A subset of 117 isolates from the panel had associated patient data that allowed assessment of an association between the DNA markers and severe CDI. We identified 20 candidate DNA markers for species-wide detection and 10,683 single nucleotide polymorphisms (SNPs) associated with the predominant SDA strain (NAP1). A species-wide detection candidate marker, the sspA gene, was found to be the same across 177 sequenced isolates and lacked significant similarity to those of other species. Candidate SNPs in genes CD1269 and CD1265 were found to associate more closely with disease severity than currently used diagnostic markers, as they were also present in the toxin A-negative and B-positive (A-B+) strain types. The genetic markers identified illustrate the potential of comparative genomics for the discovery of diagnostic DNA-based targets that are species specific or associated with multiple SDA strains.
Background: Zellweger syndrome (ZS) is a peroxisome biogenesis disorder due to mutations in any one of 13 PEX genes. Increased incidence of ZS has been suspected in French-Canadians of the Saguenay-Lac-St-Jean region (SLSJ) of Quebec, but this remains unresolved. Methods: We identified 5 ZS patients from SLSJ diagnosed by peroxisome dysfunction between 1990 and 2010 and sequenced all coding exons of known PEX genes in one patient using Next Generation Sequencing (NGS) for diagnostic confirmation. Results: A homozygous mutation (c.802_815del, p.[Val207_Gln294del, Val76_Gln294del]) in PEX6 was identified and then shown in 4 other patients. Parental heterozygosity was confirmed in all. Incidence of ZS was estimated at 1 in 12,191 live births, with a carrier frequency of 1 in 55. In addition, we present data suggesting that this mutation abolishes a SF2/ASF splice enhancer binding site, resulting in the use of two alternative cryptic donor splice sites and predicted to encode an internally deleted in-frame protein. Conclusion: We report increased incidence of ZS in French-Canadians of SLSJ caused by a PEX6 founder mutation. To our knowledge, this is the highest reported incidence of ZS worldwide. These findings have implications for carrier screening and support the utility of NGS for molecular confirmation of peroxisomal disorders.
Perivascular space (PVS) burden is an emerging, poorly understood, magnetic resonance imaging marker of cerebral small vessel disease, a leading cause of stroke and dementia. Genome-wide association studies in up to 40,095 participants (18 population-based cohorts, 66.3 ± 8.6 yr, 96.9% European ancestry) revealed 24 genome-wide significant PVS risk loci, mainly in the white matter. These were associated with white matter PVS already in young adults (N = 1,748; 22.1 ± 2.3 yr) and were enriched in early-onset leukodystrophy genes and genes expressed in fetal brain endothelial cells, suggesting early-life mechanisms. In total, 53% of white matter PVS risk loci showed nominally significant associations (27% after multiple-testing correction) in a Japanese population-based cohort (N = 2,862; 68.3 ± 5.3 yr). Mendelian randomization supported causal associations of high blood pressure with basal ganglia and hippocampal PVS, and of basal ganglia PVS and hippocampal PVS with stroke, accounting for blood pressure. Our findings provide insight into the biology of PVS and cerebral small vessel disease, pointing to pathways involving extracellular matrix, membrane transport and developmental processes, and the potential for genetically informed prioritization of drug targets.