Recently, a new class of extrachromosomal circular DNA, called microDNA, was identified. MicroDNAs are on average 100 to 400 bp long and are derived from unique, non-repetitive genomic regions with high gene density. They are thought to arise from DNA breaks associated with RNA metabolism or replication slippage. Given the paucity of information on this novel phenomenon, we aimed to gain additional insight into microDNA features by analyzing microDNA in 20 independent human lymphoblastoid cell lines (LCLs) before and after treatment with chemotherapeutic drugs. The results showed non-random genesis of microDNA clusters from active regions of the genome. A size periodicity of 190 bp was observed, which matches the DNA fragmentation typical of apoptotic cells. Chemotherapeutic drug-induced apoptosis of LCLs increased both the number and the size of clusters, further suggesting that a part of microDNAs could result from programmed cell death. Interestingly, a proportion of the identified microDNA sequences share common loci of origin across cell line experiments. While compatible with the original observation that microDNAs originate from a normal physiological process, these results imply a complementary source of their production. Furthermore, the non-random genesis of microDNAs, reflected in the redundancy between samples, makes these entities possible candidates for new biomarker generation.
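The abstract does not say how the 190 bp size periodicity was detected. One common way to look for such a signal, shown here only as a minimal sketch and not as the authors' method, is to build a histogram of fragment lengths and find the lag at which its autocorrelation peaks. The function names, the lag window, and the synthetic lengths below are all assumptions made for illustration.

```python
from collections import Counter


def length_histogram(lengths, max_len):
    """Count how many fragments have each length from 0 to max_len."""
    counts = Counter(lengths)
    return [counts.get(i, 0) for i in range(max_len + 1)]


def autocorrelation(signal, lag):
    """Normalized autocorrelation of a mean-centered signal at one lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    denom = sum(c * c for c in centered)
    if denom == 0:
        return 0.0
    num = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return num / denom


def dominant_period(lengths, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    hist = length_histogram(lengths, max(lengths))
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(hist, lag))


# Synthetic (hypothetical) fragment lengths: triangular peaks centered on
# multiples of 190 bp. These are NOT data from the study.
synthetic_lengths = []
for peak in (190, 380, 570):
    for offset in range(-5, 6):
        synthetic_lengths.extend([peak + offset] * (6 - abs(offset)))

period = dominant_period(synthetic_lengths, min_lag=100, max_lag=300)
print("Estimated period (bp):", period)
```

On real data the histogram would be noisier, so the lag window and any smoothing would need to be chosen with care.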
Background: Next-generation sequencing (NGS) allows unbiased, in-depth interrogation of cancer genomes. Many somatic variant callers have been developed yet accurate ascertainment of somatic variants remains a considerable challenge as evidenced by the varying mutation call rates and low concordance among callers. Statistical model-based algorithms that are currently available perform well under ideal scenarios, such as high sequencing depth, homogeneous tumor samples, high somatic variant allele frequency (VAF), but show limited performance with sub-optimal data such as low-pass whole-exome/genome sequencing data. While the goal of any cancer sequencing project is to identify a relevant, and limited, set of somatic variants for further sequence/functional validation, the inherently complex nature of cancer genomes combined with technical issues directly related to sequencing and alignment can affect either the specificity and/or sensitivity of most callers. Results: For these reasons, we developed SNooPer, a versatile machine learning approach that uses Random Forest classification models to accurately call somatic variants in low-depth sequencing data. SNooPer uses a subset of variant positions from the sequencing output for which the class, true variation or sequencing error, is known to train the data-specific model. Here, using a real dataset of 40 childhood acute lymphoblastic leukemia patients, we show how the SNooPer algorithm is not affected by low coverage or low VAFs, and can be used to reduce overall sequencing costs while maintaining high specificity and sensitivity to somatic variant calling.
When compared to three benchmarked somatic callers, SNooPer demonstrated the best overall performance. Conclusions: While the goal of any cancer sequencing project is to identify a relevant, and limited, set of somatic variants for further sequence/functional validation, the inherently complex nature of cancer genomes combined with technical issues directly related to sequencing and alignment can affect either the specificity and/or sensitivity of most callers. The flexibility of SNooPer’s random forest protects against technical bias and systematic errors, and is appealing in that it does not rely on user-defined parameters. The code and user guide can be downloaded at https://sourceforge.net/projects/snooper/. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3281-2) contains supplementary material, which is available to authorized users.
Summary: Untargeted metabolomics is used to refine the development of biomarkers for the diagnosis of cardiovascular disease. Myocardial infarction (MI) has major individual and societal consequences for patients, who remain at high risk of secondary events, despite advances in pharmacological therapy. To monitor their differential response to treatment, we performed untargeted plasma metabolomics on 175 patients from the platelet inhibition and patient outcomes (PLATO) trial treated with ticagrelor and clopidogrel, two common P2Y12 inhibitors. We identified a signature that discriminates patients, which involves polyunsaturated fatty acids (PUFAs) and particularly the omega-3 fatty acids docosahexaenoate and eicosapentaenoate. The known cardiovascular benefits of PUFAs could contribute to the efficacy of ticagrelor. Our work, beyond pointing out the high relevance of untargeted metabolomics in evaluating response to treatment, establishes PUFA metabolism as a pathway of clinical interest in the recovery path from MI.
Studies combining metabolomics and genetics, known as metabolite genome-wide association studies (mGWAS), have provided valuable insights into the genetic control of metabolite levels. However, the biological interpretation of these associations remains challenging because existing tools do not annotate mGWAS gene-metabolite pairs beyond applying conservative statistical significance thresholds. Here, we computed the shortest reactional distance (SRD) based on the curated knowledge of the KEGG database to explore its utility in enhancing the biological interpretation of results from three independent mGWAS, including a case study on sickle cell disease patients. Results show that, in reported mGWAS pairs, there is an excess of small SRD values and that SRD values and p-values significantly correlate, even beyond the standard conservative thresholds. The added value of SRD annotation is shown for the identification of potential false negative hits, exemplified by the finding of gene-metabolite associations with SRD ≤1 that did not reach the standard genome-wide significance cut-off. The wider use of this statistic as an mGWAS annotation would prevent the exclusion of biologically relevant associations and can also identify errors or gaps in current metabolic pathway databases. Our findings highlight the SRD metric as an objective, quantitative and easy-to-compute annotation for gene-metabolite pairs that can be used to integrate statistical evidence with biological networks.
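The idea of a shortest reactional distance can be illustrated with a breadth-first search over a reaction network. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The toy network, the gene and reaction names, and the convention that the distance is 0 when the metabolite takes part directly in a reaction catalyzed by the gene are all hypothetical, and no KEGG data or KEGG API is used.

```python
from collections import deque

# Hypothetical toy network: each reaction maps to the metabolites it involves.
REACTIONS = {
    "R1": {"A", "B"},
    "R2": {"B", "C"},
    "R3": {"C", "D"},
    "R4": {"X", "Y"},
}

# Hypothetical mapping from genes to the reactions their products catalyze.
GENE_TO_REACTIONS = {
    "geneA": {"R1"},
    "geneX": {"R4"},
}


def shortest_reactional_distance(gene, metabolite, reactions, gene_to_reactions):
    """Breadth-first search from the gene's reactions to the metabolite.

    Two reactions are treated as adjacent when they share a metabolite.
    Returns the number of reaction-to-reaction steps needed to reach a
    reaction involving the metabolite, or None if it cannot be reached.
    """
    start = gene_to_reactions.get(gene, set())
    queue = deque((reaction, 0) for reaction in sorted(start))
    seen = set(start)
    while queue:
        reaction, dist = queue.popleft()
        if metabolite in reactions[reaction]:
            return dist
        for other, members in reactions.items():
            if other not in seen and members & reactions[reaction]:
                seen.add(other)
                queue.append((other, dist + 1))
    return None


for target in ("B", "C", "D", "X"):
    srd = shortest_reactional_distance("geneA", target, REACTIONS, GENE_TO_REACTIONS)
    print("geneA ->", target, "SRD:", srd)
```

Under this convention, a pair with a small distance sits close together in the network, which is the kind of pair the abstract flags as biologically plausible even when its p-value misses the genome-wide cut-off.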