Accurate identification of genetic variants from next-generation sequencing (NGS) data is essential for immediate large-scale genomic endeavors such as the 1000 Genomes Project, and is crucial for further genetic analysis based on the discoveries. The key challenge in single nucleotide polymorphism (SNP) discovery is to distinguish true individual variants (occurring at a low frequency) from sequencing errors (often occurring at frequencies orders of magnitude higher). Therefore, knowledge of the error probabilities of base calls is essential. We have developed Atlas-SNP2, a computational tool that detects and accounts for systematic sequencing errors caused by context-related variables in a logistic regression model learned from training data sets. Subsequently, it estimates the posterior error probability for each substitution through a Bayesian formula that integrates prior knowledge of the overall sequencing error probability and the estimated SNP rate with the results from the logistic regression model for the given substitutions. The estimated posterior SNP probability can be used to distinguish true SNPs from sequencing errors. Validation results show that Atlas-SNP2 achieves a false-positive rate lower than 10%, with a false-negative rate of ~5% or lower.
[Supplemental material is available online at http://www.genome.org. Atlas-SNP2 and its documentation are available for download at http://www.hgsc.bcm.tmc.edu/cascade-tech-software-ti.hgsc.]
In recent years, next-generation sequencing (NGS) technologies have propelled the rapid progress of genomics studies (Hillier et al. 2008; Srivatsan et al. 2008). Continuous improvements in NGS technologies are increasing the throughput while lowering costs, thus enabling ultra-large-scale sequencing efforts (Margulies et al. 2005; Shendure and Ji 2008).
For example, the 1000 Genomes Project is aimed at sequencing more than 1000 human genomes to characterize the pattern of genetic variants (common and rare) in unprecedented detail (http://www.1000genomes.org/page.php) (Kaiser 2008). To realize this objective, it is essential that NGS technologies detect genomic variations accurately, including single nucleotide polymorphisms (SNPs), structural variations caused by insertions or deletions (indels), copy number variations (CNVs), and inversions or other rearrangements. However, the short read length and relatively high error rates present challenges to variant discovery from raw NGS data. While the error model for Sanger sequencing was well characterized (Ewing and Green 1998), systematic errors in NGS are not yet well studied, making it difficult to distinguish true genetic variations from sequencing errors.
Currently, there are several methods available for detecting SNPs from NGS data, including Pyrobayes, POLYBAYES (Marth et al. 1999), MAQ (Li et al. 2008), SOAP (Li et al. 2009), VarScan (Ley et al. 2008; Koboldt et al. 2009), and other largely heuristic approaches (Wheeler et al. 2008). Pyrobayes-POLYBAYES recalibrates base-calling of all nucleotide positions from ...
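The two-stage idea in the Atlas-SNP2 abstract above (a logistic regression over context variables, followed by a Bayesian update with prior error and SNP rates) can be illustrated with a minimal Python sketch. This is not the Atlas-SNP2 implementation: the function names, the feature weights, and the prior-shift correction used to turn the model output into a likelihood ratio are all assumptions made for illustration.

```python
import math


def error_probability(features, weights, intercept):
    """Logistic-regression estimate of P(error | context features).

    `features`, `weights`, and `intercept` are hypothetical; a real model
    would be fitted on training data with known errors and known SNPs.
    """
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def posterior_snp_probability(p_error_model, train_error_fraction,
                              prior_error, prior_snp):
    """Combine the model output with external priors via Bayes' rule.

    Assumption: the model was trained on data in which a fraction
    `train_error_fraction` of substitutions were errors, so dividing the
    model odds by the training odds recovers the likelihood ratio
    P(features | error) / P(features | SNP).
    """
    eps = 1e-12  # keep the odds finite when the model output is 0 or 1
    p = min(max(p_error_model, eps), 1.0 - eps)
    model_odds = p / (1.0 - p)
    train_odds = train_error_fraction / (1.0 - train_error_fraction)
    likelihood_ratio = model_odds / train_odds
    # P(SNP | features) = P(SNP) / (P(SNP) + LR * P(error))
    return prior_snp / (prior_snp + likelihood_ratio * prior_error)


# Illustrative numbers only: 1% prior error rate, 0.1% prior SNP rate.
p_err = error_probability([1.0, 0.0], weights=[0.5, -0.2], intercept=-1.0)
posterior = posterior_snp_probability(p_err, 0.5, prior_error=0.01, prior_snp=0.001)
```

With a prior error rate an order of magnitude above the prior SNP rate, a substitution needs a model likelihood ratio well below 1 before its posterior SNP probability becomes large, which reflects the imbalance between errors and true variants that the abstract describes.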
Massively parallel sequencing of millions of <30-nt RNAs expressed in mouse ovary, embryonic pancreas (E14.5), and insulin-secreting beta-cells (TC-3) reveals that ∼50% of the mature miRNAs representing mostly the mmu-let-7 family display internal insertion/deletions and substitutions when compared to precursor miRNA and the mouse genome reference sequences. Approximately 12%-20% of species associated with mmu-let-7 populations exhibit sequence discrepancies that are dramatically reduced in nucleotides 3-7 (5′-seed) and 10-15 (cleavage and anchor sites). This observation is inconsistent with sequencing error and leads us to propose that the changes arise predominantly from post-transcriptional RNA-editing activity operating on miRNA:target mRNA complexes. Internal nucleotide modifications are most enriched at the ninth nucleotide position. A common ninth base edit of U-to-G results in a significant increase in stability of down-regulated let-7a targets in inhibin-deficient mice (Inha −/−). An excess of U-insertions (14.8%) over U-deletions (1.5%) and the presence of cleaved intermediates suggest that a mammalian TUTase (terminal uridylyl transferase) mediated dUTP-dependent U-insertion/U-deletion cycle may be a possible mechanism. We speculate that mRNA target site-directed editing of mmu-let-7a duplex-bulges stabilizes "loose" miRNA:mRNA target associations and functions to expand the target repertoire and/or enhance mRNA decay over translational repression. Our results also demonstrate that the systematic study of sequence variation within specific RNA classes in a given cell type from millions of sequences generated by next-generation sequencing (NGS) technologies ("intranomics") can be used broadly to infer functional constraints on specific parts of completely uncharacterized RNAs.
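The positional argument in the abstract above (discrepancies depleted at some nucleotide positions and enriched at the ninth) rests on a per-position tally of differences between reads and a reference. A minimal Python sketch of such a tally follows. It assumes reads are already aligned to the reference without gaps, so it counts substitutions only and ignores the insertions and deletions the study also reports; the function name and the toy reads are hypothetical.

```python
def mismatch_rate_by_position(reference, reads):
    """Fraction of reads that differ from the reference at each position.

    Assumes gap-free alignment starting at position 1 of the reference;
    reads shorter than the reference contribute only to the positions
    they cover.
    """
    mismatches = [0] * len(reference)
    coverage = [0] * len(reference)
    for read in reads:
        for i, (ref_base, read_base) in enumerate(zip(reference, read)):
            coverage[i] += 1
            if ref_base != read_base:
                mismatches[i] += 1
    return [m / c if c else 0.0 for m, c in zip(mismatches, coverage)]


# Toy data, invented for illustration: a 9-nt reference and four reads,
# two of which carry a U-to-G change at the ninth position.
reference = "UGAGGUAGU"
reads = ["UGAGGUAGU", "UGAGGUAGG", "UGAGGUAGG", "AGAGGUAGU"]
rates = mismatch_rate_by_position(reference, reads)
```

For this toy input the rate is 0.25 at position 1 and 0.5 at position 9, with zeros elsewhere. Whether a real profile is compatible with sequencing error would still require a position-specific error model, which this sketch does not provide.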
Background: MicroRNAs (miRNAs: a class of short non-coding RNAs) are emerging as important agents of post transcriptional gene regulation and integral components of gene networks. MiRNAs have been strongly linked to stem cells, which have a remarkable dual role in development. They can either continuously replenish themselves (self-renewal), or differentiate into cells that execute a limited number of specific actions (pluripotence).
Methodology/Principal Findings: In order to identify novel miRNAs from narrow windows of development we carried out an in silico search for micro-conserved elements (MCE) in adult tissue progenitor transcript sequences. A plethora of previously unknown miRNA candidates were revealed including 545 small RNAs that are enriched in embryonic stem (ES) cells over adult cells. Approximately 20% of these novel candidates are down-regulated in ES (Dicer −/−) ES cells that are impaired in miRNA maturation. The ES-enriched miRNA candidates exhibit distinct and opposite expression trends from mmu-mirs (an abundant class in adult tissues) during retinoic acid (RA)-induced ES cell differentiation. Significant perturbation of trends is found in both miRNAs and novel candidates in ES (GCNF −/−) cells, which display loss of repression of pluripotence genes upon differentiation.
Conclusion/Significance: Combining expression profile information with miRNA target prediction, we identified miRNA-mRNA pairs that correlate with ES cell pluripotence and differentiation. Perturbation of these pairs in the ES (GCNF −/−) mutant suggests a role for miRNAs in the core regulatory networks underlying ES cell self-renewal, pluripotence and differentiation.
The opioid epidemic continues in the United States. Many have been impacted by this epidemic, including neonates who exhibit Neonatal Abstinence Syndrome (NAS). Opioid diagnosis and NAS can be negatively impacted by limited testing options outside the hospital, due to poor assay performance, false-negatives, rapid drug clearance rates, and difficulty in obtaining enough specimen for testing. Here we report a small volume urine assay for oxycodone, hydrocodone, fentanyl, noroxycodone, norhydrocodone, and norfentanyl with excellent LODs and LOQs. The free-solution assay (FSA), coupled with high affinity DNA aptamer probes and a compensated interferometric reader (CIR), represents a potential solution for quantifying opioids rapidly, at high sensitivity, and noninvasively on small sample volumes. The mix-and-read test is 5- to 275-fold and 50- to 1250-fold more sensitive than LC-MS/MS and immunoassays, respectively. Using FSA, oxycodone, hydrocodone, fentanyl, and their urinary metabolites were quantified using 10 μL of urine at 28−81 pg/mL, with >95% specificity and excellent accuracy in ∼1 h. The assay sensitivity, small sample size requirement, and speed could enable opioid screening, particularly for neonates, and points to the potential for pharmacokinetic tracking.