Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads, and de novo genome assembly using these short reads is computationally very intensive. Owing to the lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, no report is currently available that examines the impact of high sequence depth on genome assembly using real datasets and multiple assembly algorithms. Some recent studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms, but these evaluations were performed using simulated datasets. One limitation of simulated datasets is that variables such as error rate, read length and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly of genomes of different sizes using graph-based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 Mb), S. kudriavzevii (11.18 Mb) and C. elegans (100 Mb) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes with all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6–40 GB RAM, depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, which will enable optimum utilization of sequencing as well as analysis resources.
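As a rough illustration of what a 50X target means for experiment design, the sketch below applies the standard depth relation (depth = number of reads × read length / genome size) to the three genome sizes quoted in the abstract. The 100 bp read length and the function name `reads_for_depth` are assumptions made for this example only; they are not taken from the study.

```python
import math


def reads_for_depth(genome_size_bp: int, depth: float, read_length_bp: int) -> int:
    """Return the smallest read count N with N * read_length / genome_size >= depth."""
    if genome_size_bp <= 0 or read_length_bp <= 0 or depth < 0:
        raise ValueError("genome size and read length must be positive, depth non-negative")
    return math.ceil(depth * genome_size_bp / read_length_bp)


# Genome sizes as quoted in the abstract, converted to base pairs.
GENOMES_BP = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}

# Assumption for illustration only: single reads of 100 bp.
READ_LENGTH_BP = 100
TARGET_DEPTH = 50  # the 50X depth reported as optimal for most assemblers

if __name__ == "__main__":
    for name, size_bp in GENOMES_BP.items():
        n_reads = reads_for_depth(size_bp, TARGET_DEPTH, READ_LENGTH_BP)
        print(f"{name}: about {n_reads:,} reads of {READ_LENGTH_BP} bp for {TARGET_DEPTH}X")
```

The estimate ignores read filtering, duplicates and uneven coverage, so a real run would likely need some margin above these counts.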
Non-coding RNAs from transposable elements of the human genome are gaining prominence in modulating transcriptome dynamics. Alu elements, as exonized, edited and antisense components within the same transcripts, could create novel regulatory switches in response to different transcriptional cues. We provide the first evidence for co-occurrence of these events at a transcriptome-wide scale through integrative analysis of data sets across diverse experimental platforms and tissues. This involved the following: (i) positional anchoring of Alu exonization events in the UTRs and CDS of 4663 transcript isoforms from RefSeq mRNAs and (ii) mapping onto them A→I editing events inferred from ∼7 million ESTs from dbEST and antisense transcripts identified from virtual serial analysis of gene expression tags represented in Cancer Genome Anatomy Project next-generation sequencing data sets across 20 tissues. We observed significant enrichment of these events in the 3′UTR as well as positional preference within the embedded Alus. More than 300 genes had co-occurrence of all these events at the exon level and were significantly enriched in apoptosis and lysosomal processes. Further, we demonstrate functional evidence of such dynamic interactions between Alu-mediated events in time series data from Integrated Personal Omics Profiling during recovery from a viral infection. Such a ‘single transcript-multiple fate’ opportunity facilitated by Alu elements may modulate transcriptional response, especially during stress.
Alphonso is known as the “King of mangos” due to its unique flavor, attractive color, low fiber pulp and long shelf life. We analyzed the transcriptome of Alphonso mango through Illumina sequencing of seven stages of fruit development and ripening as well as of flower. Total transcriptome data from these stages ranged between 65 and 143 Mb. Importantly, 20,755 unique transcripts were annotated and 4,611 were assigned enzyme commission numbers, which encoded 142 biological pathways. These included ethylene and flavor related secondary metabolite biosynthesis pathways, as well as those involved in metabolism of starch, sucrose, amino acids and fatty acids. Differential regulation (p-value ≤ 0.05) of thousands of transcripts was evident in various stages of fruit development and ripening. Novel transcripts for biosynthesis of mono-terpenes, sesqui-terpenes, di-terpenes, lactones and furanones involved in flavor formation were identified. A large number of transcripts encoding cell wall modifying enzymes were found to be steady in their expression, while a few were differentially regulated through these stages. Seventy-nine novel transcripts of inhibitors of cell wall modifying enzymes were simultaneously detected throughout Alphonso fruit development and ripening, suggesting controlled activity of these enzymes involved in fruit softening.
Background: Alu RNAs are present at elevated levels in stress conditions and, consequently, Alu repeats are increasingly being associated with the physiological stress response. Alu repeats are known to harbor transcription factor binding sites that modulate RNA pol II transcription and Alu RNAs act as transcriptional co-repressors through pol II binding in the promoter regions of heat shock responsive genes. An observation of a putative heat shock factor (HSF) binding site in Alu led us to explore whether, through HSF binding, these elements could further contribute to the heat shock response repertoire. Results: Alu density was significantly enriched in transcripts that are down-regulated following heat shock recovery in HeLa cells. ChIP analysis confirmed HSF binding to a consensus motif exhibiting positional conservation across various Alu subfamilies, and reporter constructs demonstrated a sequence-specific two-fold induction of these sites in response to heat shock. These motifs were over-represented in the genic regions of down-regulated transcripts in antisense oriented Alus. Affymetrix Exon arrays detected antisense signals in a significant fraction of the down-regulated transcripts, 50% of which harbored HSF sites within 5 kb. siRNA knockdown of the selected antisense transcripts led to the over-expression, following heat shock, of their corresponding down-regulated transcripts. The antisense transcripts were significantly enriched in processes related to RNA pol III transcription and the TFIIIC complex. Conclusions: We demonstrate a non-random presence of Alu repeats harboring HSF sites in heat shock responsive transcripts. This presence underlies an antisense-mediated mechanism that represents a novel component of Alu and HSF involvement in the heat shock response.