Background: Maize is one of the most important crops in the world. With the population increasing exponentially and the need for ever-increasing food and feed production, an increased yield of maize grain (as well as rice, wheat and other grains) will be critical. Maize grain development is understood from the perspective of morphology, hormone responses, and storage reserve accumulation. This understanding includes various studies of gene expression during embryo development and maturation, but a global study of gene expression in the embryo has not been possible until recently. Transcriptome analysis is a powerful new tool that can be used to understand the genetic basis of embryo maturation.

Results: We undertook a transcriptomic analysis of normal maturing embryos at 15, 21 and 27 days after pollination (DAP) from one elite maize germplasm line that was utilized in crosses to transgenic plants. More than 19,000 genes were analyzed by this method, and the challenge was to select subsets of genes that are vitally important to embryo development and maturation for the initial analysis. We describe the changes in expression for genes relating to primary metabolic pathways, DNA synthesis, late embryogenesis proteins and embryo storage proteins, as shown through transcriptome analysis, and we confirm the transcription levels of some of these genes using qRT-PCR.

Conclusions: Numerous genes involved in embryo maturation have been identified, many of which show changes in expression level during the progression from 15 to 27 DAP. An expected array of genes involved in primary metabolism was identified. Moreover, more than 30% of transcripts represented un-annotated genes, leaving many functions to be discovered. Of particular interest are the storage protein genes, globulin-1, globulin-2 and an unidentified cupin family gene. When expressing foreign proteins in maize, the globulin-1 promoter is most often used, but this cupin family gene has much higher expression and may be a better candidate for foreign gene expression in maize embryos. Results such as these allow identification of candidate genes and promoters that may not otherwise be available for use. The mRNA-seq data are archived in the NCBI SRA; Accession number: ACC=SRA060791 subid=108584.
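The abstract reports changes in expression level across 15, 21 and 27 DAP but does not say how those changes were quantified. One common way to express such a change is a log2 fold change relative to the earliest time point. The sketch below assumes that convention; the function name, the pseudocount, and the choice of 15 DAP as baseline are illustrative assumptions, not details taken from the study.

```python
import math
from typing import Dict


def log2_fold_changes(
    counts: Dict[int, float],
    baseline_dap: int = 15,
    pseudocount: float = 1.0,
) -> Dict[int, float]:
    """Return the log2 fold change of each time point against a baseline.

    counts maps days after pollination (DAP) to a normalized expression
    value for one gene. The pseudocount (an assumption of this sketch)
    avoids division by zero for genes with no reads at the baseline.
    """
    base = counts[baseline_dap] + pseudocount
    return {
        dap: math.log2((value + pseudocount) / base)
        for dap, value in counts.items()
    }
```

A value of 1.0 at 21 DAP would mean the gene's expression roughly doubled relative to 15 DAP, and a negative value would mean it fell.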
It is challenging to cluster cancer patients of a certain histopathological type into molecular subtypes of clinical importance and to identify gene signatures directly relevant to those subtypes. Current clustering approaches have inherent limitations, which prevent them from gauging the subtle heterogeneity of the molecular subtypes. In this paper we present a new framework, SPARCoC (Sparse-CoClust), which is based on a novel Common-background and Sparse-foreground Decomposition (CSD) model and the Maximum Block Improvement (MBI) co-clustering technique. SPARCoC has clear advantages compared with two widely used alternative approaches: hierarchical clustering (Hclust) and nonnegative matrix factorization (NMF). We apply SPARCoC to the study of lung adenocarcinoma (ADCA), an extremely heterogeneous histological type that poses a significant challenge for molecular subtyping. For testing and verification, we use high-quality gene expression profiling data of lung ADCA patients and identify prognostic gene signatures that cluster patients into subgroups with significantly different overall survival (p-values < 0.05). Our results are based solely on the analysis of gene expression profiling data, without incorporating any other feature selection or clinical information, and we are able to replicate our findings with completely independent datasets. SPARCoC is broadly applicable to large-scale genomic data to empower pattern discovery and cancer gene identification.
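The abstract names the CSD model without giving its formulation. To illustrate the general idea of separating a common background from a sparse foreground, here is a deliberately simplified heuristic. It assumes the background is one value per gene shared by all patients (estimated here by the median), and that the foreground is whatever residual exceeds a threshold. This is not the authors' optimization model; the function name and the `threshold` parameter are hypothetical.

```python
from statistics import median
from typing import List, Tuple

Matrix = List[List[float]]


def split_background_foreground(
    expr: Matrix, threshold: float
) -> Tuple[List[float], Matrix]:
    """Split a genes-by-patients matrix into background and foreground.

    Assumptions of this sketch (not taken from the paper):
    - the background of a gene is its median across patients;
    - the foreground keeps a residual only if its magnitude exceeds
      `threshold`, and is zero elsewhere, which makes it sparse.
    """
    background = [median(row) for row in expr]
    foreground = [
        [(x - b) if abs(x - b) > threshold else 0.0 for x in row]
        for row, b in zip(expr, background)
    ]
    return background, foreground
```

Under these assumptions, the nonzero foreground entries mark the gene and patient pairs that deviate from the shared pattern, which is the kind of signal a co-clustering step could then group.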
Background: Protein structure comparison and classification is an effective method for exploring protein structure-function relations. This problem is computationally challenging. Many different computational approaches for protein structure comparison use the secondary structure element (SSE) representation of protein structures.

Results: We study the complexity of the protein structure comparison problem based on a mixed-graph model with respect to different computational frameworks. We develop an effective approach for protein structure comparison based on a novel independent set enumeration algorithm. Our approach (named ePC, efficient enumeration-based Protein structure Comparison) is tested for general-purpose protein structure comparison as well as for specific protein examples. Compared with other graph-based approaches for protein structure comparison, the theoretical running time $O(1.47^{rn} n^2)$ of our approach ePC is significantly better, where n is the smaller number of SSEs of the two proteins and r is a parameter of small value.

Conclusion: Through the enumeration algorithm, our approach can identify different substructures from a list of high-scoring solutions of biological interest. Our approach is also flexible enough to conduct protein structure comparison with the SSEs in either sequential or non-sequential order. Supplementary data of additional testing and the source of ePC will be available at http://bioinformatics.astate.edu/.
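The abstract does not describe the ePC enumeration algorithm itself. As a generic illustration of what enumerating independent sets means, the sketch below lists every maximal independent set of a small undirected graph by running the textbook Bron-Kerbosch procedure (without pivoting) on the complement graph. It is not ePC and makes no claim to its running time. Reading the vertices as candidate SSE pairings and the edges as incompatible pairings is an assumption made only for this example.

```python
from typing import Dict, FrozenSet, Iterator, Set


def maximal_independent_sets(
    adj: Dict[int, Set[int]],
) -> Iterator[FrozenSet[int]]:
    """Yield every maximal independent set of an undirected graph.

    adj maps each vertex to the set of its neighbours. A maximal clique
    of the complement graph is a maximal independent set of the graph,
    so Bron-Kerbosch is applied to non-neighbours.
    """
    vertices = set(adj)

    def non_neighbors(v: int) -> Set[int]:
        return vertices - adj[v] - {v}

    def expand(chosen: Set[int], candidates: Set[int], excluded: Set[int]):
        # Nothing left to add and nothing was skipped: chosen is maximal.
        if not candidates and not excluded:
            yield frozenset(chosen)
            return
        for v in sorted(candidates):
            compatible = non_neighbors(v)
            yield from expand(
                chosen | {v}, candidates & compatible, excluded & compatible
            )
            # v has been fully explored; move it to the excluded set.
            candidates = candidates - {v}
            excluded = excluded | {v}

    yield from expand(set(), vertices, set())
```

Because the function is a generator, a caller can stop after the first few sets or score each one as it arrives, which matches the idea of inspecting a list of solutions instead of a single optimum.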