Transcriptome data are widely used for functional analysis of genes. De novo assembly of a transcriptome yields a large number of unigenes, a large proportion of which remain unannotated. Efficient computational methods are therefore required for identifying genes and modeling their regulatory and functional roles. Principal component analysis (PCA) was used in a novel approach to shortlist genes from genome expression data independently of annotation, taking seed development in Arabidopsis thaliana as a representative case. PCA was applied to published genome expression data from four Arabidopsis lines carrying mutations that affect seed development. The principal component (PC) that separated all developmental stages of a mutant from those of its respective wild type was selected and used to shortlist genes as functionally more important. The shortlisted genes span a number of biological functions. Genes previously reported to confer sensitivity to desiccation were also identified by PCA, and only in the desiccation-intolerant lines. With respect to the network of 98 genes targeted by ABI3, a higher number of genes was identified as important in the mutants abi3-5, fus3-3 and lec1-1 than in abi3-1. Ontological analysis and comparison with earlier studies suggest that PCA of genome expression data is useful for shortlisting functionally important genes.
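The shortlisting idea can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes a samples-by-genes expression matrix, and it treats a PC as "separating" when every mutant sample scores on one side of every wild-type sample. It ranks genes by absolute loading on that PC. The function name `shortlist_genes`, the `top_n` cutoff and the synthetic data are all hypothetical, introduced only for illustration.

```python
import numpy as np


def shortlist_genes(expr, is_mutant, top_n=10):
    """Return (pc_index, gene_indices) for the first PC separating genotypes.

    expr      : samples x genes expression matrix (assumed layout)
    is_mutant : boolean flag per sample (True = mutant, False = wild type)
    top_n     : hypothetical cutoff on the number of genes to keep
    """
    expr = np.asarray(expr, dtype=float)
    is_mutant = np.asarray(is_mutant, dtype=bool)

    # PCA via SVD of the gene-wise centered matrix.
    centered = expr - expr.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s  # sample scores; rows of vt hold the gene loadings

    for pc in range(scores.shape[1]):
        # Skip numerically empty components (centering removes one rank).
        if s[pc] <= 1e-10 * s[0]:
            continue
        mut, wt = scores[is_mutant, pc], scores[~is_mutant, pc]
        # Assumed criterion: no overlap between mutant and wild-type scores.
        if mut.min() > wt.max() or mut.max() < wt.min():
            ranked = np.argsort(-np.abs(vt[pc]))[:top_n]
            return pc, ranked
    return None, np.array([], dtype=int)


# Synthetic demo (not real Arabidopsis data): 4 stages x 2 genotypes, 50 genes.
rng = np.random.default_rng(0)
stages = np.repeat(np.arange(4), 2)          # stage index of each sample
is_mutant = np.tile([False, True], 4)        # wild type / mutant at each stage
expr = rng.normal(0.0, 0.1, size=(8, 50))    # background noise
expr[:, 10:30] += stages[:, None] * 2.0      # stage-driven genes (10..29)
expr[is_mutant, :5] += 5.0                   # genotype-driven genes (0..4)

pc, genes = shortlist_genes(expr, is_mutant, top_n=5)
print("separating PC index:", pc)
print("shortlisted gene indices:", sorted(genes.tolist()))
```

In this toy setting the largest component tracks developmental stage, so the shortlist comes from a later component that tracks genotype. Ranking by loading needs no gene annotation, which is what makes the approach applicable to unannotated unigenes.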