In this article we (1) outline the construction of a 3-D "graphical" representation of DNA primary sequences, illustrated on a portion of the human beta globin gene; (2) describe a particular scheme that transforms the above 3-D spatial representation of DNA into a numerical matrix representation; (3) illustrate construction of matrix invariants for DNA sequences; and (4) suggest a data reduction based on statistical analysis of matrix invariants generated for DNA. Each of the four contributions represents a novel development that we hope will facilitate comparative studies of DNA and open new directions for representation and characterization of DNA primary sequences.
Nucleotide composition and distribution along a DNA sequence are known to play a vital role in determining gene function. Protein coding regions, regulatory sequences, and other functional regions are generally identified by homology studies with comparable genes from other species or by specific experimental verification. With the rapid increase in sequence information, new computational techniques for quickly determining such information and for comparing different genes are becoming necessary; ideally, these should encompass not only DNA sequences but other macromolecular sequences as well.