2002
DOI: 10.1023/a:1014230928565

Information Capacity of Symbol Sequences

Abstract: The information capacity of sequences is considered through the calculation of the specific entropy of their frequency dictionary. The specific entropy was calculated against the reconstructed dictionary, which bears the most probable continuations of shorter strings. The measure developed makes it possible to distinguish sequences both from random ones and from those with a high level of (rather simple) order. Some applications of the developed methodology to genetics, bioinformatics, and linguistics are discussed.

Cited by 6 publications
(9 citation statements)
References 4 publications
“…(4-6) could easily be extended for the case of derivation of a dictionary $\widetilde{W}(q)$ from $W(l)$, where $2 \le l < q - 1$. Equation 6 looks like $S_q = (q - l + 1)S_l - S_q - (q - l)S_{q-l}$ and $S_q = qS_1 - S_q$ for this case; see details in Sadovsky (2002a, b, 2003, 2006). The Eqs.…”
Section: Information Capacity Of a Dictionary (mentioning)
confidence: 98%
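
The relation $qS_1 - S_q$ quoted above can be checked numerically. The sketch below is only an illustration under stated assumptions, not the cited authors' code: it assumes that the left-hand quantity is the relative entropy of the real dictionary of words of length q against a dictionary reconstructed from single-symbol frequencies, that this reconstructed dictionary is the product of the symbol frequencies, and that the sequence is closed into a ring. The function names and the test sequence are hypothetical.

```python
import math
from collections import Counter

def ring_frequencies(seq, q):
    """Word frequencies for words of length q, with seq closed into a ring."""
    n = len(seq)
    ring = seq + seq[:q - 1]  # wrap-around so each position starts a word
    counts = Counter(ring[i:i + q] for i in range(n))
    return {word: c / n for word, c in counts.items()}

def entropy(freqs):
    """Shannon entropy (natural log) of a frequency dictionary."""
    return -sum(f * math.log(f) for f in freqs.values())

seq = "ACGTTGCAACGGTA"  # hypothetical example sequence
q = 3
f1 = ring_frequencies(seq, 1)
fq = ring_frequencies(seq, q)

# Assumption: the dictionary reconstructed from W(1) assigns each word
# the product of the frequencies of its symbols.
relative = sum(
    f * math.log(f / math.prod(f1[c] for c in word))
    for word, f in fq.items()
)

# Closed-form shortcut for the same quantity.
shortcut = q * entropy(f1) - entropy(fq)
```

Under these assumptions the two values agree up to rounding, because on a ring every position of a word has the same single-symbol frequency distribution.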
“…Information capacity measured through the calculation of mutual entropy reveals numerous biologically valuable effects in nucleotide sequences (see Sadovsky 2002a, b, 2003, 2006). Meanwhile, a study of the structure of frequency dictionary of a genetic entity reveals more peculiarities and biological issues standing behind; here, we investigate a problem of codon usage bias determination (see Codon usage bias), that is a classical problem of molecular biology and genetics.…”
Section: Some More Applications (mentioning)
confidence: 99%
“…Here we follow the second approach that is completely similar to that one used to study the statistical properties of nucleotide sequences [5][6][7][8][9][10][11].…”
Section: Study Of the Real Distribution Of Start Points Of Reads Along A Genome (mentioning)
confidence: 99%
“…where $n_\omega$ is the number of copies of a word $\omega$, and $N$ is the length of a sequence; to make the definition (2) feasible, one must connect a sequence into a ring, see details in [5][6][7][8][9][10][11]. Such closure results in appearance of q − 1 phantom words in a dictionary, while we neglect them.…”
Section: Simulation Of Start Points Of Reads: Theoretical Background (mentioning)
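
The ring closure mentioned in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: it assumes that definition (2), which is not reproduced here, is the frequency $n_\omega / N$, and the function name `frequency_dictionary` and the example sequence are hypothetical. The q − 1 wrap-around words are counted like any others in this sketch.

```python
from collections import Counter

def frequency_dictionary(seq, q):
    """Frequencies n_w / N of all words of length q in seq closed into a ring."""
    n = len(seq)
    # Append the first q - 1 symbols so that every one of the N positions
    # starts a full word of length q (assumes q <= len(seq)).
    ring = seq + seq[:q - 1]
    counts = Counter(ring[i:i + q] for i in range(n))
    return {word: c / n for word, c in counts.items()}

freqs = frequency_dictionary("ACGTAC", 2)
```

With the ring closure the number of counted words equals N for every q, so the frequencies always sum to one.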
confidence: 99%