Availability of short amino acid sequences in proteins

Otaki, Joji M.; Ienaka, Shunsuke; Gotoh, Tetsuo; Yamamoto, Hirokazu

doi:10.1110/ps.041092605

Cited by 32 publications

(44 citation statements)

References 36 publications


“…1 All penta-peptides that were reported as missing in the work of Otaki and colleagues do appear in our database, which is more up-to-date (and more than four times larger than the data set Otaki and colleagues used) (Otaki et al 2005). 2 Hampikian and Andersen mainly describe a method for fast calculation of k-mers that do not appear in a set of proteins.…”

Section: Preliminaries and Definitions

mentioning

confidence: 99%

“…Many bioinformatic investigations have explored the sequences of amino acids in proteins (see, for example, Gavel and Heijne 1992; Echols et al 2002; Qi et al 2004; White and Heijne 2004; Otaki et al 2005; White and Heijne 2005; Ulitsky et al 2006; Hampikian and Andersen 2007), or have attempted to model proteins by various probability models (Krogh et al 1994; Abe and Mamitsuka 1997; Durbin et al 1998; Bystroff et al 2000; Eddy 2004). Otaki and colleagues 1 (Otaki et al 2005) have examined the space of "missing" AA sequences and have discovered the missing penta-peptides.…”

mentioning

confidence: 99%

“…Otaki and colleagues 1 (Otaki et al 2005) have examined the space of "missing" AA sequences and have discovered the missing penta-peptides. Their analysis, however, did not take into account the non-coding parts of the genome.…”

mentioning

confidence: 99%


Forbidden penta‐peptides

2007


There are 3,200,000 amino acid sequences of length 5 (penta-peptides). Statistically, we expect to see a distribution of penta-peptides that is determined by the frequency of the participating amino acids. We show, however, that not only are there thousands of such penta-peptides that are absent from all known proteomes, but many of them are coded for multiple times in the non-coding genomic regions. This suggests a strong selection process that prevents these peptides from being expressed. We also show that the characteristics of these forbidden penta-peptides vary among different phylogenetic groups (e.g., eukaryotes, prokaryotes, and archaea). Our analysis provides the first steps toward understanding the "grammar" of the forbidden penta-peptides.
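The figure of 3,200,000 is 20^5, the number of length-5 strings over the 20 standard amino acids. As a minimal sketch of how absent k-mers could be enumerated, the Python below counts overlapping k-mers in a set of protein strings and subtracts them from the full k-mer space. The function names and the toy input are hypothetical illustrations, not the method or data used in the cited papers.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Size of the penta-peptide space quoted in the abstract: 20**5.
TOTAL_PENTAPEPTIDES = len(AMINO_ACIDS) ** 5


def count_kmers(sequences, k):
    """Count every overlapping k-mer across an iterable of protein strings."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def missing_kmers(sequences, k):
    """Return the k-mers over the 20-letter alphabet that never occur."""
    observed = count_kmers(sequences, k)
    all_kmers = {"".join(p) for p in product(AMINO_ACIDS, repeat=k)}
    return all_kmers.difference(observed)


# Toy usage with k=2 (hypothetical input, not real proteome data).
toy_missing = missing_kmers(["ACDA"], 2)
```

For k=5 the full space holds 3.2 million strings, so building it in memory is feasible but not free; a real analysis over whole proteomes would more likely stream sequences and use a compact index.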


“…The most basic empirical question that has been investigated is that of missing DNA k-mers. Earlier works have studied non-existent short amino acid k-mers [3,4], and have attributed them mainly to chemical constraints (such as hydrophobic and hydrophilic amino acids). DNA does not have the complex three-dimensional structure and chemical constraints of proteins, although the nucleotide composition has been reported by El Antri et al. [5] to weakly affect the structure of double-stranded DNA.…”

Section: Introduction

mentioning

confidence: 99%

Genomic DNA k-mer Spectra: Models and Modalities

Chor, Horn, Goldman, et al. 2010

Lecture Notes in Computer Science


“…Short-sequence use has been analyzed previously (17)(18)(19)(20), and different reasons for a lack of some sequences have been suggested. Our bioinformatics results identify triplet and quadruplet sequences that slow translation and lead to stalling almost immediately upon entry into the exit tunnel.…”

Section: Discussion

mentioning

confidence: 99%

Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences

Navon, Kornberg, Chen, et al. 2016

Proc. Natl. Acad. Sci. U.S.A.


Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel.
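The comparison of observed triplet counts against "expected frequencies" can be illustrated with a simple independence model, in which the expected count of a triplet is the number of triplet windows multiplied by the product of its residues' overall frequencies. The sketch below is an assumption-laden illustration only: the abstract does not state which null model or threshold the authors used, and `underrepresented_triplets` and `ratio_cutoff` are hypothetical names.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def underrepresented_triplets(sequences, ratio_cutoff=0.1):
    """Flag triplets whose observed/expected ratio falls below ratio_cutoff.

    Expected counts assume residues are independent (an assumption made
    here for illustration; the cited study may use a different model).
    """
    residue_counts = Counter()
    triplet_counts = Counter()
    for seq in sequences:
        residue_counts.update(seq)
        for i in range(len(seq) - 2):
            triplet_counts[seq[i:i + 3]] += 1

    n_residues = sum(residue_counts.values())
    n_windows = sum(triplet_counts.values())
    flagged = {}
    if n_windows == 0:
        return flagged

    for residues in product(AMINO_ACIDS, repeat=3):
        triplet = "".join(residues)
        expected = float(n_windows)
        for aa in triplet:
            expected *= residue_counts[aa] / n_residues
        if expected == 0:
            continue  # a residue never occurs, so no expectation to test
        ratio = triplet_counts[triplet] / expected
        if ratio < ratio_cutoff:
            flagged[triplet] = ratio
    return flagged


# Toy usage (hypothetical input, not real proteome data).
toy_flagged = underrepresented_triplets(["AAAA", "CCCC"])
```

In the toy input only A and C occur, so the mixed triplets such as AAC are expected but never observed and are flagged, while AAA and CCC are overrepresented. A real analysis would need a significance test that accounts for proteome size rather than a fixed ratio.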
