Three strongly sequential, lossless compression schemes, one with linearly growing per-letter computational complexity and two with fixed per-letter complexity, are presented and analyzed for memoryless sources with abruptly changing statistics. The first method, which improves on Willems' weighting approach, asymptotically achieves a lower bound on the redundancy, and hence is optimal. The second scheme achieves redundancy of O(log N / N) when the transitions in the statistics are large, and O(log log N / log N) otherwise. The third approach always achieves redundancy of O(log N / N). The two fixed-complexity approaches can easily be combined to achieve the better of the two redundancies. Simulation results support the analytical bounds derived for all the coding schemes.
Universal compression of patterns of sequences generated by independently identically distributed (i.i.d.) sources with unknown, possibly large, alphabets is investigated. A pattern is a sequence of indices that contains all consecutive integer indices in increasing order of first occurrence.
Bounds on the entropy of patterns of sequences generated by independently identically distributed (i.i.d.) sources are derived. A pattern is a sequence of indices that contains all consecutive integer indices in increasing order of first occurrence. If the alphabet of a source that generated a sequence is unknown, the inevitable cost of coding the unknown alphabet symbols can be exploited to create the pattern of the sequence. This pattern can in turn be compressed by itself. The bounds derived here are functions of the i.i.d. source entropy, alphabet size, and letter probabilities. It is shown that for large alphabets, the pattern entropy must decrease from the i.i.d. one. The decrease is in many cases more significant than the universal coding redundancy bounds derived in prior works. The pattern entropy is confined between two bounds that depend on the arrangement of the letter probabilities in the probability space. For very large alphabets whose size may be greater than the coded pattern length, all low-probability letters are packed into one symbol. The pattern entropy is upper and lower bounded in terms of the i.i.d. entropy of the new packed alphabet. Correction terms, which are usually negligible, are provided for both upper and lower bounds.
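To make the definition above concrete, the following sketch (the function name `pattern` is illustrative, not from the papers) computes the pattern of a sequence by replacing each symbol with the 1-based index of its order of first occurrence:

```python
def pattern(seq):
    """Return the pattern of seq: each symbol is replaced by the
    1-based index of its order of first occurrence."""
    first_index = {}  # symbol -> index of first occurrence
    out = []
    for s in seq:
        if s not in first_index:
            first_index[s] = len(first_index) + 1
        out.append(first_index[s])
    return out

print(pattern("abracadabra"))  # → [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
```

Note that the pattern is independent of the actual alphabet: "abracadabra" and any relabeling of its symbols yield the same index sequence, which is why the pattern can be coded without knowing the alphabet.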