A simple parallel algorithm for decoding a Huffman encoded file is presented, exploiting the tendency of Huffman codes to resynchronize quickly, i.e., to recover from possible decoding errors in most cases. The average number of bits that have to be processed until synchronization is analyzed and shown to be in good agreement with empirical data. As Huffman coding is also part of the JPEG image compression standard, the suggested algorithm is then adapted to the parallel decoding of JPEG files.
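To make the resynchronization property concrete, the following is a small illustrative sketch, not the algorithm of the paper: a toy Huffman code is built, decoding is started a few bits inside the stream, and the point at which the misaligned decoder falls back in step with the correctly aligned one is reported. The sample text and code table are arbitrary assumptions made for illustration.

```python
import heapq
from collections import Counter

def build_huffman_code(text):
    """Build a Huffman code as a dict symbol -> bit-string."""
    heap = [[w, i, {s: ""}] for i, (s, w) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    nxt = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, [w1 + w2, nxt, merged])
        nxt += 1
    return heap[0][2]

def codeword_ends(bits, inv_code, start):
    """Decode greedily from bit offset `start` and return the set of
    positions at which a codeword was recognized to end."""
    ends, buf, i = set(), "", start
    while i < len(bits):
        buf += bits[i]
        i += 1
        if buf in inv_code:          # a complete codeword was recognized
            ends.add(i)
            buf = ""
    return ends

text = "abracadabra alakazam " * 50
code = build_huffman_code(text)
inv = {v: k for k, v in code.items()}
bits = "".join(code[s] for s in text)

true_ends = codeword_ends(bits, inv, 0)
for offset in (1, 2, 3):             # start decoding in the middle of a codeword
    ends = codeword_ends(bits, inv, offset)
    sync = next((p for p in sorted(ends) if p in true_ends), None)
    msg = f"resynchronized at bit {sync}" if sync is not None else "did not resynchronize"
    print(f"offset {offset}: {msg}")
```

Once a misaligned end position coincides with a true codeword boundary, all later boundaries coincide as well, which is why the first shared position marks the point of synchronization.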
We describe some of the design choices made during the development of a fast, scalable, inline deduplication device. The system's design goals and how they were achieved are presented. This is the first deduplication device that uses similarity matching. The paper provides the following original research contributions: we show how similarity signatures can serve in a deduplication scheme; a novel type of similarity signature is presented and its advantages in the context of deduplication requirements are explained. It is also shown how to combine similarity matching schemes with byte-by-byte comparison or hash-based identity schemes.
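As a rough illustration of the idea, not of the actual device: the sketch below keeps, for each stored segment, a similarity signature consisting of a few maximal hash values over fixed windows; an incoming segment whose signature shares a value with a stored one is treated as similar, and a hash-based identity check (SHA-256 per block) then determines which blocks are actual duplicates. All names, window sizes, and block sizes here are illustrative assumptions.

```python
import hashlib

BLOCK = 4096      # identity-check granularity (illustrative)
WINDOW = 64       # window size for the similarity hashes (illustrative)
K = 4             # number of hash maxima kept as the signature (illustrative)

def similarity_signature(data: bytes) -> list:
    """The K largest hash values over fixed windows of the segment."""
    values = {hash(data[i:i + WINDOW])
              for i in range(0, max(len(data) - WINDOW, 0) + 1, WINDOW)}
    return sorted(values, reverse=True)[:K]

def block_digests(data: bytes) -> list:
    """Per-block cryptographic digests, used for the identity check."""
    return [hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]

signature_index = {}   # one signature value -> id of a stored segment
segments = {}          # id -> per-block digests of the stored segment

def deduplicate(seg_id: str, segment: bytes) -> str:
    sig = similarity_signature(segment)
    match = next((signature_index[v] for v in sig if v in signature_index), None)
    digests = block_digests(segment)
    if match is not None:
        # similarity found a candidate; confirm identity block by block
        shared = sum(a == b for a, b in zip(segments[match], digests))
        result = f"{seg_id}: similar to {match}, {shared}/{len(digests)} blocks deduplicated"
    else:
        result = f"{seg_id}: no similar segment found, stored in full"
    segments[seg_id] = digests
    for v in sig:
        signature_index.setdefault(v, seg_id)
    return result

print(deduplicate("A", b"x" * 20000 + b"y" * 4096))
print(deduplicate("B", b"x" * 20000 + b"z" * 4096))   # nearly identical to A
```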
Recent publications advocate the use, in compression applications with large alphabets, of various variable-length codes in which each codeword consists of an integral number of bytes. This paper shows that another tradeoff with similar properties can be obtained by Fibonacci codes. These are fixed codeword sets, using binary representations of integers based on Fibonacci numbers of order m ≥ 2. Fibonacci codes have been used before, and this paper extends previous work, presenting several novel features. In particular, the compression efficiency is analyzed and compared to that of dense codes, and various table-driven decoding routines are suggested.
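For concreteness, here is a minimal sketch of the order-2 (Zeckendorf) Fibonacci code, with the straightforward bit-serial decoder rather than any of the table-driven routines discussed in the paper: every codeword ends in the pattern 11, which cannot occur anywhere else in a codeword, so the code is self-delimiting.

```python
def fib_encode(n: int) -> str:
    """Order-2 Fibonacci code of a positive integer (Zeckendorf representation)."""
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    used = [False] * len(fibs)
    for i in range(len(fibs) - 1, -1, -1):   # greedy, largest Fibonacci first
        if fibs[i] <= n:
            used[i] = True
            n -= fibs[i]
    bits = "".join("1" if u else "0" for u in used).rstrip("0")
    return bits + "1"                        # the final 1 creates the '11' delimiter

def fib_decode(stream: str) -> list:
    """Decode a concatenation of codewords by scanning for the '11' delimiter."""
    fibs, values = [1, 2], []
    n, k, prev = 0, 0, "0"
    for bit in stream:
        if bit == "1" and prev == "1":       # end of the current codeword
            values.append(n)
            n, k, prev = 0, 0, "0"
            continue
        if bit == "1":
            while len(fibs) <= k:
                fibs.append(fibs[-1] + fibs[-2])
            n += fibs[k]
        k += 1
        prev = bit
    return values

print({n: fib_encode(n) for n in range(1, 8)})
# {1: '11', 2: '011', 3: '0011', 4: '1011', 5: '00011', 6: '10011', 7: '01011'}
print(fib_decode(fib_encode(7) + fib_encode(1) + fib_encode(12)))   # [7, 1, 12]
```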
We explore the possibility of using multiple processors to improve the encoding and decoding times of Lempel-Ziv schemes. A new layout of the processors, based on a full binary tree, is suggested, and it is shown how LZSS and LZW can be adapted to take advantage of such parallel architectures. The layout is then generalized to higher order trees. Experimental results show an improvement in compression over the standard method of parallelization and an improvement in time over the sequential method.
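For context, the sketch below illustrates the "standard method of parallelization" that such schemes are usually compared against: the input is cut into independent blocks, each compressed separately by a simple greedy LZSS, so matches that cross block boundaries are lost. This is not the tree-based layout of the paper; the window size, block size, and sample data are arbitrary assumptions.

```python
from concurrent.futures import ProcessPoolExecutor

WINDOW = 512        # search window (illustrative)
MIN_MATCH = 3       # shortest back-reference worth emitting (illustrative)

def lzss_compress(block: bytes):
    """Greedy LZSS: emit literal bytes or (distance, length) back-references.
    Matches are not allowed to overlap the lookahead, a simplification."""
    out, i = [], 0
    while i < len(block):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - WINDOW), i):
            length = 0
            while (i + length < len(block) and
                   j + length < i and
                   block[j + length] == block[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= MIN_MATCH:
            out.append((best_dist, best_len))   # back-reference
            i += best_len
        else:
            out.append(block[i])                # literal
            i += 1
    return out

def compress_blocks(data: bytes, block_size: int = 4096):
    """The 'standard' parallelization: independent blocks, one per worker."""
    blocks = [data[k:k + block_size] for k in range(0, len(data), block_size)]
    with ProcessPoolExecutor() as pool:
        return list(pool.map(lzss_compress, blocks))

if __name__ == "__main__":
    data = b"the quick brown fox jumps over the lazy dog " * 200
    parts = compress_blocks(data)
    tokens = sum(len(p) for p in parts)
    print(f"{len(data)} input bytes -> {tokens} tokens over {len(parts)} blocks")
```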
The emergence of the CD-ROM as a storage medium for full-text databases raises the question of the maximum size database that can be contained by this medium. As an example, the problem of storing the Trésor de la Langue Française on a CD-ROM is examined in this paper. The text alone of this database is 700 megabytes long, more than a CD-ROM can hold. In addition, the dictionary and concordance needed to access these data must be stored. A further constraint is that some of the material is copyrighted, and it is desirable that such material be difficult to decode except through software provided by the system. Pertinent approaches to compression of the various files are reviewed, and the compression of the text is related to the problem of data encryption: Specifically, it is shown that, under simple models of text generation, Huffman encoding produces a bit-string indistinguishable from a representation of coin flips.
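The encryption-related remark can be illustrated on a toy memoryless source: with a dyadic probability distribution and a matching Huffman code, the encoded bit-stream behaves like a sequence of fair coin flips, as the small sketch below suggests empirically. The four-symbol alphabet, probabilities, and code are illustrative assumptions, not taken from the paper.

```python
import random
from collections import Counter

probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}    # dyadic toy source
code  = {"a": "0", "b": "10", "c": "110", "d": "111"}    # a Huffman code for it

random.seed(1)
symbols = random.choices(list(probs), weights=list(probs.values()), k=100_000)
bits = "".join(code[s] for s in symbols)

ones = bits.count("1") / len(bits)
pairs = Counter(bits[i:i + 2] for i in range(0, len(bits) - 1, 2))
total = sum(pairs.values())
print(f"fraction of 1-bits: {ones:.4f}")                             # close to 0.5
print({p: round(c / total, 4) for p, c in sorted(pairs.items())})    # each close to 0.25
```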