2012 Data Compression Conference
DOI: 10.1109/dcc.2012.40
Mixing Strategies in Data Compression

Abstract: We propose geometric weighting as a novel method to combine multiple models in data compression. Our results reveal the rationale behind PAQ weighting and generalize it to non-binary alphabets. Based on a similar technique, we present a new, generic linear mixture method. All of the novel mixture techniques rely on given weight vectors. We consider the problem of finding optimal weights and show that weight optimization leads to a strictly convex (and thus well-behaved) optimization problem. Finally, an exper…

Cited by 11 publications (15 citation statements)
References 7 publications
“…Geometric mixing is an adaptive online ensemble that was analyzed in depth and whose properties are described in [24][25][26]. The main difference to linear mixing, which implies weighting the probabilities directly, is that the probabilities are first transformed into the logistic domain using the logit function (sometimes referred to as stretch in the paper).…”
Section: Context Mixing (citation type: mentioning; confidence: 99%)
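The citation statement above describes the core of geometric mixing: each model's probability is first "stretched" into the logistic domain with the logit function, the stretched values are combined by a weighted sum, and the result is "squashed" back into a probability with the sigmoid. A minimal Python sketch of this transform-mix-squash scheme for the binary case (illustrative only, not the paper's exact formulation):

```python
import math

def stretch(p):
    # logit ("stretch"): map a probability in (0, 1) to the real line
    return math.log(p / (1.0 - p))

def squash(x):
    # logistic sigmoid, the inverse of stretch
    return 1.0 / (1.0 + math.exp(-x))

def geometric_mix(probs, weights):
    # Geometric mixing: weighted sum of stretched probabilities,
    # squashed back into a probability.
    return squash(sum(w * stretch(p) for w, p in zip(weights, probs)))
```

With equal weights, models that disagree symmetrically cancel in the logistic domain: `geometric_mix([0.8, 0.2], [0.5, 0.5])` yields 0.5.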
“…In statistical data compression, one modeling approach used by many high performance programs is to use an ensemble method to combine the predictions of multiple statistical models (Mattern, 2012). Each model is typically tailored towards a particular kind of structure that occurs in popular file types.…”
Section: Introduction (citation type: mentioning; confidence: 99%)
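The simplest such ensemble method is linear mixing, which the abstract contrasts with geometric weighting: the per-model distributions are combined by a convex combination in probability space. A small sketch for a general (non-binary) alphabet, assuming nonnegative weights that sum to one:

```python
def linear_mix(dists, weights):
    # Linear mixing: convex combination of per-model distributions.
    # dists[i] is model i's probability distribution over the alphabet;
    # weights is a convex weight vector (nonnegative, sums to 1).
    k = len(dists[0])
    return [sum(w * d[j] for w, d in zip(weights, dists))
            for j in range(k)]
```

For example, `linear_mix([[0.9, 0.1], [0.5, 0.5]], [0.5, 0.5])` gives `[0.7, 0.3]`; the mixture is again a valid distribution whenever the inputs are.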
“…Moreover, we add a theoretic justification and code length guarantees to PAQ's ad-hoc neural network mixing, since we show that it is a special form of the Geometric Mixture Distribution coupled with Online Gradient Descent. The results in Chapter 3 are a polished version of results published earlier in [58,59].…”
Section: Chapter 3: Elementary Modeling (citation type: mentioning; confidence: 99%)
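The statement above identifies PAQ's neural-network mixer as a geometric mixture whose weights are trained with Online Gradient Descent on the code-length loss. A hedged sketch of one such update for the binary case (the learning rate `lr` and the exact gradient form are illustrative assumptions, not taken from the paper):

```python
import math

def stretch(p):
    return math.log(p / (1.0 - p))

def squash(x):
    return 1.0 / (1.0 + math.exp(-x))

def ogd_step(weights, probs, bit, lr=0.02):
    # One Online Gradient Descent step on the code-length loss
    # -ln p(bit) of the binary geometric mixture.
    xs = [stretch(p) for p in probs]
    p = squash(sum(w * x for w, x in zip(weights, xs)))
    grad = p - bit  # derivative of -ln p(bit) w.r.t. the mixer input
    new_weights = [w - lr * grad * x for w, x in zip(weights, xs)]
    return new_weights, p
```

Starting from zero weights, a model that correctly predicted `bit = 1` with high probability receives a positive weight after the step, so its influence on the mixture grows.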
“…Most of the results from this chapter have previously been published in [58,59]. In this chapter we present a greatly polished version thereof.…”
Section: Our Contribution (citation type: mentioning; confidence: 99%)