A fundamental question in designing lossy data compression schemes is how well one can do in comparison with the rate-distortion function, which describes the known theoretical limits of lossy compression. Motivated by the empirical success of deep neural network (DNN) compressors on large, real-world data, we investigate methods to estimate the rate-distortion function on such data, which would allow DNN compressors to be compared against the optimal performance. While one could use the empirical distribution of the data and apply the Blahut-Arimoto algorithm, this approach presents several computational challenges and inaccuracies when the datasets are large and high-dimensional, as is the case for modern image datasets. Instead, we reformulate the rate-distortion objective and solve the resulting functional optimization problem using neural networks. We apply the resulting rate-distortion estimator, called NERD, to popular image datasets, and provide evidence that NERD can accurately estimate the rate-distortion function. Using our estimate, we show that the rate-distortion achievable by DNN compressors is within several bits of the rate-distortion function for real-world datasets. Additionally, NERD provides access to the rate-distortion-achieving channel, as well as samples from its output marginal. Therefore, using recent results in reverse channel coding, we describe how NERD can be used to construct an operational one-shot lossy compression scheme with guarantees on the achievable rate and distortion. Experimental results demonstrate competitive performance with DNN compressors.
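For concreteness, the rate-distortion function referenced above is the standard Shannon quantity; a minimal statement, for a source X ~ P_X, reconstruction alphabet, and distortion measure d, is

R(D) = \min_{P_{\hat{X} \mid X} \,:\, \mathbb{E}[d(X, \hat{X})] \le D} \; I(X; \hat{X}).

The Blahut-Arimoto algorithm solves this minimization for finite alphabets, which is why applying it to the empirical distribution of high-dimensional images quickly becomes impractical: the matrices involved grow with the alphabet size. The following is a minimal sketch of the standard finite-alphabet algorithm, included only as an illustration of that baseline (it is not the paper's NERD estimator; the function name and parameters are illustrative).

    import numpy as np

    def blahut_arimoto(px, dist, beta, n_iter=200):
        """Blahut-Arimoto for a finite-alphabet source.
        px:   (n,) source distribution
        dist: (n, m) distortion matrix d(x, x_hat)
        beta: Lagrange multiplier selecting a point on the R(D) curve
        Returns (rate in bits, expected distortion)."""
        n, m = dist.shape
        qy = np.full(m, 1.0 / m)                    # output marginal q(x_hat)
        for _ in range(n_iter):
            # conditional p(x_hat | x) proportional to q(x_hat) * exp(-beta * d)
            w = qy[None, :] * np.exp(-beta * dist)
            w /= w.sum(axis=1, keepdims=True)
            qy = px @ w                             # update output marginal
        distortion = np.sum(px[:, None] * w * dist)
        rate = np.sum(px[:, None] * w * np.log2(w / qy[None, :] + 1e-30))
        return rate, distortion

    # Example: binary uniform source with Hamming distortion
    px = np.array([0.5, 0.5])
    d = 1.0 - np.eye(2)
    print(blahut_arimoto(px, d, beta=5.0))

Each iteration manipulates arrays whose size is the product of the source and reconstruction alphabet sizes, which is the scaling obstacle the abstract alludes to for image data.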
Index Terms: Generative models, lossy compression, neural networks, rate-distortion theory, reverse channel coding
I. INTRODUCTION

Driven by advances in deep neural network (DNN) compression schemes, rapid progress has been made in finding high-performing lossy compression schemes for large, high-dimensional datasets that remain practical [1]-[4]. While these methods have empirically been shown to outperform classical compression schemes for real-world data (e.g., images), it remains unknown how well they perform in comparison to the fundamental limit, which is given by the rate-distortion function. To investigate this question, one approach is to examine a stylized data source with a known probability distribution that is analytically tractable, such as the sawbridge random process, as done in [5]. This allows for a closed-form solution of the rate-distortion function; one can then compare it with the rate and distortion empirically achieved by DNN compressors trained on realizations of the source. However, this approach does not evaluate DNN compressors on true sources of interest, such as real-world images, for which architectural choices such as convolutional layers have been engineered [6]. Thus, evaluating the rate-distortion function on these sources is paramount to understanding the efficacy of DNN compressors on real-world data.

Furthermore, a class of information-theoretically designed one-shot lossy source codes with near-optimal rate-distortion ...