“…[28] provides tighter bounds by considering the individual sample mutual information, [25,29] propose using chaining mutual information, and [30,31,32] advocate conditioning and processing techniques. Information-theoretic generalization error bounds based on other information quantities have also been studied, such as f-divergence [33], α-Rényi divergence and maximal leakage [34,35], and Jensen-Shannon divergence [36,37]. Using rate-distortion theory, [38,39,40] provide information-theoretic upper bounds on the generalization error under model misspecification and model compression.…”