“…Several prior works [50,56,59,44] maximize MI objectives that closely resemble the forward information objective we introduce in Section 4, while others optimize related objectives by learning latent forward dynamics models [69,33,73,26,39]. Multi-step inverse models, closely related to the inverse information objective (Section 4), have been used to learn control-centric representations [70,23]. Single-step inverse models have been deployed as regularizers for forward models [72,2] and as an auxiliary loss for policy-gradient RL [57,52].…”