Modern media data such as 360° videos and light field (LF) images are typically captured in much higher dimensions than the observers' visual displays. To efficiently browse high-dimensional media over bandwidth-constrained networks, a navigational streaming model is considered: a client navigates the large media space by dictating a navigation path to a server, which in response transmits the corresponding pre-encoded media data units (MDUs) to the client one-by-one in sequence. Intra-coding an MDU (I-MDU) results in a large bitrate but allows the MDU to be randomly accessed, while inter-coding an MDU (P-MDU) using another MDU as a predictor incurs a small coding cost but imposes an order in which the predictor must first be transmitted and decoded. From a compression perspective, the technical challenge is: how to achieve coding gain via inter-coding of MDUs, while enabling adequate random access for satisfactory user navigation. To address this problem, we propose a landmark-based MDU optimization framework with redundant representation: each MDU can be coded both as an I-MDU and as one or more P-MDUs. The media space is divided into neighborhoods, each containing one landmark (a chosen MDU). MDUs in a neighborhood use the associated landmark as a predictor for inter-coding. Thus, a transition from one MDU to another in the same neighborhood requires only one P-MDU transmission when the landmark is already in the decoder buffer, enabling navigational random access. To optimize an MDU structure, we employ a tree-structured vector quantizer (TSVQ) to first optimize landmark locations, then iteratively add/remove P-MDUs as refinements using a fast branch-and-bound technique.
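To make the landmark-placement step concrete, the following is a minimal, hypothetical sketch of TSVQ-style landmark selection: the media space is recursively split by two-means clustering until each neighborhood is small enough, and the MDU closest to each leaf's centroid is chosen as that neighborhood's landmark. The function names (`two_means`, `tsvq_landmarks`), the leaf-size criterion, and the 2-D point model of MDUs are illustrative assumptions, not the paper's actual formulation.

```python
import random

def two_means(points, iters=20, seed=0):
    """Lloyd's algorithm with k=2; splits a set of MDU coordinates in two."""
    rng = random.Random(seed)
    c = rng.sample(points, 2)  # initial centroids
    groups = (points, [])
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            d0 = sum((a - b) ** 2 for a, b in zip(p, c[0]))
            d1 = sum((a - b) ** 2 for a, b in zip(p, c[1]))
            groups[0 if d0 <= d1 else 1].append(p)
        if not groups[0] or not groups[1]:
            # degenerate split (e.g. identical points): fall back to halving
            half = len(points) // 2
            return points[:half], points[half:]
        c = [tuple(sum(x) / len(g) for x in zip(*g)) for g in groups]
    return groups

def tsvq_landmarks(mdus, max_leaf=4):
    """Recursively split the MDU space; each leaf becomes a neighborhood
    whose landmark is the MDU nearest the leaf centroid (the medoid)."""
    if len(mdus) <= max_leaf:
        cen = tuple(sum(x) / len(mdus) for x in zip(*mdus))
        lm = min(mdus, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, cen)))
        return [(lm, list(mdus))]
    left, right = two_means(list(mdus))
    return tsvq_landmarks(left, max_leaf) + tsvq_landmarks(right, max_leaf)

# Usage: a 4x4 grid of MDU positions partitioned into small neighborhoods.
pts = [(x, y) for x in range(4) for y in range(4)]
neighborhoods = tsvq_landmarks(pts, max_leaf=4)
```

Each `(landmark, members)` pair then defines which MDUs are inter-coded against which landmark; refining this structure by adding or removing P-MDUs is a separate step in the paper's framework and is not sketched here.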
Taking interactive LF images and viewport-adaptive 360° images as illustrative applications, and using I-, P-, and previously proposed merge frames to intra- and inter-code MDUs, we show experimentally that landmarked MDU structures can noticeably reduce the expected transmission cost compared with MDU structures without landmarks.