We investigate the implications of the conventional "t+2-D" motion-compensated (MC) three-dimensional (3-D) discrete wavelet/subband transform structure for spatial scalability and propose a novel, flexible structure for fully scalable video compression. In this structure, any number of levels of "pretemporal" spatial wavelet decomposition are performed on the original full-resolution frames, followed by MC temporal decomposition of the subbands within each spatial resolution level. Further levels of "posttemporal" spatial decomposition may then be performed on the resulting spatiotemporal subbands to provide additional levels of spatial scalability and energy compaction. This structure allows us to trade energy compaction against the potential for artifacts at reduced spatial resolutions. More importantly, it permits extensive study of the interaction between spatial aliasing, scalability, and energy compaction. We show that where the motion model fails, the "t+2-D" structure inevitably produces misaligned spatial aliasing artifacts in reduced-resolution sequences. These artifacts can be removed by using pretemporal spatial decomposition. On the other hand, we also show that the "t+2-D" structure necessarily maximizes compression efficiency. We propose several schemes to minimize the loss of compression efficiency associated with pretemporal spatial decomposition.
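
The ordering of the three stages described above can be made concrete with a minimal sketch. It is an illustration under strong simplifying assumptions, not the transform studied in this work: it uses unnormalized Haar filters, a single temporal level, and a zero-motion temporal step in place of true motion compensation. All names (`haar_1d`, `spatial_level`, `temporal_level`, `decompose`) and the subband labels are hypothetical. Setting `pre_levels=0` gives the conventional "t+2-D" ordering.

```python
from typing import Dict, List, Tuple

Frame = List[List[float]]  # one frame stored as rows of samples


def haar_1d(x: List[float]) -> Tuple[List[float], List[float]]:
    """One level of unnormalized Haar analysis on an even-length signal."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high


def transpose(m: Frame) -> Frame:
    return [list(r) for r in zip(*m)]


def spatial_level(frame: Frame) -> Dict[str, Frame]:
    """One level of separable 2-D decomposition (first letter: horizontal band)."""
    rows = [haar_1d(r) for r in frame]
    h_low = [lo for lo, _ in rows]
    h_high = [hi for _, hi in rows]

    def vertical(band: Frame) -> Tuple[Frame, Frame]:
        cols = [haar_1d(c) for c in transpose(band)]
        return (transpose([lo for lo, _ in cols]),
                transpose([hi for _, hi in cols]))

    ll, lh = vertical(h_low)
    hl, hh = vertical(h_high)
    return {"LL": ll, "HL": hl, "LH": lh, "HH": hh}


def temporal_level(frames: List[Frame]) -> Tuple[List[Frame], List[Frame]]:
    """Temporal Haar over frame pairs. ASSUMPTION: zero motion (no MC)."""
    pairs = list(zip(frames[0::2], frames[1::2]))  # needs an even frame count
    low = [[[(p + q) / 2 for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]
           for a, b in pairs]
    high = [[[(p - q) / 2 for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]
            for a, b in pairs]
    return low, high


def decompose(frames: List[Frame], pre_levels: int,
              post_levels: int) -> Dict[str, List[Frame]]:
    """Pretemporal spatial levels, then temporal, then posttemporal spatial levels."""
    subbands: Dict[str, List[Frame]] = {}
    base = frames
    # Pretemporal stage: split spatially, then filter each detail subband in time.
    for k in range(1, pre_levels + 1):
        split = [spatial_level(f) for f in base]
        for name in ("HL", "LH", "HH"):
            t_low, t_high = temporal_level([s[name] for s in split])
            subbands[f"pre{k}.{name}.tL"] = t_low
            subbands[f"pre{k}.{name}.tH"] = t_high
        base = [s["LL"] for s in split]
    # Temporal decomposition of the remaining lowest-resolution sequence.
    t_low, t_high = temporal_level(base)
    # Posttemporal stage: further spatial splitting of the temporal subbands.
    for tag, seq in (("tL", t_low), ("tH", t_high)):
        for k in range(1, post_levels + 1):
            split = [spatial_level(f) for f in seq]
            for name in ("HL", "LH", "HH"):
                subbands[f"{tag}.post{k}.{name}"] = [s[name] for s in split]
            seq = [s["LL"] for s in split]
        subbands[f"{tag}.LL"] = seq
    return subbands
```

For example, `decompose(frames, 0, 2)` and `decompose(frames, 1, 1)` both yield two levels of spatial scalability for the same input, but differ in whether the first spatial split happens before or after the temporal step. With real motion compensation in place of the zero-motion step, that choice is where the trade-off between aliasing artifacts and energy compaction arises.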