The coastal environment is characterized by strong, multi-scale dynamics, and observations from a single remote sensing sensor still struggle to achieve high temporal and high spatial resolution at the same time. This study proposed a spatiotemporal fusion model for coastal environments that makes fuller use of the available remote sensing data and overcomes the insensitivity of traditional spatiotemporal fusion models to small-scale disturbances. The Enhanced Deep Super-Resolution Network (EDSR) was used to reconstruct spatial features from the lower spatial resolution GOCI-II data. These reconstructed spatial features, rather than the original GOCI-II data, were fed into the spatiotemporal fusion model, which enabled the fused data to capture hour-by-hour changes in water color and morphology at 30 m resolution, including changes in the spatial and temporal distribution of suspended particulate matter (SPM), the vortex street caused by bridge piers, the inundation process of tidal flats, and coastline changes. In addition, this study analyzed the factors affecting fusion accuracy, including the spectral difference, the temporal difference error, the location distance error, and the structure of the EDSR model. The results demonstrate that the location distance error and the spectral difference have the most significant impact on the fused data and may introduce ambiguous or erroneous spatial features.