Deep learning-based methods have achieved significant breakthroughs in real-time video super-resolution in recent years. However, when handling complex environments and large-motion scenes, these methods are prone to blurring, unnatural textures, and other distortions in the reconstructed video, which severely degrades reconstruction quality. Moreover, real-time video super-resolution networks trained with generative adversarial networks rely only on simple feature modeling: they can attain excellent subjective perceptual quality, but their objective metrics are lower and their outputs contain artifacts. To address these problems, we propose a dual-contrast adaptive network for real-time video super-resolution, called DCANet, which fully captures the motion offsets between neighboring frames through an adaptive optical flow network to enable accurate alignment. A dual-channel feature extraction module is proposed to acquire contextual features from neighboring frames and achieve deep feature modeling. To further enhance reconstruction quality, the training strategy combines contrastive learning with the adversarial mechanism, and a dual-contrast loss function is proposed to guide network training. Extensive experiments on multiple benchmark test sets of complex video scenes show that our method achieves an inference latency of 14.91 ms while generating reconstructed videos with high fidelity and perceptual quality, making it suitable for real-time inference in practical deployments. The code is available at https://github.com/Swaggyp1sz/DCANet.
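For intuition only, below is a minimal PyTorch sketch of what a dual-contrast objective of this kind could look like: a contrastive term that pulls the super-resolved frame toward the high-resolution target and pushes it away from a degraded negative, combined with an adversarial term. The abstract does not define the actual loss, so the class name `DualContrastLoss`, the bicubic negative sample, and the weights `alpha`/`beta` are all illustrative assumptions, not the paper's formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualContrastLoss(nn.Module):
    """Illustrative dual-contrast training objective (NOT the paper's exact
    loss): a contrastive ratio that draws the SR frame toward the HR ground
    truth and away from a degraded negative (e.g., a bicubic upsample),
    plus a non-saturating adversarial term from a discriminator.
    The weights `alpha` and `beta` are hypothetical."""

    def __init__(self, alpha: float = 1.0, beta: float = 5e-3):
        super().__init__()
        self.alpha = alpha
        self.beta = beta

    def contrastive_term(self, sr, hr, neg):
        # Distance to the positive (HR) divided by distance to the negative:
        # minimizing this ratio pulls SR toward HR and pushes it from neg.
        d_pos = F.l1_loss(sr, hr)
        d_neg = F.l1_loss(sr, neg)
        return d_pos / (d_neg + 1e-7)

    def forward(self, sr, hr, neg, fake_logits):
        # Non-saturating generator loss: the discriminator should
        # classify the super-resolved frame as real.
        adv = F.binary_cross_entropy_with_logits(
            fake_logits, torch.ones_like(fake_logits))
        return self.alpha * self.contrastive_term(sr, hr, neg) + self.beta * adv


# Usage sketch with random tensors standing in for DCANet's actual
# generator/discriminator outputs, which are not specified here.
if __name__ == "__main__":
    sr = torch.rand(1, 3, 128, 128)    # super-resolved frame
    hr = torch.rand(1, 3, 128, 128)    # ground-truth HR frame
    neg = F.interpolate(               # bicubic down/up-sampled negative
        F.interpolate(hr, scale_factor=0.25, mode="bicubic"),
        size=hr.shape[-2:], mode="bicubic")
    fake_logits = torch.rand(1, 1)     # discriminator logits on the SR frame
    loss = DualContrastLoss()(sr, hr, neg, fake_logits)
    print(loss.item())
```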