The information explosion makes it harder for users to filter the content they are interested in. This study combines big data and video semantic understanding technology to recommend sports culture videos by mining video semantics and exploiting multisource heterogeneous information. The semantic structure of unstructured video data is defined first; on this basis, Convolutional 3D (C3D) combined with Connectionist Temporal Classification (CTC) is employed to extract sub-action semantics and integrate them into behaviour semantic sequences. To address the low accuracy of this model when extracting semantics from unlabeled videos, the study proposes an unsupervised semantic extraction method based on C3D-RAE, which compresses and associates the semantic sequences; the accuracy of both models is verified through experiments. To address the insufficient accuracy of video recommendation algorithms that rely on video semantic similarity alone or on topic similarity alone, the study considers both video semantic similarity and video topic similarity and proposes a multi-modal video recommendation algorithm. The experimental results show that the accuracy of the COMSIM-based algorithm is 7.8% higher than that of Video + CNN + K-Nearest Neighbor (KNN) and 15.9% higher than that of CLIP + CNN + Ncut + LDA.
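As a rough illustration of how two similarity signals might be combined for recommendation, the following minimal sketch scores candidate videos by a weighted sum of semantic similarity and topic similarity. The abstract does not state how the two are combined, so the weighted sum, the cosine measure, the weight `alpha`, the dictionary fields, and the function names are all assumptions made for illustration, not the algorithm proposed in the study.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(sem_a, sem_b, topic_a, topic_b, alpha=0.5):
    """Hypothetical fusion: weighted sum of semantic and topic similarity.

    alpha is an assumed weight in [0, 1]; it is not taken from the study.
    """
    return alpha * cosine(sem_a, sem_b) + (1 - alpha) * cosine(topic_a, topic_b)


def recommend(query, candidates, k=2, alpha=0.5):
    """Return the ids of the k candidates most similar to the query video.

    Each video is a dict with assumed keys "id", "semantic" (a semantic
    feature vector) and "topic" (a topic distribution vector).
    """
    scored = [
        (
            combined_similarity(
                query["semantic"], c["semantic"], query["topic"], c["topic"], alpha
            ),
            c["id"],
        )
        for c in candidates
    ]
    scored.sort(reverse=True)  # highest combined score first
    return [video_id for _, video_id in scored[:k]]
```

Setting `alpha` to 1 or 0 reduces the score to semantic similarity alone or topic similarity alone, which corresponds to the single-signal baselines the study sets out to improve on.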