This article begins with the changes in the environment of human cognition, analyzes virtuality as the main feature of visual perception under digital technology, and explores the shift of human cognitive activity from passive to active. As visual information comes to be understood in more diverse ways, contradictions in human memory have also become prominent. To address the low recognition rate of existing multimodal TV media recognition methods for unknown application layer protocols, an adaptive clustering method for identifying unknown application layer protocols is proposed. The method clusters application layer protocols according to the similarity of the payload characteristics of application layer protocol data in network flows, and it partitions the similarity calculation within the clustering algorithm to improve clustering efficiency. Experimental results show that the proposed method can efficiently and accurately recognize unknown visual communication. The article further argues that, in interactive multimodal visual information transmission, the human visual perception experience has changed, and the diversity with which visual information content is expressed makes the aesthetic subject more personalized and stylized.
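As a rough illustration of the clustering idea summarized above, the following Python sketch groups flow payloads by the similarity of their leading bytes. It is not the article's algorithm. The feature (byte bigrams over the first 32 payload bytes), the Jaccard similarity measure, the fixed threshold of 0.3, and the function names `byte_ngrams`, `jaccard`, and `cluster_payloads` are all assumptions made for this example. Comparing each new flow only against one representative per cluster is one possible reading of partitioning the similarity calculation to reduce cost. A genuinely adaptive method would presumably tune the threshold from the data rather than fix it.

```python
from typing import List, Set


def byte_ngrams(payload: bytes, n: int = 2, prefix_len: int = 32) -> Set[bytes]:
    """Return the set of byte n-grams found in the first prefix_len payload bytes."""
    head = payload[:prefix_len]
    return {head[i:i + n] for i in range(len(head) - n + 1)}


def jaccard(a: Set[bytes], b: Set[bytes]) -> float:
    """Jaccard similarity of two feature sets (two empty sets count as identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_payloads(payloads: List[bytes], threshold: float = 0.3) -> List[List[int]]:
    """Single-pass clustering of payloads by similarity to cluster representatives.

    Each payload is compared only with one representative per cluster (the
    first member), not with every earlier payload. This is an assumed
    simplification, not a step taken from the article.
    """
    clusters: List[List[int]] = []          # payload indices per cluster
    representatives: List[Set[bytes]] = []  # feature set of each cluster's first member

    for idx, payload in enumerate(payloads):
        features = byte_ngrams(payload)

        # Find the most similar existing cluster representative.
        best, best_sim = -1, 0.0
        for c, rep in enumerate(representatives):
            sim = jaccard(features, rep)
            if sim > best_sim:
                best, best_sim = c, sim

        if best >= 0 and best_sim >= threshold:
            clusters[best].append(idx)
        else:
            # No representative is similar enough: start a new cluster.
            clusters.append([idx])
            representatives.append(features)

    return clusters


if __name__ == "__main__":
    flows = [
        b"GET /index.html HTTP/1.1\r\nHost: a",
        b"SSH-2.0-OpenSSH_8.9\r\n",
        b"GET /about.html HTTP/1.1\r\nHost: b",
        b"SSH-2.0-OpenSSH_7.4\r\n",
    ]
    print(cluster_payloads(flows))
```

With the four sample payloads, the two HTTP requests fall into one cluster and the two SSH banners into another, with no protocol labels supplied in advance. That is the sense in which such a method can group flows from protocols it has not seen before.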