Because training samples are scarce, visual relationships in concrete continuous beam bridge construction scenes cannot be identified accurately, which leads to a low detection recall rate. To address this, a new visual relationship detection method for construction scenes is proposed. The method applies feature fusion and optimization to the construction scene and extracts its unique features and spatial distribution features. An improved ant lion algorithm then simulates the random walk of ants searching for food, establishes a matrix, and obtains local optimal solutions, from which the visual relationships of the construction scene are detected. Experimental analysis shows that applying the new method significantly improves the recall rate of visual relationship detection, which exceeds 96%.
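
The ant-walk step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the ant lion algorithm is improved or how the matrix is laid out, so the code below assumes the standard, unimproved ant lion optimizer: each ant performs a random walk built as a cumulative sum of +1/-1 steps, the walk is min-max rescaled into the search bounds, and the walks of all ants are stacked into a matrix. The function names `random_walk`, `normalize_walk`, and `walk_matrix`, and the bounds used, are hypothetical and chosen only for illustration.

```python
import random
from itertools import accumulate


def random_walk(n_steps, rng):
    """Cumulative sum of +1/-1 steps, starting at 0 (length n_steps + 1)."""
    steps = [1 if rng.random() > 0.5 else -1 for _ in range(n_steps)]
    return [0] + list(accumulate(steps))


def normalize_walk(walk, lower, upper):
    """Min-max rescale a walk so that it stays inside [lower, upper]."""
    lo, hi = min(walk), max(walk)
    if hi == lo:  # degenerate walk (only possible when n_steps == 0)
        return [(lower + upper) / 2.0] * len(walk)
    scale = (upper - lower) / (hi - lo)
    return [(x - lo) * scale + lower for x in walk]


def walk_matrix(n_ants, n_steps, lower, upper, seed=0):
    """One normalized random walk per row (assumed layout: ants x steps)."""
    rng = random.Random(seed)
    return [
        normalize_walk(random_walk(n_steps, rng), lower, upper)
        for _ in range(n_ants)
    ]


if __name__ == "__main__":
    matrix = walk_matrix(n_ants=3, n_steps=50, lower=-1.0, upper=1.0)
    for row in matrix:
        print(len(row), min(row), max(row))
```

In the standard algorithm the bounds shrink around a selected ant lion as iterations proceed, which is what drives the walks toward local optima; that part, and whatever improvement the proposed method makes, is left out here because the abstract gives no details.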