Online social network platforms contain massive amounts of information that propagates in cascades. Predicting the future growth of an information cascade, especially in its early stages, is important for controlling information propagation effectively and promptly. Existing studies do not explicitly leverage temporal information and rely only on network structure and sequence features. However, because cascade sequences are short in the early stages of propagation, the network structure of different cascade subgraphs differs very little. Temporal features are therefore more important for the prediction task. Our model, CasTemporalGCN, uses a position encoding function to encode the temporal information. It captures the structural information of the cascade subgraphs with a GCN (Graph Convolutional Network) and feeds it into multi-layer self-attention encoders instead of a classical RNN (Recurrent Neural Network) model. The temporal features are added to the encoders through residual connections. Experimental results show that using temporal information significantly reduces the prediction loss relative to state-of-the-art methods, by 13.8% on a real Weibo dataset and by 24.7% on an ACM dataset. In the ablation study, temporal information reduces the prediction loss by 19.0% for the encoder structure and by 10.1% for the RNN structure.
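As a rough illustration of how a position encoding function can encode temporal information, the sketch below applies a sinusoidal encoding to a scalar time offset and adds it element-wise to a structural embedding. The sinusoidal form, the names `temporal_encoding` and `add_temporal_residual`, and the parameters `d_model` and `base` are assumptions made for this example; the abstract does not specify the exact function used in CasTemporalGCN.

```python
import math


def temporal_encoding(t, d_model=8, base=10000.0):
    """Sinusoidal encoding of a scalar time offset t.

    Assumed form (borrowed from Transformer position encoding);
    not confirmed as the function used by CasTemporalGCN.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    enc = []
    for i in range(d_model // 2):
        # Each sine/cosine pair uses a progressively lower frequency.
        freq = 1.0 / (base ** (2 * i / d_model))
        enc.append(math.sin(t * freq))
        enc.append(math.cos(t * freq))
    return enc


def add_temporal_residual(structure_vec, t):
    """Add the temporal encoding to a structural embedding element-wise.

    structure_vec stands in for a GCN output vector (hypothetical here).
    """
    enc = temporal_encoding(t, d_model=len(structure_vec))
    return [s + e for s, e in zip(structure_vec, enc)]


if __name__ == "__main__":
    # Hypothetical 4-dimensional subgraph embedding observed at time offset 0.
    print(add_temporal_residual([1.0, 1.0, 1.0, 1.0], 0))
```

Under these assumptions, the summed vector would then be passed to the self-attention encoders, so the time of each cascade subgraph is carried in the values themselves rather than only in the order of the sequence.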