In transformers, positional encoding compensates for the attention mechanism's inability to capture the positional relationships between words. Previous research on temporal modeling in transformers has used recursive positional encoding based on Recurrent Neural Networks (RNNs) and relative positional encoding. Recursive positional encoding captures the linear structure of text but cannot be parallelized, which limits speed; relative positional encoding, in contrast, ignores the linear structure of text and therefore performs worse than recursive positional encoding on short text classification. To address these issues, we propose sumformer, a model that differs from other transformers mainly in two components: cumsum calculation and summer initialization. The cumsum calculation simplifies the feature-extraction part of an RNN by substitution, replacing the RNN's dynamic gating functions with static trainable positional parameters while preserving the recursive structure; this allows the model to capture the linear structure of text through a cumulative-sum computation at a much lower time cost than an RNN. The summer initialization method bounds the maximum standard deviation of the positional parameters, so that at initialization the model attends to multiple levels of textual information and has a richer optimization space, thereby improving convergence. Experimental results show that sumformer achieves roughly a 3% improvement in performance and a 58% improvement in speed over existing transformers based on recursive positional encoding: it classifies short text both better and faster, and summer initialization improves performance without increasing training or inference time.
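The cumsum idea above can be illustrated with a minimal sketch (the function and parameter names here are our own assumptions, not the paper's exact formulation): where an RNN computes h_t = f(h_{t-1}, x_t) with a dynamic, input-dependent transition, static per-position weights make each state a weighted prefix sum, which is computable in parallel.

```python
import numpy as np

def cumsum_encoding(x, w):
    """Hypothetical cumsum-style positional encoding.

    x: (seq_len, d) token embeddings.
    w: (seq_len,) static trainable position weights (illustrative stand-in
       for the RNN's dynamic gating functions).

    Returns h with h[t] = sum_{i <= t} w[i] * x[i], i.e. the recursive
    accumulation h[t] = h[t-1] + w[t] * x[t], but evaluated in one
    parallelizable cumulative sum instead of a sequential loop.
    """
    return np.cumsum(w[:, None] * x, axis=0)

# Toy usage: three tokens, embedding dimension 2.
x = np.arange(6, dtype=float).reshape(3, 2)
w = np.array([1.0, 0.5, 0.25])
h = cumsum_encoding(x, w)
```

Because the weights are fixed at each position rather than computed from the hidden state, the whole sequence can be processed with vectorized operations, which is the source of the speed advantage over a true recurrence.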