Image segmentation of Renal Cell Carcinoma is a crucial prerequisite for pathologists to diagnose the disease and plan treatment. In many previous medical image segmentation tasks, convolutional neural networks (CNNs) have been widely used and have become a practical benchmark with notable success. However, the inherently local nature of the convolution operation makes it difficult to balance long-range relations and local dependencies during modeling, leading to redundantly deep networks and the loss of local details. Transformers, designed for sequence-to-sequence prediction, possess an innate global self-attention mechanism that better extracts global information; however, their lack of low-level detail can limit localization ability. In this paper, we propose a parallel architecture that combines Transformers and CNNs, capturing global and local information with a position-encoded Transformer branch and a local convolutional branch, respectively, so that global contextual information is captured efficiently while a strong grasp of low-level spatial detail is maintained. A series of comparative and extension experiments validates the effectiveness and superiority of our method, demonstrating remarkable potential for the image segmentation of Renal Cell Carcinoma.
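The parallel global/local design described above can be caricatured in a few lines of NumPy. This is a minimal sketch, not the paper's actual model: the "global branch" (every position summarized by the image-wide mean) is a hypothetical stand-in for self-attention, and the "local branch" (a 3x3 mean filter) is a stand-in for a convolutional block; the fusion is a simple channel stack.

```python
import numpy as np

def global_branch(x):
    # Toy stand-in for self-attention: every position receives the
    # image-wide mean, i.e. purely long-range context (hypothetical).
    return np.full_like(x, x.mean())

def local_branch(x, k=3):
    # Toy stand-in for a CNN block: a k x k mean filter that only
    # sees a small neighborhood, i.e. purely local detail.
    h, w = x.shape
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = xp[i:i + k, j:j + k].mean()
    return out

def parallel_fuse(x):
    # Run both branches on the same input in parallel and fuse the
    # results by stacking them as two feature channels.
    g = global_branch(x)
    l = local_branch(x)
    return np.stack([g, l], axis=0)  # shape: (2, H, W)

x = np.arange(16, dtype=float).reshape(4, 4)
fused = parallel_fuse(x)
print(fused.shape)  # (2, 4, 4)
```

In the real architecture each branch would be a learned network and the fusion a trainable module, but the sketch shows the key point: both branches see the same input, and the fused feature carries global context and local detail side by side.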