Most existing salient object detection models are based on fully convolutional networks (FCNs), which learn multi-scale and multi-level semantic information through convolutional layers to obtain high-quality predicted saliency maps. However, convolution interacts only locally, so it has difficulty capturing long-range dependencies. Additionally, FCN-based methods suffer from coarse object boundaries. In this paper, to solve these problems, we propose a novel transformer framework for salient object detection (named TF-SOD), which consists of the encoder part of the FCN, the fusion module (FM), the transformer module (TM), and the feature decoder module (FDM). Specifically, the FM is a bridge connecting the encoder and the TM, and provides some foresight for the non-local interaction of the TM. The FDM efficiently decodes the non-local features output by the TM and achieves deep fusion with local features. This architecture enables the network to closely integrate local and non-local interactions, so that the two kinds of information complement each other and the associations between features are deeply mined. Furthermore, we also propose a novel edge reinforcement learning strategy, which can effectively suppress edge