Segmentation of nasopharyngeal carcinoma (NPC) from magnetic resonance images (MRI) is a crucial prerequisite for NPC radiotherapy. However, manual segmentation of NPC is time-consuming and labor-intensive, and single-modality MRI generally cannot provide enough information for accurate delineation. Therefore, a multi-modality MRI fusion network (MMFNet), a novel framework for fusing information from multi-modality medical images, is proposed to exploit T1-weighted, T2-weighted, and contrast-enhanced T1-weighted MRI for accurate segmentation of NPC. The backbone of MMFNet is a multi-encoder network, consisting of several encoders that capture modality-specific features and one decoder that obtains fused features for NPC segmentation. A fusion block is presented to effectively fuse the multi-source features. It contains a 3D Convolutional Block Attention Module (3D-CBAM), which recalibrates the low-level features captured by the modality-specific encoders to highlight both informative features and regions of interest (ROIs), and a residual fusion block (RFBlock), which fuses the re-weighted features to keep a balance between the fused features and the high-level features from the decoder. Moreover, to fully mine the individual information in multi-modality MRI, a training strategy named self-transfer is proposed, which uses pre-trained modality-specific encoders to initialize the multi-encoder network. Extensive experiments validate that the proposed method can effectively segment NPC from multi-modality MRI.
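The channel-then-spatial recalibration that the 3D-CBAM performs can be sketched as follows. This is a minimal NumPy illustration of the generic CBAM attention pattern applied to a 3D feature map, not the paper's implementation: the MLP weights `w1`/`w2` are random stand-ins for learned parameters, and a fixed average replaces the learned 3D convolution that would normally mix the spatial attention maps.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    # feat: (C, D, H, W); global average- and max-pooling over spatial axes
    avg = feat.mean(axis=(1, 2, 3))                # (C,)
    mx = feat.max(axis=(1, 2, 3))                  # (C,)
    # shared two-layer MLP; w1: (C, C//r), w2: (C//r, C), r = reduction ratio
    mlp = lambda v: np.maximum(v @ w1, 0.0) @ w2
    return sigmoid(mlp(avg) + mlp(mx))             # per-channel weights in (0, 1)

def spatial_attention(feat):
    # channel-wise average and max maps; a learned 3D conv would normally
    # combine them -- a fixed average stands in for that conv here
    avg = feat.mean(axis=0)                        # (D, H, W)
    mx = feat.max(axis=0)                          # (D, H, W)
    return sigmoid(0.5 * (avg + mx))               # per-voxel weights in (0, 1)

def cbam3d(feat, w1, w2):
    ca = channel_attention(feat, w1, w2)
    feat = feat * ca[:, None, None, None]          # recalibrate channels
    sa = spatial_attention(feat)
    return feat * sa[None, ...]                    # highlight spatial ROIs

rng = np.random.default_rng(0)
C, r = 8, 2
feat = rng.standard_normal((C, 4, 4, 4))           # toy low-level feature map
w1 = rng.standard_normal((C, C // r)) * 0.1
w2 = rng.standard_normal((C // r, C)) * 0.1
out = cbam3d(feat, w1, w2)
print(out.shape)  # (8, 4, 4, 4) -- same shape as the input, re-weighted
```

The recalibrated map keeps the input's shape, so it can be concatenated or added to decoder features in a fusion block without any resizing.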