Positron Emission Tomography and Computed Tomography (PET/CT) imaging provides both functional metabolic information and anatomical localization information of the patient's body. However, tumor segmentation in PET/CT images remains challenging because it requires fusing the characteristic information of the two modalities. In this work, we propose a novel deep-learning-based graph model that automatically fuses dual-modality information for tumor region segmentation. Our method exploits the advantage of each imaging modality (PET: superior contrast; CT: superior spatial resolution). We formulate the task as a Conditional Random Field (CRF) built on multi-scale fusion and dual-modality co-segmentation of the target image, with a regularization term that balances the segmentation divergence between PET and CT. This mechanism accounts for spatially varying characteristics at different scales, which encode different feature information across the two modalities. We evaluated the ability of our method to detect and segment tumor regions under different fusion approaches using a dataset of clinical PET/CT tumor images. The results show that our method effectively integrates information from both PET and CT, achieving a Dice similarity coefficient (DSC) of 0.86 and a sensitivity of 0.83, a 3.61\% improvement over W-Net.
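As a minimal sketch of the kind of co-segmentation CRF energy described above (the symbols $x^{P}$, $x^{C}$, $\psi$, $\phi$, and $\lambda$ are illustrative assumptions, not notation taken from the paper), one may write
\[
E(x^{P}, x^{C}) \;=\; \sum_{i} \Big( \psi^{P}_{i}(x^{P}_{i}) + \psi^{C}_{i}(x^{C}_{i}) \Big)
\;+\; \sum_{(i,j)} \Big( \phi^{P}_{ij}(x^{P}_{i}, x^{P}_{j}) + \phi^{C}_{ij}(x^{C}_{i}, x^{C}_{j}) \Big)
\;+\; \lambda \sum_{i} \mathbf{1}\!\left[ x^{P}_{i} \neq x^{C}_{i} \right],
\]
where $x^{P}$ and $x^{C}$ denote candidate labelings on the PET and CT images, $\psi$ and $\phi$ are per-modality unary and pairwise potentials, and the last term penalizes disagreement between the two segmentations, with $\lambda$ controlling how strongly the PET and CT results are forced to agree.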