Portraying urban functional zones provides useful insights for understanding complex urban systems and establishing rational urban planning. Although several studies have confirmed the efficacy of remote sensing imagery in urban studies, coupling remote sensing with new human sensing data, such as mobile phone positioning data, to identify urban functional zones has not yet been investigated. In this study, a new framework integrating remote sensing imagery and mobile phone positioning data was developed to analyze urban functional zones using landscape and human activity metrics. Landscape metrics were calculated from land cover derived from remote sensing images. Human activity metrics were extracted from massive mobile phone positioning data. By integrating the two sets of metrics, urban functional zones (urban center, sub-center, suburbs, urban buffer, transit region, and ecological area) were identified by hierarchical clustering. Finally, gradient analysis along three typical transects was conducted to investigate the patterns of landscapes and human activities. Taking Shenzhen, China, as an example, the experiment shows that the patterns of landscapes and human activities in the city's urban functional zones do not fully conform to classical urban theories. The results demonstrate that the fusion of remote sensing imagery and human sensing data can characterize the complex urban spatial structure of Shenzhen well. Urban functional zones have the potential to act as bridges between urban structure, human activity, and urban planning policy, providing scientific support for rational urban planning and sustainable urban development policymaking.