Guaranteeing the quality of the face object in a video call is a key task for video coding systems under constrained network bandwidth. Conventionally, the face object is divided into dispersed blocks under the hybrid coding framework, so the characteristics of the complete face object are not fully exploited. Meanwhile, it is difficult to predict complex affine transformations of the face object between neighboring frames, such as rotation and scaling, with the current translational motion model. In this paper, we propose an improved video coding scheme for video calls that uses the complex transformation of the complete face object to improve compression efficiency. Experimental results show that the proposed method outperforms HM12.0: the bit-rate saving in the face region is up to 19.59% at similar visual quality.
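
To illustrate why a translational model struggles with rotation and scaling, the following minimal sketch compares a single motion vector against a six-parameter affine model on the corners of a face bounding box. It is not the scheme proposed in this paper: the function names, the box coordinates, the 10-degree rotation, and the 1.1 scale factor are all assumptions chosen for illustration only.

```python
import math

def translational_predict(points, mv):
    """Shift every point by one motion vector (dx, dy)."""
    dx, dy = mv
    return [(x + dx, y + dy) for x, y in points]

def affine_predict(points, a, b, c, d, e, f):
    """Six-parameter affine model: x' = a*x + b*y + c, y' = d*x + e*y + f."""
    return [(a * x + b * y + c, d * x + e * y + f) for x, y in points]

def rotation_scale_params(angle_deg, scale, center):
    """Affine parameters for rotation plus uniform scaling about `center`."""
    t = math.radians(angle_deg)
    a, b = scale * math.cos(t), -scale * math.sin(t)
    d, e = scale * math.sin(t), scale * math.cos(t)
    cx, cy = center
    # Offsets chosen so that `center` maps to itself.
    return a, b, cx - a * cx - b * cy, d, e, cy - d * cx - e * cy

def max_error(pred, target):
    """Largest Euclidean distance between predicted and true positions."""
    return max(math.hypot(px - tx, py - ty)
               for (px, py), (tx, ty) in zip(pred, target))

# Hypothetical face bounding box in the reference frame (pixel coordinates).
face = [(40, 40), (80, 40), (80, 90), (40, 90)]

# Assumed motion between frames: 10 degree rotation and 1.1x scaling.
params = rotation_scale_params(10, 1.1, (60, 65))
moved = affine_predict(face, *params)

# Least-squares best single translation is the mean displacement.
n = len(face)
mv = (sum(m[0] - p[0] for p, m in zip(face, moved)) / n,
      sum(m[1] - p[1] for p, m in zip(face, moved)) / n)

trans_err = max_error(translational_predict(face, mv), moved)
affine_err = max_error(affine_predict(face, *params), moved)

print(f"translational max error: {trans_err:.2f} px")
print(f"affine max error:        {affine_err:.2f} px")
```

Under these assumed values, the translational prediction leaves a residual of several pixels at the box corners, while the affine model reproduces the motion exactly. A block-based encoder has to spend bits on that kind of residual, which is the intuition behind modeling the complete face object with a richer transformation.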