In this paper, we focus on the problem of novel view synthesis for vehicles. Some previous works solve the problem of novel view synthesis in a controlled 3D environment by exploiting additional 3D details (i.e., camera viewpoints and underlying 3D models). However, in real scenarios, the 3D details are difficult to obtain. In this case, we find that introducing vehicle pose to represent the views of vehicles is an alternative paradigm to solve the lack of 3D details. In novel view synthesis, preserving local details is one of the most challenging problems. To address this problem, we propose a perspective-aware generative model (PAGM). We are motivated by the prior that vehicles are made of quadrilateral planes. Preserving these rigid planes during image generation ensures that image details are kept. To this end, a classic image transformation method is leveraged, i.e., perspective transformation. In our GAN-based system, the perspective transformation is applied to the encoder feature maps, and the resulting maps are regarded as new conditions for the decoder. This strategy preserves the quadrilateral planes all the way through the network, thus shuttling the texture details from the input image to the generated image. In the experiments, we show that PAGM can generate high-quality vehicle images with fine details. Quantitatively, our method is superior to several competing approaches employing either GAN or the perspective transformation.