In the context of registration between videos and geographic information system (GIS)-based 3D building models, for instance in augmented reality applications, we propose a solution to one of the most critical problems, namely registration initialization. Successful automatic 2D/3D matching is achieved by combining two context-dependent improvements. On the one hand, we associate semantic information with the low-level primitives we use, which reduces the problem complexity. On the other hand, we avoid false initial registration solutions by analyzing the convergence of the iterative pose computation in a RANSAC framework. We require that videos be acquired together with Global Positioning System (GPS) measurements. We also present how such a registration can be exploited once it has been performed for the whole video. Textures of visible buildings are extracted from the images, and a new algorithm for façade texture fusion, based on statistical analysis of texel colors, is presented. It allows us to remove from the final textures all occluding objects in front of the viewed building façades.

G. Sourimant (B) · T. Colleu · V. Jantet
INRIA Rennes Bretagne Atlantique,