We address shape grammar parsing for facade segmentation using Reinforcement Learning (RL). Shape parsing entails simultaneously optimizing the geometry and the topology (e.g., the number of floors) of the facade, so as to optimize the fit of the predicted shape with the responses of pixel-level 'terminal detectors'. We formulate this problem in terms of a Hierarchical Markov Decision Process by employing a recursive binary split grammar. This allows us to use RL to efficiently find the optimal parse of a given facade in terms of our shape grammar. Building on the RL paradigm, we exploit state aggregation to speed up computation, and introduce image-driven exploration in RL to accelerate convergence. We achieve state-of-the-art results on facade parsing, with a significant speed-up compared to existing methods and substantial robustness to initial conditions. We demonstrate that the method can also be applied to interactive segmentation and to a broad variety of architectural styles.
In this paper we propose a novel approach to the perceptual interpretation of building facades that combines shape grammars, supervised classification and random walks. Procedural modeling is used to model the geometric and photometric variation of buildings. This is fused with visual classification techniques (randomized forests) that provide a crude probabilistic interpretation of the observation space, in order to measure the appropriateness of a procedural generation with respect to the image. A random exploration of the grammar space is used to optimize the sequence of derivation rules towards a semantico-geometric interpretation of the observations. Experiments conducted on architecturally complex facades with ground truth validate the approach.
State-of-the-art deep generative networks are capable of producing images with such realism that they can be suspected of memorizing training images. This is why it is not uncommon to include visualizations of training-set nearest neighbors, to suggest that generated images are not simply memorized. We demonstrate that this is not sufficient, which motivates the need to study memorization and overfitting of deep generators with more scrutiny. This paper addresses this question by i) showing how simple losses are highly effective at reconstructing images for deep generators, and ii) analyzing the statistics of reconstruction errors when reconstructing training and validation images, which is the standard way to analyze overfitting in machine learning. Using this methodology, this paper shows that overfitting is not detectable in the pure GAN models proposed in the literature, in contrast with those using hybrid adversarial losses, which are amongst the most widely applied generative methods. The paper also shows that standard GAN evaluation metrics fail to capture memorization for some deep generators. Finally, the paper shows how off-the-shelf GAN generators can be successfully applied to face inpainting and face super-resolution using the proposed reconstruction method, without hybrid adversarial losses.
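The train-versus-validation comparison of reconstruction errors described in this abstract can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the `generator` is a hypothetical one-dimensional-latent map onto the unit circle rather than a GAN, the latent code is found by a plain grid search rather than by the losses the paper studies, and the median gap is just one possible summary statistic.

```python
import math
import statistics

def generator(z):
    """Hypothetical toy generator: maps a scalar latent z to a 2-D point on the unit circle."""
    return (math.cos(z), math.sin(z))

def reconstruction_error(x, steps=3600):
    """Smallest squared L2 distance between x and generator(z) over a grid of latent codes.

    Grid search stands in for the latent optimization used with real generators.
    """
    best = float("inf")
    for k in range(steps):
        gx, gy = generator(2 * math.pi * k / steps)
        best = min(best, (gx - x[0]) ** 2 + (gy - x[1]) ** 2)
    return best

def overfitting_gap(train, val):
    """Median validation error minus median training error (larger suggests memorization)."""
    train_err = [reconstruction_error(x) for x in train]
    val_err = [reconstruction_error(x) for x in val]
    return statistics.median(val_err) - statistics.median(train_err)

# Toy data: "training" points lie on the generator's output manifold,
# "validation" points lie slightly off it.
train_points = [generator(0.3 * i) for i in range(10)]
val_points = [(1.2 * math.cos(0.3 * i), 1.2 * math.sin(0.3 * i)) for i in range(10)]
gap = overfitting_gap(train_points, val_points)
```

In this toy setting a clearly positive `gap` means the generator reconstructs training samples much better than held-out ones; a gap near zero would be consistent with no detectable overfitting.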
In this paper, we use shape grammars (SGs) for facade parsing, which amounts to segmenting 2D building facades into balconies, walls, windows, and doors in an architecturally meaningful manner. The main thrust of our work is the introduction of reinforcement learning (RL) techniques to deal with the computational complexity of the problem. RL provides us with techniques such as Q-learning and state aggregation which we exploit to efficiently solve facade parsing. We initially phrase the 1D parsing problem in terms of a Markov Decision Process, paving the way for the application of RL-based tools. We then develop novel techniques for the 2D shape parsing problem that take into account the specificities of the facade parsing problem. Specifically, we use state aggregation to enforce the symmetry of facade floors and demonstrate how to use RL to exploit bottom-up, image-based guidance during optimization. We provide systematic results on the Paris building dataset and obtain state-of-the-art results in a fraction of the time required by previous methods. We validate our method under diverse imaging conditions and make our software and results available online.
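As a rough illustration of phrasing 1D parsing as a Markov Decision Process, the sketch below runs tabular Q-learning on a toy column of pixels. It is not the authors' implementation, and all specifics are assumptions: the grammar simply alternates wall and window segments, the state is `(position, next label)`, an action is a segment length, the reward counts agreement with a hand-written `TRUTH` array standing in for terminal-detector responses, and exploration is uniformly random rather than image-driven.

```python
import random

# Toy stand-in for per-pixel detector output along one column: 0 = wall, 1 = window.
TRUTH = [0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0]
N = len(TRUTH)
MAX_LEN = 4  # longest segment a single action may emit (assumed)

def merit(pos, length, label):
    """Reward: +1 for each pixel that agrees with the detector, -1 otherwise."""
    return sum(1 if TRUTH[i] == label else -1 for i in range(pos, pos + length))

def actions(pos):
    """Segment lengths allowed at this position (never past the end of the column)."""
    return range(1, min(MAX_LEN, N - pos) + 1)

def q_learning(episodes=20000, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy.

    The environment is deterministic, so a learning rate of 1 and no discount are used.
    """
    rng = random.Random(seed)
    q = {}  # (pos, label, length) -> estimated return
    for _ in range(episodes):
        pos, label = 0, 0  # every parse starts with a wall segment
        while pos < N:
            length = rng.choice(list(actions(pos)))
            nxt = pos + length
            future = max(
                (q.get((nxt, 1 - label, a), 0.0) for a in actions(nxt)),
                default=0.0,  # terminal state: nothing left to parse
            )
            q[(pos, label, length)] = merit(pos, length, label) + future
            pos, label = nxt, 1 - label
    return q

def greedy_parse(q):
    """Follow the highest-valued action from the start to read off (label, length) segments."""
    pos, label, segments = 0, 0, []
    while pos < N:
        length = max(actions(pos), key=lambda a: q.get((pos, label, a), 0.0))
        segments.append((label, length))
        pos, label = pos + length, 1 - label
    return segments

parse = greedy_parse(q_learning())
```

The greedy parse is a list of `(label, length)` pairs covering the column. Replacing the random behaviour policy with one biased toward image edges would be one way to mimic the bottom-up guidance the abstract mentions, though that is not shown here.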