2018
DOI: 10.1145/3272127.3275035
Language-driven synthesis of 3D scenes from scene databases

Abstract: We introduce a novel framework for using natural language to generate and edit 3D indoor scenes, harnessing scene semantics and text-scene grounding knowledge learned from large annotated 3D scene databases. The advantage of natural language editing interfaces is strongest when performing semantic operations at the sub-scene level, acting on groups of objects. We learn how to manipulate these sub-scenes by analyzing existing 3D scenes. We perform edits by first p…



Cited by 68 publications (60 citation statements)
References 35 publications
“…What graph representation maps most naturally to this language? There exists prior work in language-based scene creation [Chang et al. 2015, 2014], including recent work that uses a graph-based intermediate representation [Ma et al. 2018a]. However, it constructs scenes by retrieving parts of scenes from a database; new possibilities are opened up by a system that can synthesize truly new scenes from a partial graph.…”
Section: Results
confidence: 99%
See 1 more Smart Citation
“…What graph representation maps most naturally to this language? There exists prior work in language-based scene creation [Chang et al 2015[Chang et al , 2014, including recent work that uses a graph-based intermediate representation [Ma et al 2018a]. However, it constructs scenes by retrieving parts of scenes from a database; new possibilities are opened up by a system that can synthesize truly new scenes from a partial graph.…”
Section: Resultsmentioning
confidence: 99%
“…Follow-up work has used undirected factor graphs learned from annotated RGB-D images [Kermani et al. 2016], relation graphs between objects learned from human activity annotations [Fu et al. 2017], and directed graphical models with Gaussian mixtures for modeling arrangement patterns [Paul Henderson 2018]. Other work has focused on conditioning the scene generation on input from RGB-D frames [Chen et al. 2014], 2D sketches of the scene [Xu et al. 2013], natural language text [Chang et al. 2015; Ma et al. 2018b], or activity predictions on RGB-D reconstructions [Fisher et al. 2015].…”
Section: Background and Related Work
confidence: 99%
“…Users can define their preference for small object arrangement interactively with our framework by manipulating the spatial relations between any two small objects. However, at the current stage, our framework cannot support advanced spatial relations such as "surrounded by" [37], which cannot be simply interpreted as multiple pairwise relations. It would be interesting to include these relations by proposing another effective data form, e.g.…”
Section: Discussion and Future Work
confidence: 96%
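The limitation quoted above — that a relation like "surrounded by" cannot be reduced to multiple pairwise relations — can be illustrated with a minimal sketch. All names and data below are hypothetical illustrations, not the cited framework's actual representation:

```python
# Hypothetical sketch: pairwise spatial relations as (subject, relation, object)
# triples, versus an n-ary relation that binds one object to a whole group.

pairwise = [
    ("cup", "left_of", "plate"),
    ("cup", "near", "fork"),
]

# "surrounded_by" relates one subject to a SET of objects jointly.
nary = [("table", "surrounded_by", ("chair_1", "chair_2", "chair_3", "chair_4"))]

def as_pairwise(nary_relations):
    """Naive decomposition into pairwise edges. Group membership survives,
    but the joint 'encircling' constraint among the chairs is lost: four
    independent chair-near-table edges need not place the chairs around
    the table."""
    edges = []
    for subj, rel, group in nary_relations:
        for obj in group:
            edges.append((subj, rel, obj))
    return edges

print(as_pairwise(nary))
```

This is why supporting such relations likely requires a different data form (e.g., hyperedges or group nodes) rather than more pairwise edges.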
“…In terms of input, earlier works on probabilistic models, e.g., [FRS∗12], generate a new scene by taking a random sample from a learned distribution, while recent works on deep generative neural networks, e.g., [LPX∗19], can produce a novel scene from a random noise vector. The input can also be a hand sketch [XCF∗13], a photograph [ISS17, LZW∗15], natural language commands [MGPF∗18], or human actions/activities [FLS∗15, MLZ∗16]. In terms of output, while most methods have been designed to generate room layouts with 3D furniture objects, some methods learn to produce floor or building plans [MSK10, WFT∗19].…”
Section: Application: Indoor Scene Synthesis
confidence: 99%