2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023
DOI: 10.1109/cvpr52729.2023.00677

PLA: Language-Driven Open-Vocabulary 3D Scene Understanding

Runyu Ding,
Jihan Yang,
Chuhui Xue
et al.

citations
Cited by 13 publications
(10 citation statements)
references
References 27 publications
“…ViLD [12] BARON [113] LSeg [14] ZegFormer [114] MaskCLIP [48] OPSNet [115] OpenScene [116] ULIP [117] OVR-CNN [11] OpenSeg [46] CGG [47] DST-Det [118] PB-OVD [119] MaskCLIP+ [120] OVSeg [121] XPM [122] MAXI [123] PLA [124] GroupViT [125] SegCLIP [126] Mask-free OVIS [127] OpenSeeD [128] OpenSD [129] X-Decoder [130] FreeSeg [131] OVDiff [132] ODISE [133] Detic [134] OWLv2 [135] Fig. 4: Open vocabulary learning methods, organized by their tasks and approach types.…”
Section: Preliminary
Citation type: mentioning
confidence: 99%
“…OV-3DETIC [277] OV-3D-OD Pseudo Labels from 2D Detector CLIP-text 3DDETR [278] OV-3DETIC exploits information from two modalities to achieve 3D open vocabulary object detection. PLA [124] OV-3D-SS/IS 3D segmentation masks (base) CLIP-text sparse-conv UNet [279] PLA first tackles the 3D open vocabulary scene understanding problem. OpenScene [116] OV-3D-SS None CLIP-text 3D Encoder + LSeg [14] OpenScene trains a 3D Encoder yielding dense features co-embedded with text and image pixels for open vocabulary semantic segmentation.…”
Section: Open Vocabulary Video Understanding
Citation type: mentioning
confidence: 99%