2022
DOI: 10.48550/arxiv.2210.06366
Preprint

A Generalist Framework for Panoptic Segmentation of Images and Videos

Abstract: Panoptic segmentation assigns semantic and instance ID labels to every pixel of an image. As permutations of instance IDs are also valid solutions, the task requires learning a high-dimensional one-to-many mapping. As a result, state-of-the-art approaches use customized architectures and task-specific loss functions. We formulate panoptic segmentation as a discrete data generation problem, without relying on the inductive bias of the task. A diffusion model based on analog bits [12] is used to model panoptic mask…

Cited by 10 publications (19 citation statements)
References 46 publications
“…In this work, we empirically study noise scheduling strategies for diffusion models and show their importance. The noise scheduling not only plays an important role in image generation but also for other tasks such as panoptic segmentation [1]. A simple strategy of adjusting input scaling factor [1] works well across different image resolutions.…”
Section: Discussion (mentioning)
confidence: 99%
“…The noise scheduling not only plays an important role in image generation but also for other tasks such as panoptic segmentation [1]. A simple strategy of adjusting input scaling factor [1] works well across different image resolutions. When combined with recently proposed RIN architecture [9], our noise scheduling strategy enables single-stage generation of high resolution images.…”
Section: Discussion (mentioning)
confidence: 99%
“…We advocate that using DPMs for the graph-structured prediction can exploit their extraordinary capability in learning highly structured data (Li et al., 2022; Chen et al., 2022b; Hoogeboom et al., 2022b). In particular, our DPM makes a prediction using a GNN-based reverse diffusion process.…”
Section: Introduction (mentioning)
confidence: 99%