Alexander Kirillov scite author profile

We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor generation that explicitly encode our prior knowledge about the task. The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture. Given a fixed small set of learned object queries, DETR reasons about the relations of the objects and the global image context to directly output the final set of predictions in parallel. The new model is conceptually simple and does not require a specialized library, unlike many other modern detectors. DETR demonstrates accuracy and run-time performance on par with the well-established and highly-optimized Faster R-CNN baseline on the challenging COCO object detection dataset. Moreover, DETR can be easily generalized to produce panoptic segmentation in a unified manner. We show that it significantly outperforms competitive baselines. Training code and pretrained models are available at https://github.com/facebookresearch/detr.
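As a rough illustration of the bipartite matching step mentioned in the abstract, the sketch below finds a minimum-cost one-to-one assignment between predictions and ground-truth objects. It is not DETR's implementation: the cost matrix values are made up, the function name `match_predictions` is hypothetical, and an exhaustive search stands in for the Hungarian algorithm, so it only suits very small sets.

```python
from itertools import permutations


def match_predictions(cost):
    """Return the minimum-cost one-to-one assignment and its total cost.

    cost[i][j] is an assumed cost of assigning prediction i to target j.
    Requires at least as many predictions as targets. Exhaustive search,
    used here only for illustration in place of the Hungarian algorithm.
    """
    n_pred = len(cost)
    n_tgt = len(cost[0])
    best_total, best_assign = float("inf"), None
    # Try every way of choosing a distinct prediction for each target.
    for pred_ids in permutations(range(n_pred), n_tgt):
        total = sum(cost[p][t] for t, p in enumerate(pred_ids))
        if total < best_total:
            best_total, best_assign = total, pred_ids
    # Pairs of (prediction index, target index). Predictions left out of
    # the pairs would be treated as "no object".
    pairs = sorted((p, t) for t, p in enumerate(best_assign))
    return pairs, best_total


# Three predictions, two ground-truth objects (illustrative costs).
cost = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.6],
]
pairs, total = match_predictions(cost)
print(pairs, total)
```

Because each target receives exactly one prediction, duplicate detections of the same object cannot both be matched, which is the property the abstract refers to as forcing unique predictions.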


Lectures on Tensor Categories and Modular Functors

Bakalov

Kirillov

2000

598

1,282


Panoptic Segmentation

et al. 2019


We propose and study a task we name panoptic segmentation (PS). Panoptic segmentation unifies the typically distinct tasks of semantic segmentation (assign a class label to each pixel) and instance segmentation (detect and segment each object instance). The proposed task requires generating a coherent scene segmentation that is rich and complete, an important step toward real-world vision systems. While early work in computer vision addressed related image/scene parsing tasks, these are not currently popular, possibly due to lack of appropriate metrics or associated recognition challenges. To address this, we propose a novel panoptic quality (PQ) metric that captures performance for all classes (stuff and things) in an interpretable and unified manner. Using the proposed metric, we perform a rigorous study of both human and machine performance for PS on three existing datasets, revealing interesting insights about the task. The aim of our work is to revive the interest of the community in a more unified view of image segmentation.
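The abstract does not spell out the PQ formula. The sketch below assumes the commonly cited per-class definition: predicted and ground-truth segments are matched when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by |TP| + ½|FP| + ½|FN|. The function name and the example numbers are hypothetical.

```python
def panoptic_quality(matched_ious, num_pred, num_gt):
    """Per-class panoptic quality under the assumed definition above.

    matched_ious: IoU of each predicted/ground-truth segment pair that
                  was matched (each value is expected to exceed 0.5).
    num_pred:     total number of predicted segments for the class.
    num_gt:       total number of ground-truth segments for the class.
    """
    tp = len(matched_ious)
    fp = num_pred - tp  # predicted segments with no match
    fn = num_gt - tp    # ground-truth segments with no match
    denom = tp + 0.5 * fp + 0.5 * fn
    if denom == 0:
        return 0.0  # class absent from both prediction and ground truth
    return sum(matched_ious) / denom


# Two matched segments, one unmatched prediction, no missed ground truth.
pq = panoptic_quality([0.8, 0.6], num_pred=3, num_gt=2)
print(pq)
```

Averaging this per-class value over all stuff and thing classes would give a single score for an image set, which is how the metric treats both kinds of class uniformly.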


Panoptic Feature Pyramid Networks

et al. 2019


The recently introduced panoptic segmentation task has renewed our community's interest in unifying the tasks of instance segmentation (for thing classes) and semantic segmentation (for stuff classes). However, current state-of-the-art methods for this joint task use separate and dissimilar networks for instance and semantic segmentation, without performing any shared computation. In this work, we aim to unify these methods at the architectural level, designing a single network for both tasks. Our approach is to endow Mask R-CNN, a popular instance segmentation method, with a semantic segmentation branch using a shared Feature Pyramid Network (FPN) backbone. Surprisingly, this simple baseline not only remains effective for instance segmentation, but also yields a lightweight, top-performing method for semantic segmentation. In this work, we perform a detailed study of this minimally extended version of Mask R-CNN with FPN, which we refer to as Panoptic FPN, and show it is a robust and accurate baseline for both tasks. Given its effectiveness and conceptual simplicity, we hope our method can serve as a strong baseline and aid future research in panoptic segmentation.


On a q-Analogue of the McKay Correspondence and the ADE Classification of sl̂2 Conformal Field Theories

Kirillov

Ostrik

2002

Advances in Mathematics

261

454


The goal of this paper is to give a category theory based definition and classification of "finite subgroups in $U_q(\mathfrak{sl}_2)$", where $q = e^{\pi i/l}$ is a root of unity. We propose a definition of such a subgroup in terms of the category of representations of $U_q(\mathfrak{sl}_2)$; we show that this definition is a natural generalization of the notion of a subgroup in a reductive group, and that it is also related with extensions of the chiral (vertex operator) algebra corresponding to $\widehat{\mathfrak{sl}}_2$ at level $k = l - 2$. We show that "finite subgroups in $U_q(\mathfrak{sl}_2)$" are classified by Dynkin diagrams of types $A_n$, $D_{2n}$, $E_6$, $E_8$ with Coxeter number equal to $l$; give a description of this correspondence similar to the classical McKay correspondence, and discuss relation with modular invariants in $(\widehat{\mathfrak{sl}}_2)_k$ conformal field theory. The results we get are parallel to those known in the theory of von Neumann subfactors, but our proofs are independent of this theory. © 2002 Elsevier Science (USA)


REPRESENTATIONS OF THE ALGEBRA U_q(sl(2)), q-ORTHOGONAL POLYNOMIALS AND INVARIANTS OF LINKS

Kirillov

Reshetikhin

1990

262

423


PointRend: Image Segmentation As Rendering

Kirillov

et al. 2020

668

430


Masked-attention Mask Transformer for Universal Image Segmentation

et al. 2022


