2022
DOI: 10.1609/aaai.v36i1.19961
Laneformer: Object-Aware Row-Column Transformers for Lane Detection

Abstract: We present Laneformer, a conceptually simple yet powerful transformer-based architecture tailored for lane detection, a long-standing research topic in visual perception for autonomous driving. The dominant paradigms rely on purely CNN-based architectures, which often fail to incorporate relations among long-range lane points and the global context induced by surrounding objects (e.g., pedestrians, vehicles). Inspired by recent advances of the transformer encoder-decoder architecture in various vision tasks,…
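The row-column attention the abstract refers to can be sketched as follows. This is a simplified illustration only (single head, queries/keys/values taken as the features themselves, no learned projections, NumPy instead of a deep-learning framework), not the paper's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def row_column_attention(feat):
    """Self-attention along rows, then along columns, of an (H, W, C) map.

    Simplified sketch: Q = K = V = feat (no learned projections, one head).
    """
    H, W, C = feat.shape
    scale = 1.0 / np.sqrt(C)
    # Row attention: each of the H rows is treated as a sequence of W tokens.
    row_out = np.empty_like(feat)
    for i in range(H):
        x = feat[i]                          # (W, C)
        attn = softmax(x @ x.T * scale)      # (W, W) attention weights
        row_out[i] = attn @ x
    # Column attention: each of the W columns is a sequence of H tokens.
    out = np.empty_like(row_out)
    for j in range(W):
        x = row_out[:, j]                    # (H, C)
        attn = softmax(x @ x.T * scale)      # (H, H) attention weights
        out[:, j] = attn @ x
    return out
```

Factorizing attention this way keeps the cost at O(H·W²) + O(W·H²) per map rather than O((H·W)²) for full 2D self-attention, which is why it suits the long, thin structure of lanes.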

Cited by 23 publications (3 citation statements). References 24 publications (47 reference statements).
“…Such as, CLRNet [31] employs learnable anchor parameters, started from x_0, y_0, θ_0, and runway length l_0. LaneFormer [32] introduces a novel Transformer with x and y axis attention mechanism to detect runway instances. UFLD [5] leverages the flattening operation to convert the 2D feature map into a 1D vector, and then the row-wise runway point positions are detected by the 1D feature map.…”
Section: Runway Detection (confidence: 99%)
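The UFLD-style row-wise formulation mentioned in this statement can be sketched as below. All names here (`NUM_ROWS`, `NUM_CELLS`, `w_cls`) are hypothetical illustration parameters, not UFLD's real ones; the flatten-then-classify idea is what the sketch demonstrates:

```python
import numpy as np

# Hypothetical sizes: 18 row anchors, 50 horizontal grid cells per row.
NUM_ROWS, NUM_CELLS = 18, 50

def row_wise_lane_positions(feature_map, w_cls):
    """Flatten a 2D feature map to a 1D vector, then predict, for each
    row anchor, which horizontal grid cell contains the lane point.

    w_cls is a hypothetical linear head of shape
    (NUM_ROWS * NUM_CELLS, H * W * C).
    """
    flat = feature_map.reshape(-1)                        # (H*W*C,) 1D vector
    logits = (w_cls @ flat).reshape(NUM_ROWS, NUM_CELLS)  # per-row scores
    return logits.argmax(axis=1)                          # grid cell per row
```

The real UFLD additionally reserves a background class per row so a lane can be absent; that is omitted here for brevity.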
“…A lane detection method with a row-column self-attention module using a transformer structure is proposed in [7]. The input image is first passed through a ResNet backbone to obtain row and column features.…”
Section: Deep Learning Models (confidence: 99%)
“…Among these methods, the most widely used are end-to-end networks with convolutional encoders and decoders [21,32,33]. Some studies improve the structure of convolution to obtain better results [6,25,32], while different attention mechanisms are introduced in [4,7,29,33] to achieve certain progress. Emerging graph neural networks have also been applied [10], and the processing of continuous images is also under study [4].…”
Section: Introduction (confidence: 99%)