2022
DOI: 10.1016/j.eswa.2022.117275
A multi-head attention-based transformer model for traffic flow forecasting with a comparative analysis to recurrent neural networks

Cited by 100 publications (33 citation statements) | References 29 publications
“…However, the outcomes of the transformer in our experimental case are not indisputably better than those of the LSTM and GRU models. In Selim's research [24] on traffic flow forecasting, the LSTM model achieved a MAPE of 12.37%, the GRU model reached 12.66%, and the best forecast in this study had a MAPE of 21.33%. One weakness of the study is the scarcity of historical passenger flow data, which explains why the MAPE results in the tests were not particularly noteworthy.…”
Section: Discussion
confidence: 50%
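The statement compares models by MAPE (mean absolute percentage error), where a value of 12.37% means the predictions deviate from the observed flow by roughly 12% on average. A minimal sketch of how MAPE is typically computed (the function name and toy data below are illustrative, not taken from the cited papers):

```python
import numpy as np

def mape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean Absolute Percentage Error, in percent.

    Assumes y_true contains no zeros (typical for traffic volumes)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# Hypothetical traffic counts, purely for demonstration.
observed = np.array([100.0, 120.0, 80.0])
predicted = np.array([110.0, 108.0, 86.0])
print(f"MAPE = {mape(observed, predicted):.2f}%")  # MAPE = 9.17%
```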
“…These researchers have achieved rich results, including long-, medium- and short-term prediction of traffic station flow [20], transportation mode flow [18,21] and traffic networks [19], building on massive amounts of traffic data and on the widely used LSTM, GRU and other algorithms. Lately, the transformer algorithm has been applied successfully to traffic time-series prediction [22]; it can train the model by extracting the spatiotemporal characteristics of traffic data [23], and it also eases the dependence issue in processing long series data [24].…”
Section: Introduction
confidence: 99%
“…In terms of network architecture, a recurrent neural network keeps track of prior data and uses it to influence the output of subsequent nodes. In other words, an RNN's hidden layers are interconnected: their inputs contain both the outputs of the input layer and the outputs of the hidden layers from earlier time steps [27]. The full RNN can be conceptualized as the same neural network structure replicated endlessly over time.…”
Section: Models and Evaluation Methods
confidence: 99%
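Concretely, the recurrence the statement describes is h_t = tanh(W_x x_t + W_h h_{t-1} + b): the same weights are reused at every time step, and the previous hidden state is fed back in alongside the current input. A minimal NumPy sketch (all names and shapes are illustrative assumptions, not from the cited paper):

```python
import numpy as np

def rnn_forward(xs, W_x, W_h, b, h0):
    """Vanilla RNN: each hidden state depends on the current input AND
    the previous hidden state, which is how the network keeps track of
    prior data. Reusing the same weights each step is the 'endless
    replication' of one structure over time."""
    h = h0
    hs = []
    for x_t in xs:                            # one step per time point
        h = np.tanh(W_x @ x_t + W_h @ h + b)  # the recurrence
        hs.append(h)
    return hs

# Tiny demo: 5 time steps of a 1-d input, 3-d hidden state.
rng = np.random.default_rng(0)
xs = [rng.standard_normal(1) for _ in range(5)]
W_x = rng.standard_normal((3, 1))
W_h = rng.standard_normal((3, 3))
states = rnn_forward(xs, W_x, W_h, b=np.zeros(3), h0=np.zeros(3))
print(len(states), states[-1].shape)  # 5 (3,)
```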
“…Then, the resulting series is re-weighted by a MultiHead Attention layer and added to the previous result. Adding a layer's output to its own input is a common practice known as "residual connections" and is widely used with convolutional layers for image processing (He et al., 2015) and with Attention layers for time-series processing (Reza et al., 2022; Vaswani et al., 2017). However, it is important to note a key difference between our model and other works with similar architectures: after the MultiHead Attention layers and the residual addition, a Layer Normalization (Ba et al., 2016) operation is usually applied.…”
Section: Deep Neural Network Architecture
confidence: 99%
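The attention-plus-residual-plus-LayerNorm pattern this statement refers to is the standard post-norm transformer sub-block of Vaswani et al. (2017). A minimal PyTorch sketch (the class name and hyperparameters are illustrative assumptions, not taken from the cited works):

```python
import torch
import torch.nn as nn

class AttentionResidualBlock(nn.Module):
    """Post-norm sub-block: x -> self-attention -> add input (residual
    connection) -> LayerNorm. d_model=64, n_heads=4 are illustrative."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)  # Ba et al. (2016)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Self-attention re-weights the series ...
        attn_out, _ = self.attn(x, x, x)
        # ... the residual addition preserves the original signal,
        # and LayerNorm stabilises the summed activations.
        return self.norm(x + attn_out)

# Usage: a batch of 8 series, 12 time steps, 64 features each.
block = AttentionResidualBlock()
y = block(torch.randn(8, 12, 64))
print(y.shape)  # torch.Size([8, 12, 64])
```

Because the residual path passes the input through unchanged, the block can behave as a near-identity early in training, which is the usual motivation for this design.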