2021
DOI: 10.1109/tpami.2019.2956930

Visual Scanpath Prediction Using IOR-ROI Recurrent Mixture Density Network

citations
Cited by 44 publications
(41 citation statements)
references
References 62 publications
“…The IOR-ROI-LSTM model (Sun et al 2019) uses deep learning models to encode an image into deep features and predicted semantic masks. This data is then fed into a recurrent neural network architecture which uses channel-wise attention to inhibit certain image features and predict the next region of interest via a mixture of Gaussians.…”
Section: IOR-ROI-LSTM (mentioning)
confidence: 99%
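The final step described in the statement above, predicting the next region of interest via a mixture of Gaussians, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_from_mixture`, the diagonal-covariance assumption, and the example parameters are all hypothetical, and in the cited model the mixture parameters would come from the recurrent network rather than being hard-coded.

```python
import random

def sample_from_mixture(weights, means, stds, rng):
    """Draw one 2D point from a Gaussian mixture (diagonal covariance assumed).

    weights: mixing coefficients, one per component
    means:   (x, y) mean per component
    stds:    (sigma_x, sigma_y) per component
    """
    # Pick a component with probability proportional to its weight.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    mx, my = means[k]
    sx, sy = stds[k]
    # Sample each coordinate independently from the chosen component.
    return (rng.gauss(mx, sx), rng.gauss(my, sy))

# Hypothetical parameters in normalized image coordinates.
rng = random.Random(0)
next_roi = sample_from_mixture(
    weights=[0.7, 0.3],
    means=[(0.3, 0.4), (0.8, 0.6)],
    stds=[(0.05, 0.05), (0.10, 0.10)],
    rng=rng,
)
```

Sampling in two stages (component first, then coordinates) is the standard way to draw from a mixture; a full covariance matrix per component would need a correlated sampler instead.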
“…Several existing scanpath models draw multiple candidates from some distribution, compute a gain for each candidate and then select the candidate with the highest gain, e.g., Boccignone and Ferraro (2004), Le Meur and Liu (2015) and Sun et al (2019). This pattern, which we will call best-of-k sampling in the following, is convenient for sampling fixations but it makes it nontrivial to compute the full conditional model distribution p(f_i | f_0, …”
Section: Best-of-k Sampling (mentioning)
confidence: 99%
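The best-of-k pattern described in the statement above can be sketched as follows. This is an illustrative sketch only, not code from any of the cited models: `best_of_k`, `sample_uniform`, and `gain_near_target` are hypothetical names, and the uniform proposal and distance-based gain are assumptions chosen to keep the example small.

```python
import random

def best_of_k(sample_candidate, gain, k, rng):
    """Draw k candidates and return the one with the highest gain."""
    candidates = [sample_candidate(rng) for _ in range(k)]
    return max(candidates, key=gain)

def sample_uniform(rng):
    # Assumed proposal: a point drawn uniformly in the unit square.
    return (rng.random(), rng.random())

def gain_near_target(p, target=(0.5, 0.5)):
    # Assumed gain: negative squared distance to a hypothetical salient point.
    return -((p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2)

rng = random.Random(0)
fixation = best_of_k(sample_uniform, gain_near_target, k=10, rng=rng)
```

Sampling a fixation this way is cheap, but the selected point no longer follows the proposal distribution: its density depends on how the gain ranks all k draws, which is why the quoted statement calls the resulting conditional distribution nontrivial to compute.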
“…It is now widely admitted that when an image is viewed, the HVS gazes on salient details, which translates into eye fixations [14]. In our case, these regions are considered as our viewports and are detected using the visual scan-path model proposed in [15]. This model provides trajectories including the order and duration of fixations.…”
Section: Pre-processing (mentioning)
confidence: 99%
“…At present, there exist multiple ways to model cross-data relationships, e.g., recurrent neural network (RNN) [64], long short-term memory (LSTM) [65], [66], and gated recurrent unit (GRU) [67]. Although these existing tools can sense subtle changes over time, they are clearly not suitable in our case because these tools require their input data to be spatially well aligned.…”
Section: Multigranularity Perception (mentioning)
confidence: 99%