2017 IEEE International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2017.126
Abstract: Training a feed-forward network for fast neural style transfer of images has proven successful. However, the naive extension to process video frame by frame is prone to producing flickering results. We propose the first end-to-end network for online video style transfer, which generates temporally coherent stylized video sequences in near real-time. Two key ideas include an efficient network incorporating short-term coherence, and propagating short-term coherence to long-term, which ensures the consisten…

Cited by 257 publications (212 citation statements)
References 56 publications (166 reference statements)
“…Temporal Consistency Loss. To efficiently consider temporal coherency, we also impose a temporal consistency loss [48] which explicitly penalizes the color change along the flow trajectory:…”
Section: Loss (mentioning)
confidence: 99%
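The quoted temporal consistency loss penalizes color change along the flow trajectory: the previous frame is warped toward the current one along the optical flow, and the squared difference is averaged over pixels where the flow is valid. A minimal NumPy sketch of this idea follows; the function names, nearest-neighbor warping, and normalization are illustrative assumptions, not taken from the cited papers:

```python
import numpy as np

def warp_nearest(prev: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Warp the previous frame (H, W, C) along a backward optical flow
    field (H, W, 2) in (dx, dy) order, using nearest-neighbor sampling."""
    h, w = prev.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev[src_y, src_x]

def temporal_consistency_loss(curr: np.ndarray, prev: np.ndarray,
                              flow: np.ndarray, mask: np.ndarray) -> float:
    """Mean squared color change along the flow trajectory, restricted to
    pixels where `mask` is 1 (valid flow / non-occluded regions)."""
    warped = warp_nearest(prev, flow)
    diff = (curr - warped) ** 2
    denom = max(mask.sum() * curr.shape[-1], 1)
    return float((mask[..., None] * diff).sum() / denom)
```

With zero flow and identical consecutive frames the loss is exactly zero, which is the fixed point a temporally coherent stylization network is pushed toward. In practice the warp is differentiable bilinear sampling rather than nearest-neighbor, so the loss can backpropagate into the network.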
“…Our method is similar to the last approach in that we use aligned features to obtain temporally consistent results. In contrast to [1], [50], we use a memory unit that filters out or retains spatiotemporal features aligned along dense trajectories. More recently, V. Miclea et al. propose to exploit temporal cues [10] for depth prediction.…”
Section: Temporal Coherency (mentioning)
confidence: 99%
“…Depth prediction from images plays a significant role in autonomous driving and advanced driver assistance systems; it helps in understanding the geometric layout of a scene and can be leveraged to solve other tasks, including vehicle/pedestrian detection [1], [2], traffic scene segmentation [3], and 3D reconstruction [4]. Stereo matching is a typical approach to recovering depth that finds dense correspondences between a pair of stereo images [5], [6], [7].…”
Section: Introduction (mentioning)
confidence: 99%
“…The use of the Gram matrix to encode style similarity was the core insight in this work, and fundamentally underpins all subsequent work on deep learning for stylization. For example, recent work seeking to stylize video notes that instability in the Gram matrix over time is directly correlated with temporal incoherence (flicker) and seeks to minimize that explicitly in the optimization [88].…”
Section: A Visual Stylization (mentioning)
confidence: 99%
“…Prisma, deepart.io) in the social media space. A discussion of feed-forward architectures for visual stylization is beyond the scope of this tutorial, but we note that contemporary architectures for video stylization integrate two-stream networks (again, pre-trained): one branch dealing with the image, the other with optical flow, with the latter integrated into the loss function to minimise flicker [88], [90].…”
Section: A Visual Stylization (mentioning)
confidence: 99%