2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
DOI: 10.1109/cvpr.2017.113
|View full text |Cite
|
Sign up to set email alerts
|

Temporal Convolutional Networks for Action Segmentation and Detection

Abstract: The ability to identify and temporally segment finegrained human actions throughout a video is crucial for robotics, surveillance, education, and beyond. Typical approaches decouple this problem by first extracting local spatiotemporal features from video frames and then feeding them into a temporal classifier that captures high-level temporal patterns. We introduce a new class of temporal models, which we call Temporal Convolutional Networks (TCNs), that use a hierarchy of temporal convolutions to perform fin… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

5
823
0
1

Year Published

2019
2019
2023
2023

Publication Types

Select...
5
3
1

Relationship

0
9

Authors

Journals

citations
Cited by 1,251 publications
(913 citation statements)
references
References 30 publications
5
823
0
1
Order By: Relevance
“…Dataset: For Arabic, we use the Arabic Treebank (ATB) dataset: parts 1, 2, and 3 and follow the same data division as (Diab et al, 2013). 3 Lea et al (2017) investigated acausal convolution in their architecture but reported insignificant improvement over causal convolution in their problem space. Figure 1: A dilated acausal convolution with 3 as a filter size and dilation factors equals to 1,2, and 4.…”
Section: Methodsmentioning
confidence: 99%
See 1 more Smart Citation
“…Dataset: For Arabic, we use the Arabic Treebank (ATB) dataset: parts 1, 2, and 3 and follow the same data division as (Diab et al, 2013). 3 Lea et al (2017) investigated acausal convolution in their architecture but reported insignificant improvement over causal convolution in their problem space. Figure 1: A dilated acausal convolution with 3 as a filter size and dilation factors equals to 1,2, and 4.…”
Section: Methodsmentioning
confidence: 99%
“…Convolutional-based architectures utilize hierarchical rather than sequential relationships between the input elements. Tem-poral Convolutional Networks (TCN) is a generic family of architectures that has been developed to alleviate the problem of training deep sequential models and is shown to provide significant improvement over LSTMs across different benchmarks (Bai et al, 2018;Lea et al, 2017). TCNs integrate causal convolutions where output at a certain time is convolved only with elements from earlier times in the previous layers.…”
Section: Introductionmentioning
confidence: 99%
“…According to (10), the ERB and the center frequency are related in a non-linear fashion. The center frequencies of the human auditory filterbank are placed equally distant on the so called ERB scale which is derived by integrating 1/ERB(fc) across frequency [12] resulting in ERB scale (fHz) = 9.265 log(1 + fHz 24.7 × 9.265…”
Section: Auditory Gammatone Filterbank (A-gtf)mentioning
confidence: 99%
“…Although recurrent networks can model the price sequential information, they can hardly extract asset correlations, since they process the price series of each asset separately. Instead, we propose a correlation information net to capture the asset correlation information based on fully convolution operations [33], [42], [63]. Specifically, we devise a new temporal correlational convolution block (TCCB) and use it to construct the correlation information net, as shown in Fig.…”
Section: Correlation Information Netmentioning
confidence: 99%