Temporal Convolutional Networks for Action Segmentation and Detection

Lea, Colin; Flynn, M. D.; Vidal, René; Reiter, Austin; Hager, Gregory D.

doi:10.1109/cvpr.2017.113

Cited by 1,251 publications

(913 citation statements)

References 30 publications

Supporting

Mentioning

823

Contrasting

Unclassified

Order By: Relevance

“…Dataset: For Arabic, we use the Arabic Treebank (ATB) dataset: parts 1, 2, and 3 and follow the same data division as (Diab et al, 2013). 3 Lea et al (2017) investigated acausal convolution in their architecture but reported insignificant improvement over causal convolution in their problem space. Figure 1: A dilated acausal convolution with 3 as a filter size and dilation factors equals to 1,2, and 4.…”

Section: Methodsmentioning

confidence: 99%

“…Convolutional-based architectures utilize hierarchical rather than sequential relationships between the input elements. Tem-poral Convolutional Networks (TCN) is a generic family of architectures that has been developed to alleviate the problem of training deep sequential models and is shown to provide significant improvement over LSTMs across different benchmarks (Bai et al, 2018;Lea et al, 2017). TCNs integrate causal convolutions where output at a certain time is convolved only with elements from earlier times in the previous layers.…”

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

Efficient Convolutional Neural Networks for Diacritic Restoration

Alqahtani¹,

Mishra²,

Diab³

2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen

View full text Add to dashboard Cite

Diacritic restoration has gained importance with the growing need for machines to understand written texts. The task is typically modeled as a sequence labeling problem and currently Bidirectional Long Short Term Memory (BiLSTM) models provide state-of-the-art results. Recently, Bai et al. (2018) show the advantages of Temporal Convolutional Neural Networks (TCN) over Recurrent Neural Networks (RNN) for sequence modeling in terms of performance and computational resources. As diacritic restoration benefits from both previous as well as subsequent timesteps, we further apply and evaluate a variant of TCN, Acausal TCN (A-TCN), which incorporates context from both directions (previous and future) rather than strictly incorporating previous context as in the case of TCN. A-TCN yields significant improvement over TCN for diacritization in three different languages: Arabic, Yoruba, and Vietnamese. Furthermore, A-TCN and BiLSTM have comparable performance, making A-TCN an efficient alternative over BiLSTM since convolutions can be trained in parallel. A-TCN is significantly faster than BiLSTM at inference time (270%∼334% improvement in the amount of text diacritized per minute).1 With the exception of religious scripture and educational books for children, which are always diacritized.

show abstract

Section: Methodsmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

Efficient Convolutional Neural Networks for Diacritic Restoration

Alqahtani¹,

Mishra²,

Diab³

2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen

View full text Add to dashboard Cite

show abstract

“…According to (10), the ERB and the center frequency are related in a non-linear fashion. The center frequencies of the human auditory filterbank are placed equally distant on the so called ERB scale which is derived by integrating 1/ERB(fc) across frequency [12] resulting in ERB scale (fHz) = 9.265 log(1 + fHz 24.7 × 9.265…”

Section: Auditory Gammatone Filterbank (A-gtf)mentioning

confidence: 99%

A Multi-Phase Gammatone Filterbank for Speech Separation Via Tasnet

Ditter

Gerkmann

2020

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

View full text Add to dashboard Cite

In this work, we investigate if the learned encoder of the end-to-end convolutional time domain audio separation network (Conv-TasNet) is the key to its recent success, or if the encoder can just as well be replaced by a deterministic hand-crafted filterbank. Motivated by the resemblance of the trained encoder of Conv-TasNet to auditory filterbanks, we propose to employ a deterministic gammatone filterbank. In contrast to a common gammatone filterbank, our filters are restricted to 2 ms length to allow for low-latency processing. Inspired by the encoder learned by Conv-TasNet, in addition to the logarithmically spaced filters, the proposed filterbank holds multiple gammatone filters at the same center frequency with varying phase shifts. We show that replacing the learned encoder with our proposed multi-phase gammatone filterbank (MP-GTF) even leads to a scale-invariant source-to-noise ratio (SI-SNR) improvement of 0.8 dB. Furthermore, in contrast to using the learned encoder we show that the number of filters can be reduced from 512 to 128 without loss of performance.

show abstract

“…Although recurrent networks can model the price sequential information, they can hardly extract asset correlations, since they process the price series of each asset separately. Instead, we propose a correlation information net to capture the asset correlation information based on fully convolution operations [33], [42], [63]. Specifically, we devise a new temporal correlational convolution block (TCCB) and use it to construct the correlation information net, as shown in Fig.…”

Section: Correlation Information Netmentioning

confidence: 99%

Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning

Zhang

Zhao

Бин

et al. 2020

IEEE Trans. Knowl. Data Eng.

View full text Add to dashboard Cite

Portfolio Selection is an important real-world financial task and has attracted extensive attention in artificial intelligence communities. This task, however, has two main difficulties: (i) the non-stationary price series and complex asset correlations make the learning of feature representation very hard; (ii) the practicality principle in financial markets requires controlling both transaction and risk costs. Most existing methods adopt handcraft features and/or consider no constraints for the costs, which may make them perform unsatisfactorily and fail to control both costs in practice. In this paper, we propose a cost-sensitive portfolio selection method with deep reinforcement learning. Specifically, a novel two-stream portfolio policy network is devised to extract both price series patterns and asset correlations, while a new cost-sensitive reward function is developed to maximize the accumulated return and constrain both costs via reinforcement learning. We theoretically analyze the near-optimality of the proposed reward, which shows that the growth rate of the policy regarding this reward function can approach the theoretical optimum. We also empirically evaluate the proposed method on real-world datasets. Promising results demonstrate the effectiveness and superiority of the proposed method in terms of profitability, cost-sensitivity and representation abilities.

show abstract

Temporal Convolutional Networks for Action Segmentation and Detection

Cited by 1,251 publications

References 30 publications

Efficient Convolutional Neural Networks for Diacritic Restoration

Efficient Convolutional Neural Networks for Diacritic Restoration

A Multi-Phase Gammatone Filterbank for Speech Separation Via Tasnet

Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning

Contact Info

Product

Resources

About