2014
DOI: 10.48550/arxiv.1409.0473
Preprint

Neural Machine Translation by Jointly Learning to Align and Translate

Abstract: Neural machine translation is a recently proposed approach to machine translation. Unlike the traditional statistical machine translation, the neural machine translation aims at building a single neural network that can be jointly tuned to maximize the translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and encode a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture t…

Cited by 3,479 publications (4,988 citation statements)
References 15 publications
“…The outputs of our CNN are passed into a bidirectional LSTM (16 hidden units, tanh activation function) network as a recurrent sequence learning model. Then, the LSTM outputs are passed to the attention network [10] to produce an attention score (e_t) for each attended frame (h_t): e_t = h_t w_a, where w_a represents the attention-layer weight matrix. From e_t, an importance attention weight (a_t) is computed for each attended frame:…”
Section: Model Architecture (mentioning)
confidence: 99%
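The scoring step in the excerpt above can be written out concretely. The following is a minimal numpy sketch, not the cited authors' code: a learned weight vector w_a maps each bidirectional-LSTM output h_t to a scalar score e_t, and the scores are normalised into importance weights a_t. The softmax normalisation and the final weighted sum are common choices assumed here, since the excerpt stops before defining a_t.

import numpy as np

T, H = 20, 32                 # 20 attended frames, 2 x 16 hidden units (biLSTM)
h = np.random.randn(T, H)     # stand-in for the biLSTM outputs h_t
w_a = np.random.randn(H)      # attention-layer weights (a vector here: one score per frame)

e = h @ w_a                   # e_t = h_t w_a, one attention score per frame
a = np.exp(e - e.max())
a /= a.sum()                  # assumed softmax: importance weights a_t sum to 1
context = a @ h               # weighted sum over frames (typical downstream use)
print(a.shape, context.shape)  # (20,) (32,)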
“…The attention mechanism was first proposed in the field of natural language processing (NLP) to align the input sequence and the output sequence [52]. Working with the LSTM-based encoder-decoder architecture, the attention mechanism has revealed great power in capturing long-range dependencies in sequences.…”
Section: B. Attention Mechanism (mentioning)
confidence: 99%
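For readers unfamiliar with the alignment idea this excerpt refers to, here is a small numpy sketch of the additive attention of Bahdanau et al.: at each decoding step the previous decoder state is scored against every encoder state, and the softmax-normalised scores act as a soft alignment between output and input positions. The dimensions and variable names below are illustrative, not taken from the cited work.

import numpy as np

def softmax(x):
    x = x - x.max()
    return np.exp(x) / np.exp(x).sum()

T_src, d_enc, d_dec, d_att = 6, 10, 12, 8
h = np.random.randn(T_src, d_enc)   # encoder states, one per source word
s_prev = np.random.randn(d_dec)     # previous decoder state

W = np.random.randn(d_att, d_dec)   # projects the decoder state
U = np.random.randn(d_att, d_enc)   # projects each encoder state
v = np.random.randn(d_att)

# e_j = v^T tanh(W s_{i-1} + U h_j): additive scoring of each source position
e = np.tanh(h @ U.T + s_prev @ W.T) @ v
alpha = softmax(e)                  # soft alignment over source positions
context = alpha @ h                 # context vector fed to the decoder
print(alpha.round(2), context.shape)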
“…For the forecasting, we apply the encoder-decoder model considering the search query data using an attention mechanism [5].…”
Section: Model Structure (mentioning)
confidence: 99%
“…Owing to the rapid development of neural networks, many models have been based on these, particularly convolutional neural networks (CNNs) [38,6,11] and recurrent neural networks (RNNs) [41,40,25], which capture the temporal variation. In recent years, there has been an increase in time-series prediction models using "attention" (Transformer) to achieve state-of-the-art performance in multiple natural language processing applications [5,42]. Attention generally aggregates temporal features using dynamically generated weights, thereby enabling the network to focus on significant time steps in the past directly.…”
Section: B. Related Work (mentioning)
confidence: 99%
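The "dynamically generated weights" mentioned in this excerpt can be illustrated with a Transformer-style scaled dot-product attention over past time steps (my own sketch, not code from the cited work): the weights come from query/key dot products on the data itself, so the model can emphasise whichever past steps are relevant at the current step.

import numpy as np

rng = np.random.default_rng(0)
T, d = 10, 16                      # 10 past time steps, feature size 16
x = rng.standard_normal((T, d))    # temporal features

W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))
q = x[-1] @ W_q                    # query built from the most recent step
K = x @ W_k                        # keys for every past step
V = x @ W_v                        # values to be aggregated

scores = K @ q / np.sqrt(d)        # data-dependent relevance of each step
w = np.exp(scores - scores.max())
w /= w.sum()                       # attention weights over the T steps
summary = w @ V                    # weighted aggregation of temporal features
print(w.round(2), summary.shape)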