2014
DOI: 10.48550/arxiv.1406.1078
Preprint

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

Cited by 2,824 publications (2,741 citation statements) | References 9 publications

“…Later, variants of RNNs were designed to deal with the vanishing gradient problem, for example LSTM [6] and GRU [7]. As the third category, Convolutional Neural Networks (CNNs) were initially designed for two-dimensional data such as images, and have achieved great success on visual tasks such as image classification and object detection. Later on, 1D CNNs were proposed for time series, which keep the parallel training ability of convolutions and their strong learning ability.…”
Section: Deep Neural Network
confidence: 99%
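For reference, the GRU these excerpts cite updates its hidden state through a reset gate and an update gate. Below is a minimal PyTorch sketch of a single GRU step following the update equations of the paper above; bias terms are omitted and the weight names and dimensions are illustrative, not taken from any cited work.

```python
import torch

def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, W, U):
    """One GRU update as in Cho et al. (2014); biases omitted."""
    z = torch.sigmoid(x_t @ Wz + h_prev @ Uz)          # update gate z_t
    r = torch.sigmoid(x_t @ Wr + h_prev @ Ur)          # reset gate r_t
    h_tilde = torch.tanh(x_t @ W + (r * h_prev) @ U)   # candidate state
    return z * h_prev + (1 - z) * h_tilde              # interpolate old/new

# Smoke test with random weights; dimensions are arbitrary examples.
d_in, d_hid = 4, 3
Wz, Wr, W = (torch.randn(d_in, d_hid) for _ in range(3))
Uz, Ur, U = (torch.randn(d_hid, d_hid) for _ in range(3))
h = gru_step(torch.randn(2, d_in), torch.zeros(2, d_hid),
             Wz, Uz, Wr, Ur, W, U)
print(h.shape)  # torch.Size([2, 3])
```

The reset gate lets the unit drop irrelevant history before forming the candidate state, which is what mitigates the vanishing-gradient problem the excerpt mentions.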
“…Since the seminal work of [19], the task of VQA has attracted much research attention. The current VQA framework is mainly composed of a question feature extractor, an image feature extractor, and multi-modal fusion. Question feature extraction usually uses Long Short-Term Memory (LSTM) [20], Gated Recurrent Units (GRU) [21], or Skip-thought vectors [22]. The mainstream image feature extraction method is to use Faster R-CNN [23] instead of a traditional CNN, so that the task is connected with object detection and focuses on the salient regions of the image related to the question [24].…”
Section: A. Visual Question Answering
confidence: 99%
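The three-part pipeline this excerpt describes (question encoder, image feature extractor, multimodal fusion) can be sketched as follows. This is a generic illustration, not the cited model: the element-wise-product fusion, the mean-pooling over Faster R-CNN region features, and all dimensions are assumptions made for the sketch.

```python
import torch
import torch.nn as nn

class VQASketch(nn.Module):
    """Generic VQA skeleton: GRU question encoder plus precomputed
    region features, fused by element-wise product (one common choice;
    the cited works may differ)."""
    def __init__(self, vocab_size, emb_dim=300, hid_dim=1024,
                 img_dim=2048, n_answers=3000):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.img_proj = nn.Linear(img_dim, hid_dim)
        self.classifier = nn.Linear(hid_dim, n_answers)

    def forward(self, question_ids, region_feats):
        # question_ids: (B, T) token ids; region_feats: (B, K, img_dim)
        # region features would come from a Faster R-CNN detector
        _, h = self.gru(self.embed(question_ids))     # h: (1, B, hid_dim)
        q = h.squeeze(0)                              # question vector
        v = self.img_proj(region_feats).mean(dim=1)   # pooled image vector
        return self.classifier(q * v)                 # fused -> answer logits
```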
“…Words are represented by 300-dimensional GloVe word embeddings $D = \{w_1, w_2, \ldots, w_n\} \in \mathbb{R}^{d_h \times n}$, where $d_h = 300$ denotes the dimension of each word representation. Finally, the word vectors are fed to the Gated Recurrent Units (GRU) [21] network to encode the question embedding $Q = \{q_1, q_2, \ldots, q_n\} \in \mathbb{R}^{d_s \times n}$, where $d_s = 1024$ is the dimension of each hidden state in the GRU.…”
Section: A. Feature Extraction
confidence: 99%
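Concretely, the shapes stated in this excerpt map onto a GRU encoder as in the minimal PyTorch sketch below, assuming batch size 1 and an arbitrary question length n; the random tensor stands in for the GloVe embedding lookup.

```python
import torch
import torch.nn as nn

d_h, d_s = 300, 1024               # dimensions stated in the excerpt
gru = nn.GRU(input_size=d_h, hidden_size=d_s, batch_first=True)

n = 12                             # question length (arbitrary example)
D = torch.randn(1, n, d_h)         # stand-in for GloVe vectors {w_1..w_n}
Q, _ = gru(D)                      # one 1024-d hidden state q_i per word
print(Q.shape)                     # torch.Size([1, 12, 1024])
```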
“…a diagnosis code) from the co-occurrence information, without considering the temporally sequential nature of EHR data. Furthermore, to consider both long-term dependency and sequential information, recurrent neural networks [13], [14], [15], [16], including LSTM [18] and GRU [19], are used to learn contextualized representations of EHR data. However, even predictive systems based on these algorithms still perform far below human capability, and cannot effectively improve care for individual patients.…”
Section: Introduction
confidence: 99%
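As a rough illustration of the approach this excerpt describes, and not of any of the cited systems, a patient's visits can be encoded as multi-hot diagnosis-code vectors and fed in time order to a GRU, so each visit's representation is conditioned on the history before it. The multi-hot input format and all dimensions are assumptions made for the sketch.

```python
import torch
import torch.nn as nn

class EHRGRUEncoder(nn.Module):
    """Sketch: contextualize a visit sequence with a GRU
    (illustrative, not a cited system)."""
    def __init__(self, n_codes, emb_dim=128, hid_dim=256):
        super().__init__()
        self.visit_proj = nn.Linear(n_codes, emb_dim)  # multi-hot -> dense
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, visits):
        # visits: (B, T, n_codes), one multi-hot vector per visit, time-ordered
        x = torch.relu(self.visit_proj(visits))
        out, h = self.gru(x)           # out: per-visit contextual states
        return out, h.squeeze(0)       # h: whole-history patient summary
```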