Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) 2017
DOI: 10.18653/v1/S17-2065
DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison

Abstract: In this paper we present a deep-learning system that competed at SemEval-2017 Task 6, "#HashtagWars: Learning a Sense of Humor". We participated in Subtask A, in which the goal was, given two Twitter messages, to identify which one is funnier. We propose a Siamese architecture with bidirectional Long Short-Term Memory (LSTM) networks, augmented with an attention mechanism. Our system operates at the token level, leveraging word embeddings trained on a large collection of unlabeled Twitter messages. We ranked 2nd in…
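The abstract describes the system only at a high level; the following is a minimal sketch of a Siamese bidirectional LSTM with additive attention in that spirit, assuming PyTorch. The class names, layer sizes, and the final comparison head are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (not the authors' code) of a Siamese BiLSTM with
# additive attention for pairwise humor comparison. Dimensions and the
# comparison head are illustrative assumptions.
import torch
import torch.nn as nn


class AttentiveBiLSTMEncoder(nn.Module):
    """Encodes a token sequence into a single attention-weighted vector."""

    def __init__(self, vocab_size: int, embedding_dim: int = 300, hidden_dim: int = 150):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.bilstm = nn.LSTM(embedding_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Additive attention: one score per time step over the BiLSTM outputs.
        self.attn = nn.Linear(2 * hidden_dim, 1)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len)
        h, _ = self.bilstm(self.embedding(token_ids))   # (batch, seq_len, 2*hidden)
        scores = self.attn(h).squeeze(-1)               # (batch, seq_len)
        weights = torch.softmax(scores, dim=-1)
        return (weights.unsqueeze(-1) * h).sum(dim=1)   # (batch, 2*hidden)


class SiameseHumorRanker(nn.Module):
    """Siamese setup: the SAME encoder reads both tweets, and a small
    head predicts which of the two is funnier."""

    def __init__(self, vocab_size: int):
        super().__init__()
        self.encoder = AttentiveBiLSTMEncoder(vocab_size)  # shared weights
        self.classifier = nn.Linear(4 * 150, 2)            # [u; v] -> 2 logits

    def forward(self, tweet_a: torch.Tensor, tweet_b: torch.Tensor) -> torch.Tensor:
        u = self.encoder(tweet_a)
        v = self.encoder(tweet_b)
        # Logits over {tweet A funnier, tweet B funnier}.
        return self.classifier(torch.cat([u, v], dim=-1))
```

The key property of the Siamese setup is weight sharing: both tweets pass through the same encoder, so the comparison head receives representations from an identical feature space.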


Cited by 18 publications (22 citation statements) · References 13 publications (21 reference statements)
“…Considering neural network-based systems, LSTMs were used the most, which is expected given the sequential nature of text data. Plain LSTM models alone, using pretrained word embeddings, achieved competitive results, and DataStories (Baziotis et al., 2017) ranked third using a siamese bidirectional LSTM model with attention.…”
Section: System Analysis
confidence: 99%
“…We also compare the results with sentiment-specific word embeddings (Tang et al., 2014), where we use fully connected layers along with attention as the downstream model. For the Sem-3 dataset we compare our results with RCNN (Yin et al., 2017) and the Siamese network (Baziotis et al., 2017), which were the top-performing teams in the task. In addition, we separately train each task with the same model parameters without the multi-task framework, to observe the improvement due to multi-task learning and also to assess the ability of the proposed model architecture.…”
[Figure 3: Plot of how the intensity of the tweet changes with the words, signifying the importance of the sequence]
Section: Results
confidence: 99%
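The comparison this statement describes, training with and without a multi-task framework under the same model parameters, is commonly realized as a shared encoder with per-task output heads. A minimal sketch follows, assuming PyTorch; the names, task list, and dimensions are illustrative assumptions, not the citing paper's code.

```python
# Illustrative sketch: a shared encoder with one linear head per task.
# The single-task baseline is the same model instantiated with one head.
import torch
import torch.nn as nn


class MultiTaskModel(nn.Module):
    def __init__(self, encoder: nn.Module, encoder_dim: int, task_sizes: dict):
        super().__init__()
        self.encoder = encoder  # shared across all tasks (the multi-task part)
        self.heads = nn.ModuleDict(
            {name: nn.Linear(encoder_dim, n_out) for name, n_out in task_sizes.items()}
        )

    def forward(self, token_ids: torch.Tensor, task: str) -> torch.Tensor:
        return self.heads[task](self.encoder(token_ids))


# Multi-task: MultiTaskModel(encoder, 300, {"intensity": 1, "polarity": 3})
# Single-task baseline: MultiTaskModel(encoder, 300, {"intensity": 1})
```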
“…The weights we used for the embedding matrix of these models were the same as Baziotis et al. (2017b), pre-trained with GloVe on 330 million English tweets posted from 12/2012 to 07/2016.…”
Section: Cat-LSTM and Bin-LSTM
confidence: 99%
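Pre-trained GloVe vectors such as those the statement refers to are typically loaded into a model as an embedding matrix aligned with the task vocabulary. A minimal sketch follows, assuming the standard GloVe text format ("word v1 v2 … vd" per line); the file name and vocabulary mapping are hypothetical placeholders, not the actual resource of Baziotis et al. (2017b).

```python
# Minimal sketch of building an embedding matrix from a GloVe text file.
import numpy as np


def load_glove_matrix(path: str, word_index: dict, dim: int = 300) -> np.ndarray:
    """Returns a (vocab_size + 1, dim) matrix; row i holds the vector for
    the word mapped to index i, with random init for out-of-vocabulary words."""
    matrix = np.random.uniform(-0.05, 0.05, (len(word_index) + 1, dim))
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, vec = parts[0], parts[1:]
            if word in word_index and len(vec) == dim:
                matrix[word_index[word]] = np.asarray(vec, dtype=np.float32)
    return matrix


# Usage (hypothetical file and vocabulary):
# emb = load_glove_matrix("glove.twitter.300d.txt", word_index, dim=300)
# model.encoder.embedding.weight.data.copy_(torch.from_numpy(emb))
```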
“…For the specific task of emotion classification in textual conversation, Gupta et al. (2017) achieved an F1µ score of 0.7134 on the same dataset, using an architecture based on LSTMs. For sentiment analysis, other successful approaches also used Bi-LSTMs (Baziotis et al., 2017b) as well as transfer learning (Daval-Frerot et al., 2018).…”
Section: Introduction
confidence: 99%