Proceedings of the 12th International Workshop on Semantic Evaluation 2018
DOI: 10.18653/v1/s18-1081

LIS at SemEval-2018 Task 2: Mixing Word Embeddings and Bag of Features for Multilingual Emoji Prediction

Abstract: In this paper we present the system submitted to SemEval-2018 Task 2: Multilingual Emoji Prediction. Our system treats both languages identically: it first considers word embeddings associated with automatically computed features of different types, then applies the bagging algorithm RandomForest to predict the emoji of a tweet.
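
The abstract describes combining word embeddings with automatically computed features before classification. A minimal sketch of that feature-mixing step might look like the following. The toy embedding table, the three surface features, and the function names are all hypothetical illustrations, not the paper's actual feature set or code.

```python
from typing import Dict, List

# Hypothetical toy embeddings; a real system would load pretrained vectors.
EMBEDDINGS: Dict[str, List[float]] = {
    "love": [1.0, 0.0],
    "sun": [0.0, 1.0],
}
DIM = 2


def mean_embedding(tokens: List[str]) -> List[float]:
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not known:
        return [0.0] * DIM
    return [sum(v[i] for v in known) / len(known) for i in range(DIM)]


def surface_features(text: str) -> List[float]:
    """Illustrative handcrafted features (assumed, not taken from the paper)."""
    tokens = text.split()
    return [
        float(len(tokens)),                             # token count
        float(text.count("!")),                         # exclamation marks
        float(sum(t.startswith("#") for t in tokens)),  # hashtags
    ]


def featurize(text: str) -> List[float]:
    """Concatenate the averaged embedding with the surface features."""
    tokens = [t.strip("!?.,#").lower() for t in text.split()]
    return mean_embedding(tokens) + surface_features(text)


if __name__ == "__main__":
    print(featurize("Love the sun! #summer"))
```

Vectors built this way could then be passed to a random forest classifier (for example, scikit-learn's `RandomForestClassifier`) trained on tweet and emoji pairs. Because the features do not depend on a language-specific resource beyond the embedding table, the same pipeline can be applied to each language.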

citations
Cited by 2 publications
(2 citation statements)
references
References 10 publications
“…In various social network or mobile interfaces, you need to think of that ( ) is the "well done and victory" symbol rather than the "well done" symbol. With the help of distributed word and emoji embeddings on a same vector space, training set of tweets relates to a very limited number of specific emojis, our proposed model simplify words in the test set to map to the similar emoji as in training set [2,7,8]. This allows building a precise classifier mapping from tweets or posts to emojis.…”
Section: Models' Methodology (mentioning)
confidence: 99%
“…Twitter has abundant guidelines and frequency limits forced on its APIs, and for this purpose it obliges that all users must register an account and provide confirmation details when they query the API. One of the preeminent choices of connecting to Twitter Streaming API and downloading the data is by using R, Python using the libraries TwitteR, Tweepy [2,8]. These APIs exports data in JSON format, the sample data of the dataset is shown below.…”
Section: Dataset (mentioning)
confidence: 99%