2020
DOI: 10.1609/icwsm.v14i1.7339

MimicProp: Learning to Incorporate Lexicon Knowledge into Distributed Word Representation for Social Media Analysis

Abstract: Lexicon-based methods and word embeddings are the two widely used approaches for analyzing texts in social media. The choice of an approach can have a significant impact on the reliability of the text analysis. For example, lexicons provide manually curated, domain-specific attributes about a limited set of words, while word embeddings learn to encode some loose semantic interpretations for a much broader set of words. Text analysis can benefit from a representation that offers both the broad coverage of word …

Cited by 3 publications (4 citation statements)
References 26 publications
“…While Sentiment Analysis models have been robust in analyzing political and social issues (Elghazaly 2016), they may not suffice for measuring nuanced resonance aspects. These require both sentiment polarity and targets, linking it to Stance Detection in NLP (Küçük and Can 2020; Yan et al 2020). While there are efforts in stance detection, such as predicting attitudes towards specific topics (Darwish et al 2020), or predicting the target and attitude (Dey et al 2017), framing label prediction poses challenges that arise from similar frames possessing variable sentiments and the scarcity of frame-labeled samples, making traditional ML training problematic.…”
Section: Related Work (mentioning)
confidence: 99%
“…We make use of a publicly available Twitter dataset from a prior study [99]. This dataset contains more than 600k Twitter users that have been identified with liberal or conservative leaning based on their following behaviors [98,99]. From this data, we have identified a total of 3,100 tweets and 2,256 users that are related to "gun rights" or "gun control" discussions.…”
Section: Application Scenario and Data (mentioning)
confidence: 99%
“…(1) a word2vec [59] embedding trained on a standard Twitter corpus [9], and (2) the attribute-aligned embedding trained with MimicProp algorithm [98] that is optimized for sociolinguistic lexicons. We concatenate the two equal-sized embeddings to generate a 600-dimensional vector for each word.…”
Section: Generating Language Cues Via (mentioning)
confidence: 99%
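The concatenation described in the statement above can be illustrated with a minimal sketch. It assumes each embedding is a plain word-to-vector mapping with 300 dimensions (inferred from two equal-sized embeddings yielding 600); the function name, the toy vectors, and the zero-fill for words missing from one embedding are hypothetical choices, not details taken from the citing paper.

```python
import random

DIM = 300  # assumed per-embedding size: two equal-sized parts summing to 600


def concat_embeddings(word, general, lexicon_aligned, dim=DIM):
    """Concatenate the two vectors for `word` into one 2*dim vector.

    A word missing from either mapping gets a zero vector for that half
    (an assumption made for this sketch only).
    """
    zero = [0.0] * dim
    first = list(general.get(word, zero))
    second = list(lexicon_aligned.get(word, zero))
    return first + second


# Toy stand-ins for a general-purpose and a lexicon-aligned embedding.
random.seed(0)
general = {"rights": [random.gauss(0, 1) for _ in range(DIM)]}
lexicon_aligned = {"rights": [random.gauss(0, 1) for _ in range(DIM)]}

vec = concat_embeddings("rights", general, lexicon_aligned)
oov = concat_embeddings("unseen", general, lexicon_aligned)
```

Concatenation keeps both views of a word intact, so a downstream model can weight the general and the lexicon-aligned halves separately.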
“…Text-based embedding learning has been previously employed to understand phenomenon on social media (Alam, Joty, and Imran 2018; Chen, McKeever, and Delany 2019; Yan et al 2020; Bahgat, Wilson, and Magdy 2020), including politics on social media (Oliveira et al 2018; Hemphill and Schöpke-Gonzalez 2020). Some of the existing methods in the literature, which aim at embedding social media users based on the written content, concatenate all the posts of a user as a single document and then train a document level embedding model Pan 2017, 2018; Benton, Arora, and Dredze 2016).…”
Section: Introduction (mentioning)
confidence: 99%