2022
DOI: 10.1086/715162

Word Embeddings: What Works, What Doesn’t, and How to Tell the Difference for Applied Research

Cited by 66 publications
(46 citation statements)
References 10 publications
“…Our sample of mTurkers were drawn from a variety of English-speaking contexts and instructed to carry out the task, focusing on word use in legal and courtroom settings. While we see little reason ex ante that coders from Kenya specifically would employ different intuitive word mappings from coders of other nationalities, future research in text analysis might consider how validation exercises like the one in Rodriguez and Spirling (2022) might vary from context to context to better understand asymmetries between the context of the written language and the nationalities of coders. a new dataset of almost 10,000 criminal cases from the Kenyan High Court.…”
Section: Discussion (mentioning)
confidence: 94%
“…We provide evidence of a coethnic bonus in appeals decisions using 26 In the Appendix, we illustrate that our main results are consistent using different dictionaries and embedding models. In addition, Appendix F presents the word-embeddings validation exercise described in Rodriguez and Spirling (2022). This shows that human coders from mTurk are unable to distinguish between word relations generated by the word embeddings we employ and human-generated word relationships.…”
Section: Discussion (mentioning)
confidence: 99%
“…There are various parameters in the modeling process that can be changed to identify the best model for a given dataset. For the purpose of our analyses, we followed the recommendations of Pennington et al. (2014) and Rodriguez and Spirling (2022). To have enough context for each token, we kept a minimum occurrence of five tokens.…”
Section: Methods (mentioning)
confidence: 99%
“…Identification and scaling can be implemented through either a dictionary approach (i.e., matching target texts with a list of attribute keywords or another list of texts) or a machine learning approach. Although NLP methods are primarily developed in computational linguistics, they can also serve as robust instruments in social sciences (Rodriguez & Spirling, 2021).…”
Section: Natural Language Processing (mentioning)
confidence: 99%