2015
DOI: 10.1177/0081175015581378

A Progressive Supervised-learning Approach to Generating Rich Civil Strife Data

Abstract: “Big data” in the form of unstructured text pose challenges and opportunities to social scientists committed to advancing research frontiers. Because machine-based and human-centric approaches to content analysis have different strengths for extracting information from unstructured text, the authors argue for a collaborative, hybrid approach that combines their comparative advantages. The notion of a progressive supervised-learning approach that combines data science techniques and human coders is developed an…

Cited by 71 publications (45 citation statements)
References 53 publications
“…While there is considerable effort devoted to developing new algorithms for specific domains and problems (see, e.g., Bamman and Smith 2015; Nardulli, Althaus, and Hayes 2015), there is a dearth of empirical research to guide scholars in the selection and application of already established and packaged automated methods, especially with respect to the analysis of complex conceptual content. Can the leading fully-automated approaches to content analysis - dictionaries and unsupervised machine learning (UML) - circumvent the need for hand coding altogether?…”
Section: Introduction (mentioning)
confidence: 99%
“…Machine-assisted approaches to political event data have been in use for nearly 30 years, since the inception of the Kansas Event Data System (KEDS) (Schrodt and Gerner 1994). More recently, there have been several approaches which incorporate machine learning methods into their pipelines (Croicu and Weidmann 2015, Marakov et al 2015, Nardulli, Althaus and Hayes 2015, Wueest, Rothenhäusler and Hutter 2013). MPEDS differs from other automated event data projects because it focuses on coding for protest events rather than for a wider range of political events and because it aims to collect rich information about each event.…”
Section: Methods (mentioning)
confidence: 99%
“…The same applies to China (Steinhardt 2016), and other Asian nations. On the other hand, although cross-national datasets (e.g., Banks and Wilson 2017; Jenkins et al 2012; Leetaru and Schrodt 2013; Nardulli, Althaus, and Hayes 2015; Raleigh et al 2010) sometimes include Myanmar, their intransparent sources or sole coding of international news outlets means they are of limited use, particularly when analysing single countries at critical junctures, such as in crisis or regime change, and when micro-foundations matter (Nam 2006b). Even though cross-sectional datasets differ in data quality, and promising new data has recently been presented (Weidmann and Espen forthcoming), cross-sectional datasets - by definition - provide more breadth than depth.…”
Section: The Motivation For Protest Data On Myanmar (mentioning)
confidence: 99%