2019
DOI: 10.1017/pan.2018.62

Testing the Validity of Automatic Speech Recognition for Political Text Analysis

Abstract: The analysis of political texts from parliamentary speeches, party manifestos, social media, or press releases forms the basis of major and growing fields in political science, not least since advances in “text-as-data” methods have rendered the analysis of large text corpora straightforward. However, a lot of sources of political speech are not regularly transcribed, and their on-demand transcription by humans is prohibitively expensive for research purposes. This class includes political speech in certain le…

Cited by 26 publications (21 citation statements)
References: 47 publications
“…from Wordfish and Wordshoal) correlate highly with estimates from models based on human coding suggests that collecting transcriptions of debates without extensive and costly human coding may be sufficient for many research projects. Proksch et al (2019) have recently validated the use of automatic speech recognition systems to transcribe video and audio material for QTA. Based on these findings, future projects can analyze the abundance of available video data from further Council configurations and other time frames by simply feeding it into an automatic speech recognition system (e.g.…”
Section: Results
Type: mentioning
confidence: 99%
“…from Wordfish and Wordshoal) correlate highly with estimates from models based on human coding suggests that collecting transcriptions of debates without extensive and costly human coding may be sufficient for many research projects. Proksch et al. (2019) have recently validated the use of automatic speech recognition systems to transcribe video and audio material for QTA.…”
Section: Results
Type: mentioning
confidence: 99%
“…The literature suggests that these ASR systems can achieve very low error rates (Prabhavalkar et al, 2017). Proksch et al (2019) analyze political speech in EU State of the Union debates and show that using the GCP's auto-generated transcripts for bag-of-words text models is comparable to using the human-annotation. We also find GCP's algorithm to be accurate and suitable for our task.…”
Section: Bernard Sanders)
Type: mentioning
confidence: 99%
“…We contribute to the small but rapidly growing political science literature on the analysis of audio, image, and video data (e.g., Dietrich, 2018; Dietrich et al, 2018; Joo and Steinert-Threlkeld, 2018; Knox and Lucas, 2018; Torres, 2018) by conducting an empirical validation study for machine classification of video files. The rich human coding data from the WMP project provide an unusual opportunity to directly compare automated coding with human coding using a large number of video files across a number of variables (see Proksch et al, 2019, for a validation study of automatic speech recognition). Our findings suggest that coding tasks done by student research assistants can be accomplished by machines to a similar degree of accuracy.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Harris (2015), for example, studies how text-as-data approaches can be leveraged to accurately classify names according to demographic characteristics, aiding in studies of racial and ethnic politics and gender and women's studies. And Proksch et al (2019) have recently explored ways to glean meaning from spoken recordings, using automated speech recognition software to convert speeches and other recordings into tractable text-as-data structures. These techniques showcase the ways in which political scientists, working at the intersection of methods and substantive knowledge, are expanding the boundaries of the possible in the quantitative analysis of text.…”
Section: Text Analytical Tools In the Pages Of Political Analysis
Type: mentioning
confidence: 99%