2018
DOI: 10.3758/s13428-018-1037-4

Is automatic speech-to-text transcription ready for use in psychological experiments?

Abstract: Verbal responses are a convenient and naturalistic way for participants to provide data in psychological experiments (Salzinger, 1959). However, audio recordings of verbal responses typically require additional processing, such as transcribing the recordings into text, as compared with other behavioral response modalities (e.g. typed responses, button presses, etc.). Further, the transcription process is often tedious and time-intensive, requiring human listeners to manually examine each moment of recorded spe…

Cited by 24 publications (11 citation statements)
References 30 publications
“…Instead, we intend to provide a proof-of-concept that an ASR can be used to analyze certain aspects of spontaneous speech, allowing for large-scale use of natural speech for research ends. A similar approach has recently been taken by Ziman et al (2018), who showed that an ASR can be used reliably to transcribe speech data from psychological experiments, in their case a verbal recall memory test. In their study, Ziman and colleagues provided the speech context to their speech-to-text engine.…”
Section: Discussion (citation type: mentioning)
confidence: 96%
“…The strength of the correlations might be considered an index of how well a given measure, in the context of spontaneous speech elicitation, is suited to be transcribed by an ASR, or whether it may require manual coding. (For a similar correlational approach to evaluate transcription accuracy, see Ziman et al, 2018.) As we planned to carry out correlations for many measures of interest, we applied a Bonferroni correction (four measures and three questions resulted in a corrected alpha level of 0.05/12 = 0.004).…”
Section: Methods (citation type: mentioning)
confidence: 99%
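The Bonferroni arithmetic in the statement above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `bonferroni_alpha` and the variable names are assumptions introduced here, not anything taken from the citing paper.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Return the per-test alpha level after a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Values quoted in the citation statement: four measures, three questions.
n_measures = 4
n_questions = 3
n_tests = n_measures * n_questions  # one correlation per measure/question pair

corrected = bonferroni_alpha(0.05, n_tests)
print(f"{n_tests} tests, corrected alpha = {corrected:.3f}")
```

The unrounded value is 0.05/12 ≈ 0.00417, which the statement reports as 0.004.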
“…The use of different NLP features, classifiers, and learning strategies discussed in this study seems promising to develop a system for the real-time detection of reminiscence in everyday conversations in German of older adults. Such a system could leverage audio-to-text software [78] of advanced methods from automated coding [24] to automate the transcription of conversations before NLP preprocessing and the computation of machine learning predictions.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…Transcripts of the satirical news shows and the liberal and conservative news shows were collected by means of the command-line program youtube-dl (available at: http://ytdl-org.github.io/youtube-dl/), which we used to download automatic captions from YouTube. We used YouTube because previous research has found such automatic speech-to-text transcriptions to be accurate (Ziman et al 2018). In some respects, they may even be more reliable than the original US television subtitles because real-time subtitles can contain typos and are subject to strict character and time restrictions (Szarkowska, Cintas, and Gerber-Morón in press).…”
Section: Collection Of Transcripts (citation type: mentioning)
confidence: 99%
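The caption-download step described in the statement above might look roughly like the following. The citing paper does not say which youtube-dl options were used, so the flag selection here (`--skip-download`, `--write-auto-sub`, `--sub-lang`, `--sub-format`) is an assumption, and the helper name `build_caption_command` and the placeholder URL are hypothetical. The sketch only builds and prints the command; it does not run it.

```python
import shlex
from typing import List


def build_caption_command(video_url: str, lang: str = "en") -> List[str]:
    """Build a youtube-dl command that fetches automatic captions only.

    The option choice is an assumption for illustration, not the
    citing authors' documented procedure.
    """
    return [
        "youtube-dl",
        "--skip-download",      # do not download the video itself
        "--write-auto-sub",     # request the automatically generated captions
        "--sub-lang", lang,     # caption language code
        "--sub-format", "vtt",  # preferred caption file format
        video_url,
    ]


# Placeholder URL; VIDEO_ID is hypothetical.
cmd = build_caption_command("https://www.youtube.com/watch?v=VIDEO_ID")
print(" ".join(shlex.quote(part) for part in cmd))
```

To actually fetch captions, the list could be passed to `subprocess.run(cmd, check=True)` on a machine with youtube-dl installed.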