2019
DOI: 10.1111/bjop.12416

‘Please sort these voice recordings into 2 identities’: Effects of task instructions on performance in voice sorting studies

Abstract: We investigated the effects of two types of task instructions on performance of a voice sorting task by listeners who were either familiar or unfamiliar with the voices. Listeners were asked to sort 15 naturally varying stimuli from two voice identities into perceived identities. Half of the listeners were to sort the recordings freely into as many identities as they perceived; the other half were forced to sort stimuli into two identities only. As reported in previous studies, unfamiliar listeners formed more…

Citations: cited by 16 publications (20 citation statements)
References: 29 publications (43 reference statements)
“…Furthermore, we computed an index of each participant’s ability of “telling people together” and “telling people apart”. These indices were computed in the same way as described for other voice sorting tasks (see Lavan, Burston, & Garrido, 2019; Lavan, Burston, Ladwa, et al., 2019; Lavan, Merriman, et al., 2019). In brief, we created 30 × 30 item-wise response matrices for each participant (catch items were excluded), which are symmetrical around the diagonal.…”
Section: Results (mentioning)
confidence: 99%
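The item-wise scoring described in the statement above lends itself to a short computation. Below is a minimal Python sketch, under the assumption that "telling people together" is scored as the proportion of same-identity item pairs a participant placed into the same perceived identity, and "telling people apart" as the proportion of different-identity pairs kept separate. The function name, the generic item count, and the four-item example are illustrative; this is not the authors' code.

```python
# Minimal sketch of "telling together" / "telling apart" indices from a sorting
# response, assuming pair-wise scoring over an item-wise response matrix
# (symmetrical around the diagonal, as in the quoted description).
import numpy as np

def sorting_indices(perceived_groups, true_identities):
    """perceived_groups, true_identities: length-N label lists, one label per recording."""
    n = len(perceived_groups)
    # 1 where two items were sorted into the same perceived identity, 0 otherwise.
    same_perceived = np.array([[perceived_groups[i] == perceived_groups[j]
                                for j in range(n)] for i in range(n)], dtype=float)
    same_true = np.array([[true_identities[i] == true_identities[j]
                           for j in range(n)] for i in range(n)], dtype=bool)
    # Consider each unordered pair once; exclude the diagonal.
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)
    telling_together = same_perceived[upper & same_true].mean()       # same-identity pairs grouped
    telling_apart = 1.0 - same_perceived[upper & ~same_true].mean()   # different-identity pairs separated
    return telling_together, telling_apart

# Example: four recordings from two true identities, with one recording mis-sorted.
print(sorting_indices(["A", "A", "B", "A"], ["A", "A", "B", "B"]))  # (0.5, 0.5)
```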
“…The intensity of all stimuli was root-mean-square normalised to 67.7 dB using Praat (Boersma & Weenink, 2013). These stimuli were then added to a PowerPoint slide, represented by numbered boxes (see Lavan, Burston, & Garrido, 2019; Lavan, Burston, Ladwa, et al., 2019; Lavan, Merriman, et al., 2019).…”
Section: Methods (mentioning)
confidence: 99%
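For readers who want to reproduce the normalisation step mentioned above outside Praat, here is a minimal sketch of RMS intensity scaling to a target level in dB. Treating the waveform samples as pressure values referenced to 20 µPa, and the use of the soundfile library, are assumptions of this sketch; the cited studies used Praat's own intensity scaling (Boersma & Weenink, 2013).

```python
# Sketch of RMS normalisation to a target intensity in dB (re 20 µPa),
# analogous to Praat's "Scale intensity..." step; not the original pipeline.
import numpy as np
import soundfile as sf  # pip install soundfile

def rms_normalise(in_path, out_path, target_db=67.7, ref_pressure=2e-5):
    samples, sr = sf.read(in_path)
    if samples.ndim > 1:                       # average channels to mono
        samples = samples.mean(axis=1)
    current_rms = np.sqrt(np.mean(samples ** 2))
    target_rms = ref_pressure * 10 ** (target_db / 20.0)   # dB re 20 µPa
    sf.write(out_path, samples * (target_rms / current_rms), sr)

# rms_normalise("stimulus_01.wav", "stimulus_01_norm.wav")
```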
“…This within‐person variability is not restricted to the visual domain but is also a prominent feature in human voices, such that the same person’s voice can sound dramatically different from situation to situation (e.g., shouting over background noise, singing, or laughing; Lavan, Burton, Scott & McGettigan, 2019). Within‐person variability has been shown to dramatically affect perceptual judgements of voice identity when listeners were not familiar with a voice: In a series of voice sorting studies, unfamiliar listeners were unable to accurately perceive speaker identity from naturally varying voice recordings (i.e., excerpts taken from across a television series; Lavan, Burston & Garrido, 2019; Lavan, Burston, Ladwa, Merriman, Knight & McGettigan, 2019; Lavan, Merriman, Ladwa, Burston, Knight & McGettigan, 2019). Specifically, they perceived variable recordings of the same voice identity as a number of different people, thus misinterpreting within‐person variability as between‐person variability.…”
Section: Introduction (mentioning)
confidence: 99%
“…An important task within the NLP area is automated translation, which usually involves feature analysis (based on neural networks as well as Markov chains, for which software tools such as HTK or SPHINX are available [22]), unit detection (such as phonemes [23]), syntactic analysis (for example, for word validation [24]), and translation proper via a look-up procedure in a database. Remarkably, to the best of our knowledge, voice-to-voice translation by direct learning is largely absent from the literature, with a few exceptions for classification [25][26][27][28] and some others for translation: [2], which consists of an 8-layer LSTM neural network, an attention model for phoneme recognition, and a voice synthesis module; and [3], which uses a codebook of phonemes for translation in three stages: BiLSTM layers encoding the input data into phonemes, convolutional middle layers, and final LSTM layers decoding the phonemes into an audio signal.…”
Section: Introduction (mentioning)
confidence: 99%
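As a rough illustration of the three-stage encoder / convolution / decoder design the passage attributes to [3], here is a hedged PyTorch sketch. The class name VoiceToVoiceSketch, the layer counts, and all dimensions are illustrative assumptions; it omits the phoneme codebook lookup and the attention components and is not the reference implementation.

```python
# Illustrative three-stage pipeline: BiLSTM encoder to phoneme-like codes,
# convolutional middle layers, LSTM decoder back to acoustic frames.
import torch
import torch.nn as nn

class VoiceToVoiceSketch(nn.Module):
    def __init__(self, n_features=80, n_phonemes=64, hidden=256):
        super().__init__()
        # Stage 1: BiLSTM encoder mapping acoustic frames to phoneme-like codes.
        self.encoder = nn.LSTM(n_features, hidden, num_layers=2,
                               batch_first=True, bidirectional=True)
        self.to_phonemes = nn.Linear(2 * hidden, n_phonemes)
        # Stage 2: convolutional middle layers over the code sequence.
        self.conv = nn.Sequential(
            nn.Conv1d(n_phonemes, n_phonemes, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(n_phonemes, n_phonemes, kernel_size=5, padding=2), nn.ReLU(),
        )
        # Stage 3: LSTM decoder producing output acoustic frames.
        self.decoder = nn.LSTM(n_phonemes, hidden, num_layers=2, batch_first=True)
        self.to_audio = nn.Linear(hidden, n_features)

    def forward(self, x):                      # x: (batch, time, n_features)
        enc, _ = self.encoder(x)
        codes = self.to_phonemes(enc)          # (batch, time, n_phonemes)
        mid = self.conv(codes.transpose(1, 2)).transpose(1, 2)
        dec, _ = self.decoder(mid)
        return self.to_audio(dec)              # (batch, time, n_features)

# Shape check with dummy spectrogram frames.
print(VoiceToVoiceSketch()(torch.randn(2, 100, 80)).shape)  # torch.Size([2, 100, 80])
```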