A two-year experiment on voice identification through visual inspection of spectrograms was performed with the twofold goal of checking Kersta's claims in this matter [Nature 196, 1253–1257 (1962)] and testing models including variables related to forensic tasks. The 250 speakers used in this experiment were randomly selected from a homogeneous population of 25 000 males speaking general American English, all students at Michigan State University. A total of 34 996 experimental trials of identification were performed by 29 trained examiners. Each trial involved up to 40 known voices under various conditions: closed and open trials, contemporary and noncontemporary spectrograms, nine or six clue words spoken in isolation, in a fixed context, or in a random context, etc. The examiners were forced to reach a positive decision (identification or elimination) in each instance, taking an average time of 15 minutes. Their decisions were based solely on inspection of spectrograms; identification by listening to the voices was excluded from this experiment. The examiners graded their self-confidence in their judgments on a 4-point scale (1 and 2, uncertain; 3 and 4, certain). Results of this experiment confirmed Kersta's experimental data, which involved only closed trials of contemporary spectrograms and clue words spoken in isolation. Experimental trials of this study correlated with forensic models (open trials, fixed and random contexts, noncontemporary spectrograms) yielded error rates of approximately 6% false identifications and approximately 13% false eliminations. The examiners judged approximately 60% of their wrong answers and 20% of their right answers as "uncertain." This suggests that if the examiners had been able to express no opinion when in doubt, only 74% of the total number of tasks would have had a positive answer, with approximately 2% errors of false identification and 5% errors of false elimination. The differences between the experimental trials of identification or elimination performed by the examiners of this study and the tasks performed by a professional examiner in real cases are discussed.
Seventy-two students, representing equally Hindi, Spanish, and Japanese speakers, served as experimental subjects. Each language group was divided equally between students who were more proficient and students who were less proficient in aural comprehension of English. All participants recorded a philosophical essay in both their native language and English. The statistical treatment was based upon the median duration of each speaker's distribution of pauses in reading a two-minute portion of the passage. The groups of higher and lower proficiency in the aural reception of English differed significantly in both the median length of pauses and the accompanying semi-interquartile range. Language groups differed in median duration of pauses, but not in semi-interquartile range. There was no difference in the length or distribution of pauses that the students used in reading in their native language and in English.
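The two per-speaker statistics named in this abstract, the median pause duration and the semi-interquartile range (half the distance between the first and third quartiles), can be sketched as follows. This is only an illustration: the function name `pause_summary`, the pause durations, and the choice of quartile method are assumptions, since the abstract does not say how the quartiles were computed.

```python
import statistics


def pause_summary(durations_ms):
    """Return (median, semi-interquartile range) of one speaker's pauses.

    The quartile convention (Python's "inclusive" method) is an assumption;
    the abstract does not specify one.
    """
    q1, _q2, q3 = statistics.quantiles(durations_ms, n=4, method="inclusive")
    median = statistics.median(durations_ms)
    siqr = (q3 - q1) / 2  # half the spread of the middle 50% of pauses
    return median, siqr


# Hypothetical pause durations in milliseconds, not data from the study.
example_pauses = [200, 300, 400, 500, 600, 700, 800, 900, 1000]
median_ms, siqr_ms = pause_summary(example_pauses)
print(f"median = {median_ms} ms, semi-interquartile range = {siqr_ms} ms")
```

The median and semi-interquartile range are both insensitive to a few very long pauses, which is a common reason to prefer them over the mean and standard deviation for skewed duration data.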
Vowels were segmented into 15 different temporal segments taken from the middle of the vowel and ranging from 4 to 60 msec, then presented to 6 subjects with normal hearing. The mean temporal-segment recognition threshold was 15 msec, with a range from 9.3 msec for /u/ to 27.2 msec for /a/. Misidentified vowels were most often confused with the vowel adjacent to them on the vowel-hump diagram. There was no significant difference between the cardinal and noncardinal vowels.