The effects of ingesting ethanol have been shown to be somewhat variable in humans. To date, there appear to be but few universals. Yet, the question often arises: is it possible to determine if a person is intoxicated by observing them in some manner? A closely related question is: can speech be used for this purpose and, if so, can the degree of intoxication be determined? One of the many issues associated with these questions involves the relationships between a person's paralinguistic characteristics and the presence and level of inebriation. To this end, young, healthy speakers of both sexes were carefully selected and sorted into roughly equal groups of light, moderate, and heavy drinkers. They were asked to produce four types of utterances during a learning phase, when sober and at four strictly controlled levels of intoxication (three ascending and one descending). The primary motor speech measures employed were speaking fundamental frequency, speech intensity, speaking rate, and nonfluencies. Several statistically significant changes were found for increasing intoxication; the primary ones included rises in F0, in task duration, and in nonfluencies. Minor gender differences were found, but they lacked statistical significance. So did the small differences among the drinking category subgroups and the subject groupings related to levels of perceived intoxication. Finally, although it may be concluded that certain changes in speech suprasegmentals will occur as a function of increasing intoxication, these patterns cannot be viewed as universal, since a few subjects (about 20%) exhibited no (or negative) changes.
This study was designed to evaluate commonly used voice stress analyzers, in this case the layered voice analysis (LVA) system. The research protocol involved the use of a speech database containing materials recorded while highly controlled deception and stress levels were systematically varied. Subjects were 24 males and 24 females (age range 18-63 years) drawn from a diverse population. All held strong views about some issue; they were required to make intense contradictory statements while believing that they would be heard/seen by peers. The LVA system was then evaluated by means of a double-blind study using two types of examiners: a pair of scientists trained and certified by the manufacturer in the proper use of the system and two highly experienced LVA instructors provided by this same firm. The results showed that the "true positive" (or hit) rates for all examiners averaged near chance (42-56%) for all conditions, types of materials (e.g., stressed vs. unstressed, truth vs. deception), and examiners (scientists vs. manufacturers). Most importantly, the false positive rate was very high, ranging from 40% to 65%. Sensitivity statistics confirmed that the LVA system operated at about chance levels in the detection of truth, deception, and the presence of high and low vocal stress states.
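The relationship between hit rate, false positive rate, and chance-level sensitivity can be illustrated with a minimal sketch. It assumes the sensitivity index d' from signal detection theory; the abstract does not say which sensitivity statistic the authors used, and the function names and counts below are hypothetical, not data from the study.

```python
from statistics import NormalDist


def detection_rates(hits, misses, false_alarms, correct_rejections):
    """Return (hit rate, false positive rate) from raw decision counts."""
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    return hit_rate, fa_rate


def d_prime(hit_rate, fa_rate):
    """Sensitivity index d' = z(hit rate) - z(false positive rate).

    Both rates must lie strictly between 0 and 1; inv_cdf raises otherwise.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for illustration only: an examiner who flags
# half of the deceptive samples and half of the truthful ones.
hr, fa = detection_rates(hits=50, misses=50, false_alarms=50, correct_rejections=50)
print(hr, fa, d_prime(hr, fa))
```

When the hit rate and the false positive rate are equal, d' is zero: the examiner's decisions carry no information about the true state, which is what operating "at about chance levels" means even when the hit rate alone looks moderate.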
This paper is the second of a series; the first has been published (J Forensic Sci, 1998;43:1153–62). The goal in the initial pair of experiments was to determine if speakers (actors) could effectively mimic the speech of intoxicated individuals and also volitionally reduce the degradation to their speech that resulted from severe inebriation. To this end, two highly controlled experiments involving 12 actor-speakers were carried out. It was found that, even when sober, nearly all of them were judged drunker (when pretending) than when they actually were severely intoxicated. In the second experiment, they tried to sound sober when highly intoxicated; here most were judged less inebriated than they were. The goal of this second paper is to identify some of the speech characteristics that allowed the subjects to achieve the cited illusions. The focus here is on four paralinguistic factors: fundamental frequency (F0), speaking rate, vocal intensity, and nonfluency level. For the simulation of intoxication study, it was found that F0 was raised along with increased intoxication but raised even more when this state was feigned. A slowing of speaking rate was associated with increasing intoxication, but this shift also was greater when the speaker simulated intoxication. The most striking contrast was found for the nonfluencies; they were doubled for actual intoxication, but quadrupled when intoxication was simulated. On the other hand, the shifts exhibited by the subjects when they attempted to sound sober were not as clear cut. Indeed, no systematic relationships were found here for either F0 or vocal intensity. Both speaking rate and the number of nonfluencies shifted appropriately, but these changes were not statistically significant. In sum, discernable suprasegmental relationships occurred for both studies (but especially the first); further, it is predicted that useful cues also will be found embedded in the segmentals (the sounds of speech).
The purpose of this study was to evaluate a commonly used voice stress analyzer, the National Institute of Truth Verification's (NITV) Computer Voice Stress Analyzer (CVSA), using a speech database containing materials recorded (i) in the laboratory, while highly controlled deceptive and shock-induced stress levels were systematically varied, and (ii) during a field procedure. Subjects were 24 males and 24 females (age range 18-63 years) drawn from a representative population. All held strong views on an issue and were required to make sharply derogatory statements about it. The CVSA system was then evaluated in a double-blind study using three sets of examiners: (i) two UF scientists trained/certified by NITV in CVSA operation, (ii) three experienced NITV operators provided by the manufacturer, and (iii) five experimental phoneticians. The results showed that the "true positive" (or hit) rates for all examiners ranged from chance to somewhat higher levels (c. 50-65%) for all conditions and types of materials (e.g., stressed vs. unstressed, truth vs. deception). However, the false-positive rate was just as high, and often higher. Sensitivity statistics demonstrated that the CVSA system operated at about chance level.
Two groups of subjects were administered controlled doses of alcohol while breath alcohol concentration (BrAC) measurements were made at regular intervals. They were recorded reading a 30-s passage when they reached preset BrAC windows. Fundamental frequency measurements were calculated and compared for sober (0.00 BrAC) and intoxicated (0.12 BrAC) productions. The number of misarticulations occurring during the readings was also assessed. In the first study, subjects were grouped on the basis of whether they were rated as sounding intoxicated at 0.12 BrAC (ratings were performed by 50 auditors using a 5-pt. scale). Subjects who sounded intoxicated were placed in one group, while those who did not were placed in a second. The first group showed a consistent, but statistically nonsignificant, decrease in F0 as a result of intoxication; group 2's F0 changes were not consistent. In addition, the first group showed a higher mean increase in misarticulations than did group 2. The second population was grouped by drinking level (heavy, medium, or light); none showed a statistically significant change in F0. Moreover, misarticulations increased (nonsignificantly) as drinking level increased. The results will be correlated with data from other studies. [Research supported by NIH.]