Although creaky voice, or vocal fry, is a widely studied phonation mode, open questions remain about its acoustic characterization and automatic recognition, largely because creak varies significantly with conversational context. In this study, we introduce an exploratory creak recognizer based on a convolutional neural network (CNN), developed specifically for emergency calls. The study focuses on recognizing creaky voice in authentic emergency calls because creak detection could potentially provide information about the caller's emotional state or an attempt at voice disguise. We built the CNN recognition system using emergency call recordings and other out-of-domain speech recordings and compared its results with those of an existing, widely used creaky voice detection system: with poor-quality emergency call recordings as test data, the existing system achieved an F1 score of 0.41, whereas our CNN system achieved an F1 score of 0.64. The results show that the CNN system can perform moderately well on challenging test data with a limited amount of training data, and that it has the potential to achieve higher F scores when more emergency calls are used for model training.
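For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1 score is computed from detection counts. It assumes frame-level binary labels (1 = creaky, 0 = not creaky); the abstract does not state the evaluation unit, and the label sequences below are hypothetical, not data from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 = 2*TP / (2*TP + FP + FN), or 0.0 if the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical frame-level labels (assumption for illustration only).
reference = [1, 1, 0, 0, 1, 0, 1, 0]  # manual annotation
predicted = [1, 0, 0, 1, 1, 0, 1, 0]  # detector output

pairs = list(zip(reference, predicted))
tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # creak correctly detected
fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false alarms
fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # missed creak

score = f1_score(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} F1={score:.2f}")
```

Because F1 ignores true negatives, it suits detection tasks such as this one, where the non-creaky class is likely to dominate the recordings.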
Speech prosody has been applied in numerous speech emotion recognition tasks. Yet, especially in forensic speech science, a need for acoustic-phonetic analyses with human evaluation still exists, since many current speech emotion models are trained on speech data in which emotions are treated as constant states and the dynamic effects of the interlocutor are disregarded. For instance, during an emergency call, the caller’s emotional prosody varies with the communication with the emergency operator, which causes problems for existing speech emotion models when analysing individual emergency recordings. In this phonetic case study, prosodic variation was investigated in two suicidal emergency calls; eight prosodic features from two adult male callers were analysed before and after hearing the emergency operators’ offer to help. In addition, the existence of a possible linear association between the emergency operator’s and the caller’s prosodic features was evaluated. The results show that caller and operator pitch are negatively correlated (−0.33), and that half of the callers’ prosodic features vary significantly (p < 0.05) after hearing the offer of help.
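A linear association of the kind reported above is commonly quantified with a correlation coefficient. The sketch below assumes Pearson's r, which the abstract does not name explicitly; the pitch values and variable names are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and root sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical mean pitch per turn in Hz (assumption for illustration only).
operator_f0 = [180.0, 190.0, 200.0, 210.0, 220.0]
caller_f0 = [150.0, 140.0, 145.0, 130.0, 125.0]

r = pearson_r(operator_f0, caller_f0)
print(f"r = {r:.2f}")  # a negative value: caller pitch falls as operator pitch rises
```

A negative coefficient, as in the reported result, means that higher operator pitch tends to co-occur with lower caller pitch; it does not by itself indicate which speaker influences the other.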
The purpose of this article is to measure and evaluate commonly identified, yet rather inconsistent, acoustic correlates of speech under stress in authentic emergency call recordings. In this study, ten acoustic parameters are measured from manually segmented /i/ vowels, and hypotheses based on previous studies are statistically tested for a set of female emergency call recordings. The statistical analyses confirm that, in comparison to the neutral speech group, the speech under stress group differs in fundamental frequency, shimmer, harmonicity, Hammarberg index, F1, F2, F3 and formant dispersion, which mostly supports the findings of previous studies. Conversely, jitter and vowel duration show no statistically significant difference between the speech under stress group and the neutral group. Furthermore, the results substantiate that stress recognition using different acoustic parameters is feasible from data sets as small as vowel segments; however, the effect of inter-speaker variation must not be underestimated. In future research, a stress detection model for telephone band-limited speech, based on the optimal combination of acoustic parameters, will be created.
Contact-induced lexical variation of ‘and’, ‘or’, and ‘though’ lexemes in Border Karelian dialects
This paper studies the contact-induced lexical variation of lexemes with the meanings ‘and’, ‘or’, and ‘though’ in Border Karelian dialects. Lexical variation is understood here as a situation typical of bilingual speech, in which particle-like L1 items vary synchronically with L2 items of the same meaning. In Border Karelian dialects, the L1 items are global copies of Russian origin that have become conventionalised in Karelian, and the L2 items are their close semantic equivalents copied from Finnish. The research data is the Corpus of Border Karelia, which consists of approximately 119 hours of dialect interviews with resettled Karelians recorded in the 1960s and 1970s. At that time, the Karelian-speaking informants had lived in the Finnish-speaking area for over 20 years. The data is examined in a corpus-driven manner, both quantitatively and qualitatively. In the quantitative analysis, the effect of the extra-linguistic variables raised in earlier studies (home municipality, sex and age) on the variation between the Karelian items and the items copied from Finnish is tested with nonparametric statistical methods. In the qualitative analysis, differences in the meaning, category and function of the lexeme pairs are studied using data examples and the most common two-word clusters. The study draws on the code-copying framework of contact linguistics and on theories of the borrowing of particles in bilingual speech. The results show that home municipality is the only extra-linguistic variable with a statistically significant difference in the variation of the lexeme pairs (p < 0.001), which is explained by the geographical location of the municipalities. The data examples and clusters show that the members of each lexeme pair are close semantic equivalents, but their functions are not fully identical. The items stand in a paradigmatic relationship, although not all of the functions of the Finnish particles have been copied. In conclusion, bilingualism, geography, and the pragmatically dominant prestige language of bilinguals explain contact-induced language change better than social variables do. In addition, the terminology of the code-copying framework proved apt for describing a conjunction system with multiple sources.