Human phoneme recognition depending on speech-intrinsic variability

Meyer, Bernd T.; Jürgens, Tim; Wesker, Thorsten; Brand, Thomas; Kollmeier, Birger

doi:10.1121/1.3493450

Cited by 39 publications

(42 citation statements)

References 36 publications

“…3 dB if different speakers within the same language are used. Meyer et al. [10], for example, found an effect of 2.7 dB for speakers that varied in dialect and accent and an effect of speaking rate, effort or style amounting to 1.4 dB within the same speaker. It turns out that in general, the telephone version has higher (i.e., worse) SRTs than the broadband version using headphones. This is probably due to a loss of speech information and unilateral presentation when using a telephone [4].…”

Section: Digit Triplets Test (mentioning)

confidence: 96%

HearCom: Hearing in the Communication Society

Vlaming, Kollmeier, Dreschler et al. 2011

Acta Acustica united with Acustica

Self Cite


A group of 28 research partners joined the EU-funded project HearCom with the overall aim to improve hearing communication. One of the main achievements has been the provision of advanced hearing screening tests by telephone and Internet. In addition, the project aimed to harmonize hearing diagnostic tests for European languages. For this, the concept of an Auditory Profile was defined, on which a number of diagnostic hearing tests were developed in several languages. As hearing problems are also a result of adverse acoustical circumstances, such as room acoustics and telecom systems, these effects have been studied, modelled and evaluated for hearing-impaired persons. In the area of rehabilitation, a large-scale comparison study was performed on signal enhancement techniques for hearing devices. Both objective and subjective benefits were found for specific listening conditions in relation to a chosen signal processing method. As modern technology may assist hearing and communication, it was studied how the use of automatic speech transcription or of handheld communication devices may help people with hearing problems. It is shown that communication benefits can be obtained, but that the benefit is limited in practice, as the processing power of today's handheld devices is still insufficient. An overview is given of the HearCom portal, with sections for screening diagnostics, hearing information for the public and professionals, and a new HearCompanion service that provides step-by-step support for the hearing rehabilitation process.

“…The binomial (or multinomial) model is appropriate for data in each cell of a confusion-count matrix, and statistical estimation of the unknown proportion parameter of the binomial distribution has been studied extensively, e.g., [17]-[21]. Phoneme confusions have been measured in a very large number of studies; see, e.g., recent reviews in [16], [22], [23].…”

Section: Introduction (mentioning)

confidence: 99%

“…Conventional statistical tests for the significance of an observed difference in PC or MI between test conditions must use the observed variations among individual results to estimate the reliability. Parametric test methods such as ANOVA have been applied, e.g., in [22], [25], [26]. When PC and MI results are close to their upper or lower limits, it is obviously questionable to assume that data follow a Gaussian distribution.…”

Section: Introduction (mentioning)

confidence: 99%

Bayesian Analysis of Phoneme Confusion Matrices

Leijon, Henter, Dahlquist 2016

IEEE/ACM Trans. Audio Speech Lang. Process.


This paper presents a parametric Bayesian approach to the statistical analysis of phoneme confusion matrices measured for groups of individual listeners in one or more test conditions. Two different bias problems in conventional estimation of mutual information are analyzed and explained theoretically. Evaluations with synthetic datasets indicate that the proposed Bayesian method can give satisfactory estimates of mutual information and response probabilities, even for phoneme confusion tests using a very small number of test items for each phoneme category. The proposed method can reveal overall differences in performance between two test conditions with better power than conventional Wilcoxon significance tests or conventional confidence intervals. The method can also identify sets of confusion-matrix cells that are credibly different between two test conditions, with better power than a similar approximate frequentist method.


“…The OLdenburg LOgatome (OLLO) corpus version 2.0 [27], a large speech corpus freely available for research purposes that holds recordings of 150 logatomes uttered by 50 speakers of both sexes (25 women), was selected as the most appropriate corpus for this investigation's objectives. It consists of a set of 80 logatomes of the consonant-vowel-consonant (CVC) form and a set of 70 vowel-consonant-vowel (VCV) logatomes, where each of these 150 logatomes was uttered three times by 40 German and 10 French speakers in their normal speaking style.…”

Section: Speech Corpus (mentioning)

confidence: 99%

Effect of phoneme variations on blind reverberation time estimation

Andrijašević 2020

Acta Acust.


This study focuses on an unexplored aspect of the performance of algorithms for blind reverberation time (T) estimation: the effect that a speech signal's phonetic content has on the estimate of T obtained from the reverberant version of that signal. To this end, the performance of three algorithms is assessed on a set of logatome recordings artificially reverberated with room impulse responses from four rooms, with T20 values in the [0.18, 0.55] s interval. Analyses of variance showed that the null hypotheses of equal means of estimation errors can be rejected at the 0.05 significance level for the interaction terms between the factors "vowel", "consonant", and "room". Tukey's multiple comparison procedure revealed both similarities and differences in the behaviour of the algorithms; the differences stem from details of the algorithms' implementation, such as the number of frequency bands and whether T is estimated continuously or only on selected segments of the signal, the so-called speech decay segments.
