2016
DOI: 10.1109/taslp.2015.2489558

i-Vector Modeling of Speech Attributes for Automatic Foreign Accent Recognition

Abstract: We propose a unified approach to automatic foreign accent recognition. It takes advantage of recent technology advances in both linguistics and acoustics based modeling techniques in automatic speech recognition (ASR) while overcoming the issue of a lack of a large set of transcribed data often required in designing state-of-the-art ASR systems. The key idea lies in defining a common set of fundamental units "universally" across all spoken accents such that any given spoken utterance can be transcribed with th…

Cited by 31 publications (31 citation statements)
References 36 publications (62 reference statements)
“…In this regard, the use of a simple approach based solely on mean and standard deviation of the short time features has shown good performance compared to the more complex BoAW. More complex approaches using contextual information such as ivectors [39] could also be explored. However, the particular context (continuous, smartphone-based monitoring with low battery consumption) would not benefit from this approach.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…current version of GLAFF-IT counts 37,320 lemmas for 457,702 wordforms and includes nouns, verbs, adjectives and adverbs. Each entry of the lexicon includes a wordform, a tag in MULTEXT-GRACE format [13] specifying the main syntactic category and inflection features, a lemma and API phonological transcriptions with the stress placement when present in GLAW-IT. An extract of GLAFF-IT is reported in Figure 2.…”
Section: From GLAW-IT to GLAFF-IT (citation type: mentioning)
confidence: 99%
“…We use a model that has learned the phonological contexts for stressed and unstressed Italian vowels. Orthographic and phonological context-based approaches have been extensively used in the text-to-speech domain for stress detection [2,7] and for accenting unknown words in a specialised language [18]. The rationale behind our approach is that the exploitation of the phonological neighbourhood of a vowel helps estimate its probability of being stressed or unstressed.…”
Section: Machine Learning (citation type: mentioning)
confidence: 99%
“…Approaches employed so far in the literature for L1 classification include i-vector modelling [14], GMMs trained on MFCCs [15], and prosodic [16] approaches, with varying degrees of success. The approach investigated in this paper predicts, from recordings of spontaneous speech, the speaker's native language (L1) from among 21 different languages and, in the case of Spanish speakers, their country of origin from among three countries.…”
Section: Introduction (citation type: mentioning)
confidence: 99%