2013
DOI: 10.1109/jproc.2012.2237151

Spoken Language Recognition: From Fundamentals to Practice

Cited by 247 publications (144 citation statements)
References 90 publications

“…In this study we treat foreign accent recognition as a language recognition task typically accomplished via either acoustic or phonotactic modeling [5]. In the former approach, acoustic features, such as shifted delta cepstra…”
Section: Introduction (mentioning)
confidence: 99%
“…However, knowledge based modeling, such as phonotactic features, are known to be linguistically and phonetically relevant [5]. However, the front-end of the phonotactic system needs a tokenizer that will turn the utterance into a sequence of "phonetic letters" [15,16].…”
Section: Introduction (mentioning)
confidence: 99%
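
The statement above describes a tokenizer that turns an utterance into a sequence of "phonetic letters". A common next step in phonotactic language recognition, assumed here and not stated in the quoted text, is to summarize that sequence by its n-gram statistics. The sketch below shows one way this could look; the function name `ngram_profile` and the toy token labels are hypothetical.

```python
from collections import Counter

def ngram_profile(tokens, n=2):
    """Relative frequencies of n-grams in a phone-token sequence.

    `tokens` stands in for the output of a phone tokenizer
    (hypothetical input; no real tokenizer is invoked here).
    """
    # Slide a window of length n over the token sequence.
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(grams)
    total = sum(counts.values())
    # An utterance shorter than n tokens yields an empty profile.
    return {g: c / total for g, c in counts.items()} if total else {}

# Toy usage with made-up phone labels.
profile = ngram_profile(["a", "b", "a", "b"], n=2)
```

Profiles of this kind could then be compared across languages or accents, for example with per-language n-gram models; that downstream step is outside what the quoted statement covers.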
“…In most cases, communication is limited on the robot side due to limited sensing and perception capabilities, and this is the original motivation of human-robot collaboration. Technologies such as intention estimation [5], speech recognition [9], human detection and tracking [10,11], and gesture recognition [12,13] have been developed to enhance the perception capability of the robot. Augmented reality has been studied to make the human being and robot share the same reference frame [1].…”
Section: Communication and Interface (mentioning)
confidence: 99%
“…multilingual speech processing systems) or human listeners (i.e. call routing to a proper human operator) [7]. Therefore, accurate and efficient behaviour in real-time applications is often essential, for example, when used for emergency call routing, where the response time of a fluent native operator is critical [1] [8].…”
Section: Introduction (mentioning)
confidence: 99%
“…Driven by recent developments in speaker verification, the current state-of-the-art in acoustic LID systems involves using i-vector front-end features followed by diverse classification mechanisms that compensate speaker and session variabilities [7] [10] [11]. The i-vector is a compact representation (typically from 400 to 600 dimensions) of a whole utterance, derived as a point estimate of the latent variables in a factor analysis model [12] [13].…”
Section: Introduction (mentioning)
confidence: 99%
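
The statement above describes the i-vector as a point estimate of the latent variables in a factor analysis model. A minimal sketch of that estimate follows, assuming the usual total-variability form s = m + T w with a standard-normal prior on w and diagonal covariances. The argument names (`T`, `sigma_diag`, `n_occ`, `f_centered`) are illustrative assumptions, and the toy dimensions are far smaller than the 400 to 600 quoted.

```python
import numpy as np

def ivector_point_estimate(T, sigma_diag, n_occ, f_centered):
    """Posterior mean of w in s = m + T w, with w ~ N(0, I).

    T          : (C*D, R) total-variability matrix (assumed given)
    sigma_diag : (C*D,)   diagonal covariances, stacked per component
    n_occ      : (C,)     zero-order statistics of the utterance
    f_centered : (C*D,)   first-order statistics, centred on the means
    """
    CD, R = T.shape
    D = CD // n_occ.shape[0]
    # Repeat each component's occupancy across its D feature dimensions.
    n_expanded = np.repeat(n_occ, D)
    # Sigma^{-1} T for diagonal Sigma.
    T_scaled = T / sigma_diag[:, None]
    # Posterior precision: I + T^T Sigma^{-1} N T.
    precision = np.eye(R) + T.T @ (n_expanded[:, None] * T_scaled)
    # Posterior mean: precision^{-1} T^T Sigma^{-1} f.
    return np.linalg.solve(precision, T_scaled.T @ f_centered)

# Toy usage: 2 components, 3 feature dimensions, rank-4 subspace.
rng = np.random.default_rng(0)
w = ivector_point_estimate(
    T=rng.standard_normal((6, 4)),
    sigma_diag=np.ones(6),
    n_occ=np.array([10.0, 5.0]),
    f_centered=rng.standard_normal(6),
)
```

Solving the linear system with `np.linalg.solve` avoids forming an explicit matrix inverse; the resulting vector `w` has one entry per column of `T`.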