2013
DOI: 10.1109/lsp.2012.2227312

Shifted-Delta MLP Features for Spoken Language Recognition

Cited by 40 publications (27 citation statements)
References 11 publications
“…Instead of speech based signals [10], [15], we propose text based comments as a new signal for audio LID of the videos. LID for text data has a wide array of applications ranging across Machine Translation for online resources [11] and building linguistic resources from the web [1].…”
Section: Related Research
Type: mentioning
confidence: 99%
“…Following the results reported in [14] and [17], where the accuracy of a LID system was improved thanks to the dimensionality reduction of the PLLR features using PCA, for our experiments we also tested different dimensionality reduction techniques such as HLDA [18]. In this case, the dimensionality reduction was applied for the baseline PLLR features as well as for the state-based PLLR features.…”
Section: Dimensionality Reduction Techniques
Type: mentioning
confidence: 99%
“…We apply the windowing concepts from SDC to the PLLR features, obtaining what we call Shifted Delta PLLR Coefficients (SDPC) and then we apply a PCA projection as in [17] because in this case dimensionality reduction is a must with the high dimensionality vectors that we have to manage (for instance, 177 states in the Hungarian recognizer with a SDC 1_5_3 will result in a vector of dimension 708). We compared using first the PCA reduction and then stacking the SDPC or first stacking the SDPC and then applying PCA.…”
Section: Modification Using SDPC Parameters
Type: mentioning
confidence: 99%
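
The dimension quoted in the statement above (177 states giving a 708-dimensional vector) can be checked with a small sketch of shifted-delta stacking. This is an illustration under stated assumptions, not the cited authors' implementation: it reads "SDC 1_5_3" as delta spacing d=1, block shift P=5 and k=3 blocks, assumes the static features are appended to the delta blocks (which is what makes 177 × 4 = 708), and repeats edge frames at utterance boundaries. The function name `shifted_delta` and its parameters are hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
def shifted_delta(frames, d=1, p=5, k=3, append_static=True):
    """Stack k shifted delta blocks per frame (illustrative sketch).

    frames: list of equal-length feature vectors, one per time step.
    Block i for frame t is frames[t + i*p + d] - frames[t + i*p - d].
    Indices outside the utterance are clamped to the nearest edge frame
    (an assumed boundary rule; other padding schemes are possible).
    """
    num_frames = len(frames)

    def frame_at(index):
        # Clamp the index so that edge frames are repeated.
        return frames[min(max(index, 0), num_frames - 1)]

    stacked = []
    for t in range(num_frames):
        # Start from the static features if requested (assumption).
        vector = list(frames[t]) if append_static else []
        for i in range(k):
            ahead = frame_at(t + i * p + d)
            behind = frame_at(t + i * p - d)
            vector.extend(a - b for a, b in zip(ahead, behind))
        stacked.append(vector)
    return stacked


# Toy input: 50 frames of 177 values, each value equal to its frame index.
toy = [[float(t)] * 177 for t in range(50)]
sdc = shifted_delta(toy)
print(len(sdc), len(sdc[0]))  # 50 708
```

With `append_static=False` the same call would give 531 values per frame, which is why some form of dimensionality reduction such as PCA is described as necessary in the quoted passage.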
“…Then the posterior features were transformed by taking logarithm, PCA transformation, and MVN. Quantitative analysis in [6] has shown that the Log-MLP features are more robust than spectral features, and are suitable for Gaussian modeling.…”
Section: Tokenizer Implementation
Type: mentioning
confidence: 99%
“…This framework utilizes a tokenizer to convert both the query examples and the test utterances into posteriorgrams, and matches the query posteriorgrams with the test posteriorgrams using dynamic time warping (DTW), which has been widely used in template-based speech recognition. Posteriorgram representation is believed to be more robust and more informative than spectral features [5,6].…”
Section: Introduction
Type: mentioning
confidence: 99%
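
The posteriorgram matching described in the statement above can be sketched with a textbook dynamic time warping recursion. This is a minimal illustration, not the cited system: the local distance (negative log of the inner product between two posterior vectors) is an assumption commonly paired with posteriorgrams, the alignment is end-to-end rather than a subsequence search, and the names `frame_distance` and `dtw_cost` are hypothetical.

```python
import math


def frame_distance(p, q, floor=1e-10):
    """Negative log inner product of two posterior vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(p, q))
    # Floor the dot product so the logarithm stays finite.
    return -math.log(max(dot, floor))


def dtw_cost(query, test):
    """Accumulated cost of the best end-to-end DTW alignment."""
    n, m = len(query), len(test)
    inf = float("inf")
    # cost[i][j] = best cost of aligning query[:i] with test[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = frame_distance(query[i - 1], test[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stay on the same test frame
                cost[i][j - 1],      # stay on the same query frame
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


# Toy posteriorgrams over two classes.
query = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # same content, slower
swapped = [[0.0, 1.0], [1.0, 0.0]]                # classes in reverse order

print(dtw_cost(query, stretched) < dtw_cost(query, swapped))  # True
```

The stretched sequence aligns at zero cost because DTW absorbs the repeated frame, while the swapped sequence is penalized at every step. A practical query-by-example system would typically add path-length normalization and a subsequence search over the test utterance.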