2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2021
DOI: 10.1109/asru51503.2021.9687890
|View full text |Cite
|
Sign up to set email alerts
|

Colombian Dialect Recognition Based on Information Extracted from Speech and Text Signals

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1

Citation Types

0
2
0

Year Published

2022
2022
2024
2024

Publication Types

Select...
2
2
1

Relationship

0
5

Authors

Journals

citations
Cited by 5 publications
(2 citation statements)
references
References 15 publications
0
2
0
Order By: Relevance
“…The recent application of the transformer-based wav2vec 2.0 showcased its utility in developing speech-based age and gender prediction models, including cross-corpus evaluation, with significant improvements in recall compared to a classic modeling approach based on hand-crafted features [47]. Additionally, wav2vec 2.0 representations of speech were found to be more effective in distinguishing between PD and HC subjects compared to language representations, including word-embedding models [48]. As a pre-trained model, wav2vec shares the advantage with TRILLsson and x-vectors of being directly applicable without the need for further training, addressing the data-hungry nature common to many neural networks in the field.…”
Section: Related Workmentioning
confidence: 99%
“…The recent application of the transformer-based wav2vec 2.0 showcased its utility in developing speech-based age and gender prediction models, including cross-corpus evaluation, with significant improvements in recall compared to a classic modeling approach based on hand-crafted features [47]. Additionally, wav2vec 2.0 representations of speech were found to be more effective in distinguishing between PD and HC subjects compared to language representations, including word-embedding models [48]. As a pre-trained model, wav2vec shares the advantage with TRILLsson and x-vectors of being directly applicable without the need for further training, addressing the data-hungry nature common to many neural networks in the field.…”
Section: Related Workmentioning
confidence: 99%
“…Deep learning models offer more promise than conventional ones. We'll test this further [20]. After then, the sounds are transformed into text using the speech-to-text module of IBM Watson.…”
Section: Literature Reviewmentioning
confidence: 99%