Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology 2022
DOI: 10.18653/v1/2022.sigmorphon-1.9

Domain-Informed Probing of wav2vec 2.0 Embeddings for Phonetic Features

Abstract: In recent years, large transformer model architectures have become available that provide a novel means of generating high-quality vector representations of speech audio. These transformers use an attention mechanism to generate representations enhanced with contextual and positional information from the input sequence. Previous work has explored the capabilities of these models in tasks such as speech recognition and speaker verification, but there has not been a significa…

Cited by 4 publications (7 citation statements)
References 11 publications
“…The results in the previous sections show that MLP-based classifiers perform adequately on the latent representations for all layers (Table 1, in line with [2,3]), while the internal phone-phone structure shows a varying pattern across layers (Figs. 1, 2).…”
Section: Local and Global Structure
Citation type: supporting
confidence: 57%
“…Importantly, this does not mean that those representations contain weaker phonetic information. After all, [2,3,4] have shown that classifiers that use these representations for a range of downstream tasks outperform the results of similar classifiers that operate on conventional spectral representations. The figure shows that the acoustic organisation in the higher transformer layers differs from the lowest transformer layer, in line with the findings in the next sections.…”
Section: Searching For Acoustic-phonetic Structure
Citation type: mentioning
confidence: 99%