2022 26th International Conference on Pattern Recognition (ICPR) 2022
DOI: 10.1109/icpr56361.2022.9956600

Learning Speaker-specific Lip-to-Speech Generation

Cited by 4 publications (1 citation statement)
References 49 publications
“…He et al [29] introduced a flow-based non-autoregressive lip-to-speech model named GlowLTS, designed to bypass autoregressive constraints and facilitate faster inference. Varshney et al [30] used framed frame sequences as feature distributions using transformers within an autoencoder context. Their approach yielded marginal improvements, raising the Short-time Objective Intelligibility measure (STOI) measurements from 0.377 to 0.394 and 0.490, respectively, in the Lip2Wav dataset for Chemistry Lectures.…”
Section: Reconstructed Speech With Unconstrained Datasets
Citation type: mentioning
confidence: 99%