2012
DOI: 10.1007/s00530-012-0262-4

Assessing the importance of audio/video synchronization for simultaneous translation of video sequences

Abstract: Lip synchronization is considered a key parameter during interactive communication. In the case of video conferencing and television broadcasting, the differential delay between audio and video should remain below certain thresholds, as recommended by several standardization bodies. However, further research has also shown that these thresholds can be relaxed, depending on the targeted application and use case. In this article, we investigate the influence of lip sync on the ability to perform real-time langua…

Citations: cited by 11 publications (14 citation statements)
References: 18 publications
“…As authors in Gaston, Boley, Selter, and Ratterman (2010) state, audio artifacts can either enhance or reduce the ability to detect a certain video impairment present in the same sequence. For some types of contents, synchronization between audio and video can be a key factor in the subjective assessment (Staelens, De Meulenaere, et al, 2012). Authors in Kim, Kondoz, and Shi (2013) used the correlation between audio and video as a new metric to predict the QoE, corroborating that audio and video mismatch affects the perceived audio quality.…”
Section: Related Work (mentioning)
confidence: 98%
“…Similarly, we want the first audio frame of the mixed stream to be composited by the second audio frame of p_2 and the sixth frame of p_1, but this is impossible. However, according to the ITU-R BT.1359 standard [41]-[43], when the difference T ranges from [−100 ms, 25 ms], it will be acceptable to the human eye. For this reason, we can redefine f_v^O(k).…”
Section: Implementation In Quasi-peer (mentioning)
confidence: 99%
“…In general, combining the ITU-R BT.1359 standard [41]-[43] can simplify Formula (3) into Formula (7), which can be used to quantitatively evaluate the synchronization of audio and video, where T represents the time difference on a certain standard time; gt_a and gt_v represent a certain standard time when audio and video were generated and, theoretically, gt_a should be equal to gt_v; pt_v indicates the actual time when the user sees the picture at that time point during live broadcast; and pt_a indicates the time when the user hears the audio at that time during the live broadcast. The larger the T value is, the more out of sync the broadcast is.…”
Section: Synchronization (mentioning)
confidence: 99%
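
The quantities in the statement above can be illustrated with a small sketch. The citing paper's Formula (7) is not reproduced on this page, so the expression below is an assumption rather than the authors' formula: it takes the skew T as the video delay minus the audio delay, so that a negative value means the audio arrives late. The function names are hypothetical, and the default window of [−100 ms, 25 ms] is simply the range quoted in the earlier citation statement.

```python
def av_skew_ms(gt_a, pt_a, gt_v, pt_v):
    """Return an audio/video skew T in milliseconds.

    Assumed definition (not taken from the cited formula):
    T = (pt_v - gt_v) - (pt_a - gt_a), so T < 0 means audio lags video.
    All arguments are timestamps in milliseconds on a shared clock.
    """
    audio_delay = pt_a - gt_a  # time from audio generation to playback
    video_delay = pt_v - gt_v  # time from video generation to display
    return video_delay - audio_delay


def within_window(skew_ms, low_ms=-100, high_ms=25):
    """Check the skew against a tolerance window (defaults quoted above)."""
    return low_ms <= skew_ms <= high_ms


# Audio is heard 180 ms after generation, video is seen after 120 ms.
skew = av_skew_ms(gt_a=0, pt_a=180, gt_v=0, pt_v=120)
print(skew, within_window(skew))
```

With this convention the sign of T carries the direction of the offset, which matters because the quoted window is asymmetric.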
“…In video conferencing, streaming and television broadcasting, the uneven delay between audio and video should remain below certain thresholds, recommended by several standardization bodies. However, research shows that the thresholds can be relaxed, depending on the targeted application and use case [21].…”
Section: Measuring the Lip Sync Artefact (mentioning)
confidence: 99%
“…Regarding detection thresholds, [21] describes the high number of thresholds determined by the authors. Some authors and research groups have concluded that audio may be played up to 305 ms ahead of video and conversely video can be displayed up to 190 ms ahead of the audio.…”
Section: Measuring the Lip Sync Artefact (mentioning)
confidence: 99%