Abstract: Segmenting movies into semantically correlated units is a tedious task due to the "semantic gap". Low-level features provide little information about the semantic correlation between shots and usually fail to detect scenes with constantly dynamic content. In the method we propose herein, local invariant descriptors are used to represent the key-frames of video shots, and a visual vocabulary is created from these descriptors, resulting in a visual-word histogram representation (bag of visual words) …
“…Such a result is quite competitive with the state of the art techniques introduced in (Rasheed et al, 2005), (Chasanis et al, 2009), (Zhu et al, 2009) which yield precision/recall rates varying between 65% and 72%.…”
Section: Figure 13 [Detected Scenes] (mentioning; confidence: 93%)
“…The validation of our scene extraction method has been performed on a corpus of 6 sitcoms and 6 Hollywood movies (Tables 6 and 7) also used for evaluation purposes in the state of the art algorithms presented in (Rasheed et al, 2005), (Chasanis et al, 2009), (Zhu et al, 2009). Fig.…”
This paper introduces a complete framework for temporal video segmentation. First, a computationally efficient shot extraction method is introduced, which adopts the normalized graph partition approach, enriched with a non-linear, multiresolution filtering of the similarity vectors involved. The proposed shot boundary detection technique yields high precision (90%) and recall (95%) rates for all types of transitions, both abrupt and gradual. Next, for each detected shot, the authors construct a static storyboard by introducing a leap keyframe extraction method. The video abstraction algorithm is 23% faster than existing techniques with comparable performance. Finally, the authors propose a shot grouping strategy that iteratively clusters visually similar shots under a set of temporal constraints. Two different types of visual features are exploited: HSV color histograms and interest points. In both cases, precision and recall rates average 86%.
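The precision and recall rates quoted throughout can be computed by matching detected shot boundaries against ground-truth boundaries within a small temporal tolerance. A minimal sketch (the function name, greedy matching strategy, and tolerance value are our own assumptions, not details from the paper):

```python
def evaluate_boundaries(detected, ground_truth, tolerance=5):
    """Compute precision/recall for shot boundary detection.

    Each detected boundary (a frame index) counts as a true positive if it
    falls within +/- `tolerance` frames of a still-unmatched ground-truth
    boundary; each ground-truth boundary can be matched at most once.
    """
    matched = set()
    tp = 0
    for d in detected:
        # Greedily pick the closest unmatched ground-truth boundary in range.
        best = None
        for i, g in enumerate(ground_truth):
            if i in matched or abs(d - g) > tolerance:
                continue
            if best is None or abs(d - g) < abs(d - ground_truth[best]):
                best = i
        if best is not None:
            matched.add(best)
            tp += 1
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

For example, with detections at frames 10, 50, 100 against ground truth 12, 52, 200, two of three detections match, giving precision and recall of 2/3 each.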
“…More recent techniques (Chasanis et al, 2009), (Zhu et al, 2009) introduce in the analysis process useful concepts such as temporal constraints and visual similarity.…”
“…Concerning the keyframe visual similarity involved in the above-described process, we have considered two different approaches, based on (1) chi-square distance between HSV color histograms, and (2) the number of matched interest points determined based on SIFT descriptors with a Kd-tree matching technique [20].…”
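The two similarity measures named in the quote can be sketched as follows. The chi-square histogram distance is standard; for the interest-point branch, the paper uses SIFT descriptors with a Kd-tree matcher, whereas this toy sketch substitutes brute-force nearest-neighbour search over made-up descriptor vectors with Lowe's ratio test (the ratio value and function names are our assumptions):

```python
def chi_square(h1, h2):
    """Chi-square distance between two histograms (e.g. HSV color histograms).
    Bins where both histograms are empty are skipped to avoid division by zero."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def count_matches(desc_a, desc_b, ratio=0.8):
    """Count descriptor matches between two keyframes.

    A descriptor in `desc_a` matches if its nearest neighbour in `desc_b` is
    clearly closer than the second nearest (Lowe's ratio test). Squared
    distances are compared, hence ratio**2. A Kd-tree (as in the paper) would
    only speed up the neighbour search, not change the result.
    """
    def dist2(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    matches = 0
    for d in desc_a:
        dists = sorted(dist2(d, e) for e in desc_b)
        if len(dists) >= 2 and dists[0] < ratio ** 2 * dists[1]:
            matches += 1
    return matches
```

Two keyframes would then be declared visually similar when `chi_square` falls below, or `count_matches` rises above, some tuned threshold.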
Abstract: In this paper we propose a novel and complete video scene segmentation framework, developed on different structural levels of analysis. Firstly, a shot boundary detection algorithm is introduced that extends the graph partition method with a nonlinear scale-space filtering technique, which increases detection efficiency with gains of 7.4% to 9.8% in terms of both precision and recall rates. Secondly, static storyboards are formed based on a leap keyframe extraction method that selects a variable number of keyframes for each detected shot, adapted to the visual content variation. Finally, using the extracted keyframes, spatio-temporally coherent shots are clustered into the same scene based on temporal constraints and with the help of a new concept of neutralized shots. Video scenes are obtained with average precision and recall rates of 86%.
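The temporally constrained shot grouping described above can be illustrated with a generic greedy sketch: a shot joins the current scene if it is sufficiently similar to a recent shot of that scene, otherwise a new scene starts. This is only an assumed simplification of the idea, not the paper's actual algorithm (which also relies on "neutralized" shots); all names and parameters below are ours:

```python
def group_shots(similarity, n_shots, window=4, threshold=0.5):
    """Greedy temporally constrained shot clustering.

    `similarity(i, j)` returns a visual similarity score in [0, 1] for shots
    i and j (e.g. derived from keyframe histogram distances). A shot is
    appended to the current scene if it resembles any of that scene's last
    `window` shots; otherwise it opens a new scene.
    """
    scenes = [[0]]  # start with the first shot as its own scene
    for shot in range(1, n_shots):
        recent = scenes[-1][-window:]
        if any(similarity(shot, prev) >= threshold for prev in recent):
            scenes[-1].append(shot)
        else:
            scenes.append([shot])
    return scenes
```

With a toy similarity that only relates shots 0-2 and shots 3-5 among themselves, this yields two scenes, [0, 1, 2] and [3, 4, 5].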