Abstract: Segmenting movies into semantically correlated units is a tedious task due to the "semantic gap". Low-level features provide little information about the semantic correlation between shots and usually fail to detect scenes with constantly dynamic content. In the method we propose herein, local invariant descriptors are used to represent the key-frames of video shots, and a visual vocabulary is created from these descriptors, resulting in a visual-word histogram representation (bag of visual words) …
“…Such a result is quite competitive with the state of the art techniques introduced in (Rasheed et al, 2005), (Chasanis et al, 2009), (Zhu et al, 2009) which yield precision/recall rates varying between 65% and 72%.…”
Section: Figure 13 [Detected Scenes] (mentioning; confidence: 93%)
“…The validation of our scene extraction method has been performed on a corpus of 6 sitcoms and 6 Hollywood movies (Tables 6 and 7) also used for evaluation purposes in the state of the art algorithms presented in (Rasheed et al, 2005), (Chasanis et al, 2009), (Zhu et al, 2009). Fig.…”
This paper introduces a complete framework for temporal video segmentation. First, a computationally efficient shot extraction method is introduced, which adopts the normalized graph partition approach, enriched with a non-linear, multiresolution filtering of the similarity vectors involved. The proposed shot boundary detection technique yields high precision (90%) and recall (95%) rates for all types of transitions, both abrupt and gradual. Next, for each detected shot, the authors construct a static storyboard by introducing a leap keyframe extraction method. The video abstraction algorithm is 23% faster than existing techniques with comparable performance. Finally, the authors propose a shot grouping strategy that iteratively clusters visually similar shots under a set of temporal constraints. Two different types of visual features are exploited: HSV color histograms and interest points. In both cases, precision and recall rates average 86%.
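The precision and recall rates quoted throughout can be computed by matching detected shot boundaries against ground-truth boundaries within a small temporal tolerance. A minimal sketch (the function name, greedy matching strategy, and tolerance value are our own assumptions, not details from the paper):

```python
def evaluate_boundaries(detected, ground_truth, tolerance=5):
    """Compute precision/recall for shot boundary detection.

    Each detected boundary (a frame index) counts as a true positive if it
    falls within +/- `tolerance` frames of a still-unmatched ground-truth
    boundary; each ground-truth boundary can be matched at most once.
    """
    matched = set()
    tp = 0
    for d in detected:
        # Greedily pick the closest unmatched ground-truth boundary in range.
        best = None
        for i, g in enumerate(ground_truth):
            if i in matched or abs(d - g) > tolerance:
                continue
            if best is None or abs(d - g) < abs(d - ground_truth[best]):
                best = i
        if best is not None:
            matched.add(best)
            tp += 1
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

For example, with detections at frames 10, 50, 100 against ground truth 12, 52, 200, two of three detections match, giving precision and recall of 2/3 each.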
“…More recent techniques (Chasanis et al, 2009), (Zhu et al, 2009) introduce in the analysis process useful concepts such as temporal constraints and visual similarity.…”
“…Concerning the keyframe visual similarity involved in the above-described process, we have considered two different approaches, based on (1) chi-square distance between HSV color histograms, and (2) the number of matched interest points determined based on SIFT descriptors with a Kd-tree matching technique [20].…”
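The two similarity measures named in the quote can be sketched as follows. The chi-square histogram distance is standard; for the interest-point branch, the paper uses SIFT descriptors with a Kd-tree matcher, whereas this toy sketch substitutes brute-force nearest-neighbour search over made-up descriptor vectors with Lowe's ratio test (the ratio value and function names are our assumptions):

```python
def chi_square(h1, h2):
    """Chi-square distance between two histograms (e.g. HSV color histograms).
    Bins where both histograms are empty are skipped to avoid division by zero."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def count_matches(desc_a, desc_b, ratio=0.8):
    """Count descriptor matches between two keyframes.

    A descriptor in `desc_a` matches if its nearest neighbour in `desc_b` is
    clearly closer than the second nearest (Lowe's ratio test). Squared
    distances are compared, hence ratio**2. A Kd-tree (as in the paper) would
    only speed up the neighbour search, not change the result.
    """
    def dist2(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    matches = 0
    for d in desc_a:
        dists = sorted(dist2(d, e) for e in desc_b)
        if len(dists) >= 2 and dists[0] < ratio ** 2 * dists[1]:
            matches += 1
    return matches
```

Two keyframes would then be declared visually similar when `chi_square` falls below, or `count_matches` rises above, some tuned threshold.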
Abstract: In this paper we propose a novel and complete video scene segmentation framework, developed on different structural levels of analysis. Firstly, a shot boundary detection algorithm is introduced that extends the graph partition method with a nonlinear scale-space filtering technique, which increases detection efficiency with gains of 7.4% to 9.8% in terms of both precision and recall rates. Secondly, static storyboards are formed based on a leap keyframe extraction method that selects a variable number of keyframes for each detected shot, adapted to the visual content variation. Finally, using the extracted keyframes, spatio-temporally coherent shots are clustered into the same scene based on temporal constraints and with the help of a new concept of neutralized shots. Video scenes are obtained with average precision and recall rates of 86%.
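The temporally constrained shot grouping described above can be illustrated with a generic greedy sketch: a shot joins the current scene if it is sufficiently similar to a recent shot of that scene, otherwise a new scene starts. This is only an assumed simplification of the idea, not the paper's actual algorithm (which also relies on "neutralized" shots); all names and parameters below are ours:

```python
def group_shots(similarity, n_shots, window=4, threshold=0.5):
    """Greedy temporally constrained shot clustering.

    `similarity(i, j)` returns a visual similarity score in [0, 1] for shots
    i and j (e.g. derived from keyframe histogram distances). A shot is
    appended to the current scene if it resembles any of that scene's last
    `window` shots; otherwise it opens a new scene.
    """
    scenes = [[0]]  # start with the first shot as its own scene
    for shot in range(1, n_shots):
        recent = scenes[-1][-window:]
        if any(similarity(shot, prev) >= threshold for prev in recent):
            scenes[-1].append(shot)
        else:
            scenes.append([shot])
    return scenes
```

With a toy similarity that only relates shots 0-2 and shots 3-5 among themselves, this yields two scenes, [0, 1, 2] and [3, 4, 5].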