1995
DOI: 10.1007/bf01261227

Video parsing and browsing using compressed data

Abstract: Parsing video content is an important first step in the video indexing process. This paper presents algorithms to automate the video parsing task, including partitioning a source video into clips and classifying those clips according to camera operations, using compressed video data. We have developed two algorithms and a hybrid approach to partitioning video data compressed according to the JPEG and MPEG standards. The algorithms utilize both the video content encoded in DCT (Discrete Cosine Transfo…

Cited by 284 publications (137 citation statements)
References 12 publications
“…Each detected shot is represented by one or multiple key frames so that its visual information is captured in the best possible way [4], [7], [20], [23]. Since we are using MPEG compressed video sequences, the key frames are DC images, which are typically 64 times smaller than the original frames (8 × 8 discrete cosine transform blocks are used).…”
Section: B Intershot Dissimilarity Measure
mentioning
confidence: 99%
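
The "64 times smaller" figure in the statement above follows from keeping one value per 8 × 8 block. As a rough illustration only, the sketch below approximates a DC image by averaging each 8 × 8 tile of a decoded grayscale frame, relying on the fact that a block's DC coefficient is proportional to its mean intensity. The function name `dc_image` and the synthetic frame are hypothetical, and a real compressed-domain system would read the DC terms from the bitstream rather than decode and average pixels.

```python
def dc_image(frame, block=8):
    """Approximate a DC image: one mean value per block x block tile.

    `frame` is a list of rows of grayscale values (hypothetical input format).
    """
    height, width = len(frame), len(frame[0])
    if height % block or width % block:
        raise ValueError("frame dimensions must be multiples of the block size")
    out = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            # Sum every pixel in the current tile, then divide by the tile area.
            total = sum(
                frame[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            row.append(total / (block * block))
        out.append(row)
    return out


# Synthetic 16 x 32 frame, used only to check the size reduction.
frame = [[(x + y) % 256 for x in range(32)] for y in range(16)]
small = dc_image(frame)
pixels_before = len(frame) * len(frame[0])
pixels_after = len(small) * len(small[0])
print(pixels_before // pixels_after)  # prints 64
```

Each output pixel stands in for 64 input pixels, which is where the reduction factor quoted in the statement comes from.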
“…Several approaches for the content analysis of still images exist; most of them are based on an analysis and comparison of color, texture, and shape [5], [12], [14]. It is generally accepted that content analysis of video sequences requires a preprocessing procedure that first breaks up the sequences into temporally homogeneous segments called shots [1]-[3], [8], [15], then condenses these segments into one or a few representative frames (key frames) [4], [7], [20], [23], and finally determines the relationship between shots on the basis of their audiovisual characteristics (e.g., audio tracks, key frames). This last step we call video-content organization.…”
mentioning
confidence: 99%
“…This approach builds on the structuring role time naturally plays in multimedia data, whereby content extracted from the audio track through speech recognition, for instance, could provide valuable links to video content. Due to the continuous nature of these media, visualisation and retrieval interfaces based on this approach tend to emphasise linear access (whether sequential or random), often employing a "tape recorder metaphor" and building upon it a set of media and domain specific improvements, such as skimming [7], parsing with compressed data [8] and summary generation [6].…”
Section: Introduction
mentioning
confidence: 99%
“…Much work has been done with storyboards, in which each keyframe from a video is laid out in time order on a grid [Zhang et al 1995; Christel and Warmack 2001]. Since these algorithms display all the keyframes that they are given, they have not addressed the sampling problem.…”
Section: Video Manga and Other Storyboards
mentioning
confidence: 99%
“…Video search engines typically represent video as a sequence of shots, each represented by a keyframe image [Zhang et al 1995]. (A shot is a continuous sequence from one camera.)…”
Section: Motivation
mentioning
confidence: 99%