Video parsing and browsing using compressed data

Zhang, Hong-Jiang; Low, Chien Yong; Smoliar, Stephen W.

doi:10.1007/bf01261227

Cited by 284 publications

(137 citation statements)

References 12 publications

Supporting

Mentioning

127

Contrasting

Unclassified

“…Each detected shot is represented by one or multiple key frames so that its visual information is captured in the best possible way [4], [7], [20], [23]. Since we are using MPEG compressed video sequences, the key frames are DC images, which are typically 64 times smaller than the original frames (8 × 8 discrete cosine transform blocks are used).…”

Section: B Intershot Dissimilarity Measure

mentioning

confidence: 99%
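The factor of 64 in the statement above follows from keeping one value per 8 × 8 block. The sketch below illustrates this under a simplifying assumption: it averages decoded pixels per block, which is proportional to the DC coefficient of that block's DCT, whereas a real system would read the DC terms from the MPEG stream without full decoding. The function name `dc_image` and the list-of-rows frame layout are hypothetical, chosen only for illustration.

```python
def dc_image(frame, block=8):
    """Return the block-mean image of a grayscale frame.

    `frame` is a list of equal-length rows of pixel values. Each output
    value is the mean of one block x block tile, which is proportional
    to that tile's DCT DC coefficient. This is an illustrative stand-in,
    not the extraction path used by any particular MPEG decoder.
    """
    height, width = len(frame), len(frame[0])
    if height % block or width % block:
        raise ValueError("frame dimensions must be multiples of the block size")

    result = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            # Sum every pixel in the current tile, then normalise.
            total = sum(
                frame[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            row.append(total / (block * block))
        result.append(row)
    return result


# A 16 x 16 frame made of four constant 8 x 8 tiles.
frame = [[(x // 8) + 2 * (y // 8) for x in range(16)] for y in range(16)]
small = dc_image(frame)
pixels_before = len(frame) * len(frame[0])
pixels_after = len(small) * len(small[0])
```

With 8 × 8 blocks, `pixels_before / pixels_after` is 64 for any frame whose dimensions are multiples of 8, which matches the size reduction quoted in the citation statement.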

“…Several approaches for the content analysis of still images exist; most of them are based on an analysis and comparison of color, texture, and shape [5], [12], [14]. It is generally accepted that content analysis of video sequences requires a preprocessing procedure that first breaks up the sequences into temporally homogeneous segments called shots [1]-[3], [8], [15], then condenses these segments into one or a few representative frames (key frames) [4], [7], [20], [23], and finally determines the relationship between shots on the basis of their audiovisual characteristics (e.g., audio tracks, key frames). This last step we call video-content organization.…”

mentioning

confidence: 99%

Automated high-level movie segmentation for advanced video-retrieval systems

Hanjalic

Lagendijk

Biemond

1999

IEEE Trans. Circuits Syst. Video Technol.

243

142

Abstract: We present a newly developed strategy for automatically segmenting movies into logical story units. A logical story unit can be understood as an approximation of a movie episode, which is a high-level temporal movie segment, characterized either by a single event (dialog, action scene, etc.) or by several events taking place in parallel. Since we consider a whole event and not a single shot to be the most natural retrieval unit for the movie category of video programs, the proposed segmentation is the crucial first step toward a concise and comprehensive content-based movie representation for browsing and retrieval purposes. The automation aspect is becoming increasingly important with the rising amount of information to be processed in video archives of the future. The segmentation process is designed to work on MPEG-DC sequences, where we have taken into account that at least a partial decoding is required for performing content-based operations on MPEG compressed video streams. The proposed technique allows for carrying out the segmentation procedure in a single pass through a video sequence.

Index Terms: Video content analysis, video data bases, video segmentation.

“…This approach builds on the structuring role time naturally plays in multimedia data, whereby content extracted from the audio track through speech recognition, for instance, could provide valuable links to video content. Due to the continuous nature of these media, visualisation and retrieval interfaces based on this approach tend to emphasise linear access (whether sequential or random), often employing a "tape recorder metaphor" and building upon it a set of media and domain specific improvements, such as skimming [7], parsing with compressed data [8] and summary generation [6].…”

Section: Introduction

mentioning

confidence: 99%

Exploring the Structure of Media Stream Interactions for Multimedia Browsing

Luz

Bouamrane

2006

Lecture Notes in Computer Science

Abstract. This paper presents an approach to the issue of adding structure to recordings of collaborative meetings supported by an audio channel and a shared text editor. The virtual meeting environment used is capable of capturing and broadcasting speech, gestures and editing operations in real-time, so recording results in continuous multimedia data. We describe the implementation of a browser which explores simple linkage patterns between these media to support information retrieval through non-linear browsing, and discuss audio segmentation issues arising from this approach.

“…Much work has been done with storyboards, in which each keyframe from a video is laid out in time order on a grid [Zhang et al 1995; Christel and Warmack 2001]. Since these algorithms display all the keyframes that they are given, they have not addressed the sampling problem.…”

Section: Video Manga and Other Storyboards

mentioning

confidence: 99%
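The storyboard layout described in the statement above, with keyframes placed in time order on a grid, can be sketched as a row-major placement. This is a minimal illustration and not the layout code of any cited system; the function name `storyboard_positions` and the `columns` parameter are hypothetical.

```python
def storyboard_positions(num_keyframes, columns):
    """Map keyframe indices (already in time order) to (row, column) cells.

    Keyframes fill the grid left to right, then top to bottom. Every
    keyframe gets a cell, so no sampling is performed.
    """
    if columns <= 0:
        raise ValueError("columns must be positive")
    if num_keyframes < 0:
        raise ValueError("num_keyframes must be non-negative")
    return [divmod(index, columns) for index in range(num_keyframes)]


# Five keyframes on a grid that is three cells wide.
cells = storyboard_positions(5, 3)
```

Because every keyframe is assigned a cell, the grid grows with the number of keyframes, which is the sampling limitation the citing authors point out.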

“…Video search engines typically represent video as a sequence of shots, each represented by a keyframe image [Zhang et al 1995]. (A shot is a continuous sequence from one camera.)…”

Section: Motivation

mentioning

confidence: 99%

Constant density displays using diversity sampling

Derthick

Christel

Hauptmann

et al.

IEEE Symposium on Information Visualization 2003 (IEEE Cat. No.03TH8714)

The Informedia Digital Video Library user interface summarizes query results with a collage of representative keyframes. We present a user study in which keyframe occlusion caused difficulties. To use the screen space most efficiently to display images, both occlusion and wasted whitespace should be minimized. Thus optimal choices will tend toward constant density displays. However, previous constant density algorithms are based on global density, which leads to occlusion and empty space if the density is not uniform. We introduce an algorithm that considers the layout of individual objects and avoids occlusion altogether. Efficiency concerns are important for dynamic summaries of the Informedia Digital Video Library, which has hundreds of thousands of shots. Posting multiple queries that take into account parameters of the visualization as well as the original query reduces the amount of work required. This greedy algorithm is then compared to an optimal one. The approach is also applicable to visualizations containing complex graphical objects other than images, such as text, icons, or trees.
