The relative effectiveness of concept-based versus content-based video retrieval

Yang, Mu-Hoe; Wildemuth, Barbara M.; Marchionini, Gary

doi:10.1145/1027527.1027613

Cited by 21 publications

(33 citation statements)

References 6 publications

Section: Related Research

mentioning

confidence: 99%

“…Automatic Speech Recognition (ASR) technology has been developed to turn audio into text (Christel et al, 1998) and to provide textual description of the video content. Even though the quality of the ASR transcript is usually not as good as the human generated video description, they are still the primary data resource for shot level video retrieval systems (Mezaris et al., 2005; Wildemuth et al, 2004; Amir et al, 2004; Heesch et al, 2004; Cooke et al, 2004). In video retrieval, various browsing technologies are widely supported to augment text based query search, in particular when exact queries are hard to form (Carmel et al 1992). This may be because human beings are good at rapidly finding patterns, recognizing objects, generalizing or inferring information from limited data, and making relevance decisions (Helander, 1998; Shneiderman, 1998). For shot level content-based retrieval (where a shot represents a series of consecutive frames with no sudden transition), temporal neighbor browsing is the most common navigation method (Heesch et al, 2004; Wildemuth et al, 2003).…”

mentioning

confidence: 99%

“…Heesch et al (2004) found that network browsing was particularly helpful for relatively hard queries where low-level physical features (e.g., color, texture, and layout) were less informative. Yang et al (2004) demonstrated that the retrieval recall and precision between two system designs, consisting of transcript based systems (transcript only and transcript + high-level features) and feature-based system (high-level feature only), were directly linked to the search tasks. Two types of tasks were defined by the author in their study, generic topic related tasks (e.g., a kind of person, object, event, action, and geographic location) and specific topic related tasks (e.g., named person, object, event, action, and geographic location).…”

mentioning

confidence: 99%

Semantic visual features in content‐based video retrieval

2006

Proc of Assoc for Info

A new semantic visual features (e.g., car, mountain, and fire) navigation technology is proposed to improve the effectiveness of video retrieval. Traditional temporal neighbor browsing technology allows users to navigate temporal neighbors of a selected sample frame to find additional matches, while semantic visual feature browsing enables users to navigate keyframes that have similar features to the selected sample frame. A pilot evaluation was conducted to compare the effectiveness of three video retrieval designs that support 1) temporal neighbor browsing; 2) semantic visual feature browsing; and 3) fused browsing which is a combination of both temporal neighbor and semantic visual feature browsing. Two types of searching tasks: visual centric and non-visual centric tasks were applied. Initial results indicated that the semantic visual feature browsing system was more efficient for non-visual centric tasks.

Introduction

Access to digital video from news sources such as CNN, MSNBC, or ABC has become commonplace. To make digital multimedia resource discovery and search more convenient, multimedia digital libraries are being developed for research and education.

Increasingly, students or instructors are consulting video collections in search of video shots within larger video "documents" to be used in their projects or lectures. Viewing all videos in full length to find the desired video shots may be feasible for a small collection, but can be very time intensive for a large collection. The ability to search within individual videos, much in the same way that full text searching allows users to search for content instead of their bibliographic surrogates, would greatly increase access to video content.

Recent research on content-based video retrieval indicated that initially performing a text-based query and subsequently proceeding with neighbor or visual similarity browsing proved to be an effective retrieval strategy (Wildemuth et al., 2003; Heesch et al., 2004; Mezaris et al., 2004; Amir et al., 2005). Human beings are usually good at pattern recognition through navigation. A retrieval system supporting navigation functions would provide users additional means for content related searching tasks.

In this paper we propose a new video content browsing technique: semantic visual feature browsing. Our purpose is to evaluate its effectiveness as compared to traditional temporal neighbor browsing technique for two types of retrieval tasks: visual centric tasks and non-visual centric tasks. After the introduction of related research, a description of the semantic visual feature browsing algorithm will be given. The user interface of a prototype web-based video retrieval system that supports semantic visual feature browsing will be then illustrated. Finally, the methodology of a pilot user study and some initial results from the study will be presented, followed by a brief discussion.

Related Research

Video retrieval in the context of a digital library has only recently begun to be studied from a research perspective...

“…Interactive search in particular can benefit from this knowledge, since the user plays such a central role in the process. Studies have been done to measure usability of interactive retrieval systems (e.g. [4]) and effectiveness of different components of these systems ([26]). In this paper we investigate the still unclear impact of user-behaviour and user-characteristics on the performance of interactive retrieval systems.…”

Section: Introduction

mentioning

confidence: 99%

Assessing user behaviour in news video retrieval

Hollink, Nguyen, Koelma, et al. 2005

IEE Proc., Vis. Image Process.

Abstract. In this paper we present the results of a study in which we assess search behaviour of people querying a news archive using an interactive video retrieval system. 242 Search sessions by 39 participants on 24 topics were analysed. Before, during and after the study, participants filled in questionnaires about their expectations of a search. The questionnaire data, logged user actions on the system, queries formulated by users, and a quality measure of each search were studied. The results of the study show that topics concerning 'specific' people or objects were better retrieved than topics concerning 'general' objects and scenes. Users were able to estimate the overall quality of a search but did not know when the optimal result was reached within the search process. Analysis of the results at various stages in the retrieval process suggests that retrieval based on transcriptions of the speech in video data adds more to the average precision of the result than content-based image retrieval based on low-level visual features. The latter is particularly useful in providing the user with an overview of the dataset and thus an indication of the success of a search. Based on the results we discuss implications for the design of user interfaces of video retrieval systems.

“…There has been less work done in conducting user studies with regard to assessing the effectiveness of the cross-media hypothesis. In [15], the authors report on TRECVID-2003 interactive search task by comparing three systems' performances (text only, feature only, combined). According to the findings, the system which combined both text and other modal features did not perform well as expected.…”

Section: Related Work

mentioning

confidence: 99%

Investigation of the Effectiveness of Cross-Media Indexing

Yakici, Crestani

Lecture Notes in Computer Science

Abstract. Cross-media analysis and indexing leverages the individual potential of each indexing information provided by different modalities, such as speech, text and image, to improve the effectiveness of information retrieval and filtering in later stages. The process does not only constitute generating a merged representation of the digital content, such as MPEG-7, but also enriching it in order to help remedy the imprecision and noise introduced during the low-level analysis phases. It has been hypothesized that a system that combines different media descriptions of the same multi-modal audio-visual segment in a semantic space will perform better at retrieval and filtering time. In order to validate this hypothesis, we have developed a cross-media indexing system which utilises the Multiple Evidence approach by establishing links among the modality specific textual descriptions in order to depict topical similarity.
