Video Search with Context-Aware Ranker and Relevance Feedback

Lokoč, Jakub; Mejzlík, František; Souček, Tomáš; Dokoupil, Patrik; Peška, Ladislav

doi:10.1007/978-3-030-98355-0_46

Cited by 16 publications

(8 citation statements)

References 17 publications


“…[99] VISIONE [31] OpenCLIP ViT-L/14 trained with LAION-400m [60] diveXplore [100] OpenCLIP ViT-B/32 trained with LAION-2B [12], [60] 4MR [34] OpenCLIP ViT-B/32 xlm roberta base model trained with LAION-5B [13], [60] vitrivr [96] vitrivr-VR [107] CLIP [5], [86] CVHunter [71] vitrivr [96] vitrivr-VR [107] CLIP2Video [6], [45] VISIONE [31] BLIP [3], [66] QIVISE [103] CLIP4Clip [7], [77] VIREO [79] Custom cross-modal network [20], [46] combining multiple textual and visual features and employing OpenCLIP ViT-B/32 [60], [86], ResNet-152 [53], and ResNeXt-101 [80] Verge [84] ITV [116] VIREO [79] ALADIN [2], [81] VISIONE [31] custom model [24], [105] vitrivr [96] vitrivr-VR [107]

The VBS systems have greatly evolved in recent years, offering innovative approaches to efficiently explore and retrieve information from large video collections. Almost all these systems exploit joint text-visual embeddings to enhance the search experience and provide more accurate results.…”

Section: Model System (mentioning)

confidence: 99%

Evaluating Performance and Trends in Interactive Video Retrieval: Insights From the 12th VBS Competition

Vadicamo, Arnold, Bailer et al. 2024, IEEE Access


This paper conducts a thorough examination of the 12th Video Browser Showdown (VBS) competition, which is a well-established international benchmarking campaign for interactive video search systems. The annual VBS competition has witnessed a steep rise in the popularity of multimodal embedding-based approaches in interactive video retrieval. The majority of the thirteen systems participating in VBS 2023 utilized a CLIP-based cross-modal search model, allowing the specification of free-form text queries to search visual content. This shared emphasis on joint embedding models contributed to balanced performance across various teams. However, the distinguishing factors of the top-performing teams included the adept combination of multiple models and search modes, along with the capabilities of interactive interfaces to facilitate and refine the search process. Our work provides an overview of the state-of-the-art approaches employed by the participating systems and conducts a thorough analysis of their search logs, which record user interactions and results of their queries for each task. Our comprehensive examination of the VBS competition offers assessments of the effectiveness of the retrieval models employed, the browsing efficiency, and user query patterns. Additionally, it provides valuable insights into the evolving landscape of interactive video retrieval and its future challenges.
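The abstract above describes CLIP-based cross-modal search, where a free-form text query is matched against visual content in a joint embedding space. A minimal sketch of that ranking step follows, assuming the query and keyframe embeddings have already been computed by some text and image encoder. The function name `rank_by_text_query` and the toy vectors are hypothetical and are not taken from any VBS system.

```python
import numpy as np

def rank_by_text_query(query_vec, frame_vecs, top_k=3):
    """Rank frames by cosine similarity to a query embedding in a shared space."""
    # L2-normalize so that the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    f = frame_vecs / np.linalg.norm(frame_vecs, axis=1, keepdims=True)
    scores = f @ q
    # Highest similarity first.
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]

# Toy 3-dimensional embeddings standing in for real encoder outputs (assumption).
frames = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
query = np.array([0.9, 0.1, 0.0])

ranking = rank_by_text_query(query, frames, top_k=2)
print(ranking)
```

In a real system the embeddings would be high-dimensional and precomputed for every keyframe, so a query costs one text-encoder call plus a similarity search.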


“…The CVHunter [19] system was tested as a "rapid-development" based application (in WPF .NET) created in a short time period before the competition. The application used the same metadata as SOMHunter+ and provided basic browsing functions like ranked set scrolling, day summary browsing, and query by example image search.…”

Section: Participant Team Overviews (mentioning)

confidence: 99%

Comparing Interactive Retrieval Approaches at the Lifelog Search Challenge 2021

et al. 2023


The Lifelog Search Challenge (LSC) is an interactive benchmarking evaluation workshop for lifelog retrieval systems. The challenge was first organised in 2018 aiming to find the system that can quickly retrieve relevant lifelog images for a given semantic query. This paper provides an analysis of the performance of all 17 systems participating in the 4th LSC workshop held at the 2021 Annual ACM International Conference on Multimedia Retrieval (ICMR). LSC'21 was the largest effort at comparing different approaches to interactive lifelog retrieval systems seen thus far. Findings from the challenge suggest that many different interactive factors contribute to the success (or otherwise) of participating teams. In this paper, we provide an overview of the LSC'21 challenge, introduce each team's approach and explore these factors in depth and offer clues on how to develop a high-performing interactive lifelog search engine.

Index terms: lifelog, information retrieval, multimodal, analytics

At LSC'21, each of the participating teams brought a unique and customised search engine to the challenge. In this paper, we introduce the LSC challenge, describe all competing systems, and highlight the techniques and components that are employed in state-of-the-art interactive lifelog


“…Even though the performance of the video search systems vibro [13], CVHunter [14] and Visione [15] was quite similar in the VBS 2022, the video browsing tools have significant differences regarding their supported query modalities, underlying ranking models, presentation of retrieval results and browsing capabilities. However, the general approach of splitting up videos into segments (shots) and defining a representative frame (image) for each segment is used by all three systems with small differences in this procedure.…”

Section: Description Of the Systems (mentioning)

confidence: 99%

Interactive Multimodal Video Search: An Extended Post-Evaluation for the VBS 2022 Competition

Schall, Bailer, Barthel et al. 2023, Preprint

Self Cite


CLIP-based text-to-image retrieval has proven to be very effective at the interactive video retrieval competition Video Browser Showdown 2022, where all three top-scoring teams had implemented a variant of a CLIP model in their system. Since the performance of these three systems was quite close, this post-evaluation was designed to get better insights on the differences of the systems and compare the CLIP-based text-query retrieval engines by introducing slight modifications to the original competition settings. An extended analysis of the overall results and the retrieval performance of all systems' functionalities shows that a strong text retrieval model certainly helps, but has to be coupled with extensive browsing capabilities and other query-modalities to consistently solve known-item-search tasks in a large scale video database.
