Abstract: Digital image collections contain a wealth of information, which can, for instance, be used to trace illegal activities and investigate criminal networks. We present a method that enables analysts to reveal relations among people based on the patterns in their collections. Similar temporal and spatial patterns can be found using a parameterized algorithm; visualization is used to choose the right parameters and to inspect the patterns found. The visualization shows relations between image properties: the person…
“…Where this article focuses on the effect of the shuffled ImageNet bank for event detection and search, the concept bank is of interest in more research problems. The previous iteration of the bank [40] has already found applications in video captioning [11], visualizing image collections [61], and detecting violence in videos [32] amongst others. We hope that by making the shuffled ImageNet banks of this article publicly available, a broad range of multimedia problems can benefit from the concept bank representations.…”
This article aims at the detection and search of events in videos, where video examples are either scarce or even absent during training. To enable such event detection and search, ImageNet concept banks have been shown to be effective. Rather than employing the standard concept bank of 1,000 ImageNet classes, we leverage the full 21,841-class dataset. We identify two problems with using the full dataset: (i) there is an imbalance between the number of examples per concept, and (ii) not all concepts are equally relevant for events. In this article, we propose to balance large-scale image hierarchies for pre-training. We shuffle concepts based on bottom-up and top-down operations to overcome the problems of example imbalance and concept relevance. Using this strategy, we arrive at the shuffled ImageNet bank, a concept bank with an order of magnitude more concepts compared to standard ImageNet banks. Compared to standard ImageNet pre-training, our shuffles result in more discriminative representations to train event models from the limited video event examples. For event search, the broad range of concepts enables a closer match between textual queries of events and concept detections in videos. Experimentally, we show the benefit of the proposed bank for event detection and event search, with state-of-the-art performance for both tasks on the challenging TRECVID Multimedia Event Detection and Ad-Hoc Video Search benchmarks.
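The core idea of concept-bank event search can be illustrated with a minimal sketch (not the authors' implementation; the concept names, detection scores, and term-matching rule below are invented for the example): each video is summarized as a vector of concept detection scores over the bank, a textual query selects the concepts whose names match its terms, and videos are ranked by cosine similarity. A larger bank simply makes it more likely that some concept name overlaps the query.

```python
import numpy as np

# Hypothetical 5-concept bank; a shuffled ImageNet bank would have thousands.
concepts = ["dog", "bicycle", "repairing", "wheel", "kitchen"]

def query_vector(query_terms, concepts):
    """Binary indicator over the bank: 1 for concepts named in the query."""
    return np.array([1.0 if c in query_terms else 0.0 for c in concepts])

def rank_videos(video_scores, q):
    """Rank videos by cosine similarity between concept scores and the query."""
    sims = video_scores @ q / (
        np.linalg.norm(video_scores, axis=1) * np.linalg.norm(q) + 1e-9
    )
    return np.argsort(-sims)

# Rows are videos, columns are averaged per-frame concept detection scores.
videos = np.array([[0.1, 0.9, 0.8, 0.7, 0.0],   # bike-repair-like video
                   [0.8, 0.0, 0.1, 0.0, 0.9]])  # dog-in-kitchen-like video
order = rank_videos(videos, query_vector({"bicycle", "repairing", "wheel"}, concepts))
print(order)  # the bike-repair video is ranked first
```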
“…2010) arranged all images and associated metadata in a tabular layout. PICTuReVis (van der Corput and van Wijk 2017) showed that relations among people can be revealed based on image collections. StreetVizor (Shen et al.…”
Many visual analytics systems have been developed for examining scientific publications, which comprise rich data such as authors and citations. These studies provide unprecedented insights for a variety of applications, e.g., literature review and collaboration analysis. However, visual information (e.g., figures), which is widely employed for storytelling and method description, is often neglected. We present VIStory, an interactive storyboard for exploring visual information in scientific publications. We harvest a new dataset of a large corpus of figures, using an automatic figure extraction method. Each figure contains various attributes such as dominant color and width/height ratio, together with faceted metadata of the publication including venues, authors, and keywords. To depict this information, we develop an intuitive interface consisting of three components: (1) Faceted View enables efficient query by publication metadata, benefiting from a nested table structure; (2) Storyboard View arranges paper rings, a well-designed glyph for depicting figure attributes, in a ThemeRiver layout to reveal temporal trends; and (3) Endgame View presents a highlighted figure together with the publication metadata. We illustrate the applicability of VIStory with case studies on two datasets, i.e., 10 years of IEEE VIS publications, and publications by a research team at the CVPR, ICCV, and ECCV conferences. Quantitative and qualitative results from a formal user study demonstrate the efficiency of VIStory in exploring visual information in scientific publications.

Keywords: Document visualization · Image browser · Faceted metadata

1 Introduction

Publications are one of the most important outcomes of scientific research. Together with the development of science itself, substantial amounts of scientific publications have been generated.
Though digital libraries like Google Scholar and Microsoft Academic provide powerful search and browsing functionalities, they are often found ineffective for high-level tasks such as collaboration analysis. Visual analytics has gained intense interest for exploring scientific publications, as it can couple human cognition and reasoning with machines' powerful computing capacity (Keim et al. 2008). A vast number of visual analytics systems have been developed that facilitate applications including literature review and citation analysis (e.g.…
“…Serving as the epicenter of research on interactive multimedia retrieval, initiatives such as the Video Browser Showdown have produced a number of excellent analytics systems [13]. [Table residue; systems listed: [24], GraphViz [9], PIWI [35], Newdle [36], Gephi [2], CoMeRDA [5], Blackthorn [38], vitrivr [20], SIRET [12], Vibro [1], PICTuReVis [29], ISOLDE.] For example, the vitrivr system owes its good performance in interactive multimedia retrieval to an indexing structure for efficient kNN search [20]. Similarly, the SIRET tool facilitates interactive video retrieval using several querying strategies, i.e.…”
Section: Multimedia Analytics
“…Finally, while PICTuReVis [29] facilitates interactive learning for revealing relations between users based on their patterns of multimedia consumption, it is not designed for search and exploration of large social multimedia networks, but rather forensic analysis of artifacts from e.g. confiscated electronic devices, featuring a limited number of users.…”
In this paper we present a novel interactive multimodal learning system, which facilitates search and exploration in large networks of social multimedia users. It allows the analyst to identify and select users of interest, and to find similar users in an interactive learning setting. Our approach is based on novel multimodal representations of users, words and concepts, which we simultaneously learn by deploying a general-purpose neural embedding model. We show these representations to be useful not only for categorizing users, but also for automatically generating user and community profiles. Inspired by traditional summarization approaches, we create the profiles by selecting diverse and representative content from all available modalities, i.e. the text, image and user modality. The usefulness of the approach is evaluated using artificial actors, which simulate user behavior in a relevance feedback scenario. Multiple experiments were conducted in order to evaluate the quality of our multimodal representations, to compare different embedding strategies, and to determine the importance of different modalities. We demonstrate the capabilities of the proposed approach on two different multimedia collections originating from the violent online extremism forum Stormfront and the microblogging platform Twitter, which are particularly interesting due to the high semantic level of the discussions they feature.
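The key property the abstract describes, users, words, and concepts sharing one embedding space so that "find similar users" becomes nearest-neighbour search refined by relevance feedback, can be sketched as follows (a minimal illustration under assumed names and random data, not the paper's code; the Rocchio-style update stands in for whatever interactive-learning scheme the system actually uses):

```python
import numpy as np

# Hypothetical setup: 100 users embedded in a shared 32-d space.
rng = np.random.default_rng(0)
user_emb = rng.normal(size=(100, 32))
user_emb /= np.linalg.norm(user_emb, axis=1, keepdims=True)

def similar_users(query, k=5):
    """Indices of the k users closest to the query by cosine similarity."""
    sims = user_emb @ (query / np.linalg.norm(query))
    return np.argsort(-sims)[:k]

def feedback(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update: pull the query toward users marked relevant
    and push it away from users marked irrelevant."""
    q = alpha * query
    if len(relevant):
        q = q + beta * user_emb[relevant].mean(axis=0)
    if len(irrelevant):
        q = q - gamma * user_emb[irrelevant].mean(axis=0)
    return q

q = user_emb[0]                                   # analyst selects a user of interest
q = feedback(q, relevant=[3, 7], irrelevant=[42]) # one round of relevance feedback
print(similar_users(q, k=5))
```

Because words and concepts live in the same space in the paper's model, the same nearest-neighbour machinery could also retrieve representative terms for a user community, which is how summarization-style profiles become possible.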
CCS CONCEPTS: • Information systems → Multimedia and multimodal retrieval.

KEYWORDS: multimedia analytics, search, exploration, interactive learning, multimodal embeddings, online discussion forums, social multimedia

• First, compact but meaningful multimodal content representations are needed to ensure the interactivity of the system.