We consider the problem of visualizing the evolution of tags within the Flickr (flickr.com) online image sharing community. Any user of the Flickr service may append a tag to any photo in the system. Over the past year, users have on average added over a million tags each week. Understanding the evolution of these tags over time is therefore a challenging task. We present a new approach based on a characterization of the most interesting tags associated with a sliding interval of time. An animation provided via Flash in a web browser allows the user to observe and interact with the interesting tags as they evolve over time.New algorithms and data structures are required to support the efficient generation of this visualization. We combine a novel solution to an interval covering problem with extensions to previous work on score aggregation in order to create an efficient backend system capable of producing visualizations at arbitrary scales on this large dataset in real time.
We consider the problem of visualizing the evolution of tags within the Flickr (flickr.com) online image sharing community. Any user of the Flickr service may append a tag to any photo in the system. Over the past year, users have on average added over a million tags each week. Understanding the evolution of these tags over time is therefore a challenging task. We present a new approach based on a characterization of the most interesting tags associated with a sliding interval of time. An animation provided via Flash in a Web browser allows the user to observe and interact with the interesting tags as they evolve over time. New algorithms and data structures are required to support the efficient generation of this visualization. We combine a novel solution to an interval covering problem with extensions to previous work on score aggregation in order to create an efficient backend system capable of producing visualizations at arbitrary scales on this large dataset in real time.
Probabilistic inference will be of special importance when one needs to know how much we can say with what all we know given new observations. Bayesian Network is a graphical probabilistic model with which one can represent probabilistic relations intuitively and several effective algorithms for inference are developed. This paper reports a now ongoing work in its design stage which provides a vocabulary for representing probabilistic knowledge in a RDF graph which is to be mapped to a Bayesian Network to do inference on it.
Large collections of data are getting published and used more frequently, even by non-statisticians, a situation driven by the mainstreaming of big data, linked data, and open data. Often these datasets are in XML format, consisting of an unknown set of elements, attributes, namespaces, and content models. This paper describes an approach for quickly summarizing as well as guiding exploration into a non-indexed XML database. Finally, this paper demonstrates a statistical technique to approximate faceted search over large datasets, without the need of particular index configurations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.