The top-ranked documents from various information sources that are merged together into a unified ranked list may cover the same piece of relevant information, and cannot satisfy different user needs. Result diversification(RD) solves this problem by diversifying results to cover more information needs. In recent times, RD has attracted much attention as a means of increasing user satisfaction in general purpose search engines. A myriad of approaches have been proposed in the related works for the diversification problem. However, no concrete study of search result diversification has been done in a Distributed Information Retrieval(DIR) setting. In this paper, we survey, classify and propose a theoretical framework that aims at improving diversification at the result merging phase of a DIR environment.
We present StochSS: Stochastic Simulation as a Service, an integrated development environment for modeling and simulation of both deterministic and discrete stochastic biochemical systems in up to three dimensions. An easy to use graphical user interface enables researchers to quickly develop and simulate a biological model on a desktop or laptop, which can then be expanded to incorporate increasing levels of complexity. StochSS features state-of-the-art simulation engines. As the demand for computational power increases, StochSS can seamlessly scale computing resources in the cloud. In addition, StochSS can be deployed as a multi-user software environment where collaborators share computational resources and exchange models via a public model repository. We demonstrate the capabilities and ease of use of StochSS with an example of model development and simulation at increasing levels of complexity.
Collection size is an important feature to represent the content summaries of a collection, and plays a vital role in collection selection for distributed search. In uncooperative environments, collection size estimation algorithms are adopted to estimate the sizes of collections with their search interfaces. This paper proposes heterogeneous capture (HC) algorithm, in which the capture probabilities of documents are modeled with logistic regression. With heterogeneous capture probabilities, HC algorithm estimates collection size through conditional maximum likelihood. Experimental results on real web data show that our HC algorithm outperforms both multiple capture-recapture and capture history algorithms.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.