We present an efficient query evaluation method based on a two-level approach: at the first level, our method iterates in parallel over query term postings and identifies candidate documents using an approximate evaluation that takes into account only partial information on term occurrences and no query-independent factors; at the second level, promising candidates are fully evaluated and their exact scores are computed. The efficiency of the evaluation process can be improved significantly using dynamic pruning techniques with very little cost in effectiveness. The amount of pruning can be controlled by the user as a function of the time allocated for query evaluation. Experimentally, using the TREC Web Track data, we have determined that our algorithm reduces the total number of full evaluations by more than 90%, almost without any loss in precision or recall. At the heart of our approach is an efficient implementation of a new Boolean construct called WAND, or Weak AND, that might be of independent interest.
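The two-level idea can be illustrated with a minimal sketch. It assumes, since the abstract does not spell out the definition, that WAND is true when the summed upper-bound weights of the terms present in a document reach a threshold, and that the threshold is the lowest exact score currently held in the top-k heap. The names `wand`, `two_level_top_k`, `postings`, `upper_bounds`, and `full_score` are hypothetical, and the in-memory sets stand in for real postings iterators; this is not the authors' implementation.

```python
import heapq


def wand(present, upper_bounds, threshold):
    """Weak AND (assumed form): true when the upper bounds of the
    present terms sum to at least the threshold."""
    total = sum(ub for is_present, ub in zip(present, upper_bounds) if is_present)
    return total >= threshold


def two_level_top_k(postings, upper_bounds, full_score, k):
    """Return the top-k (score, doc) pairs and the number of full evaluations.

    postings:     dict term -> set of doc ids containing the term
    upper_bounds: dict term -> maximum score contribution of the term
    full_score:   callable doc id -> exact score (the expensive second level)
    """
    terms = list(postings)
    bounds = [upper_bounds[t] for t in terms]
    heap = []  # min-heap of (exact score, doc id), at most k entries
    full_evaluations = 0
    for doc in sorted(set().union(*postings.values())):
        # First level: cheap test using only term presence and upper bounds.
        threshold = heap[0][0] if len(heap) == k else 0.0
        present = [doc in postings[t] for t in terms]
        if not wand(present, bounds, threshold):
            continue  # pruned: the document cannot enter the top k
        # Second level: exact scoring of a promising candidate.
        full_evaluations += 1
        score = full_score(doc)
        if len(heap) < k:
            heapq.heappush(heap, (score, doc))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, doc))
    return sorted(heap, reverse=True), full_evaluations


postings = {"a": {1, 2, 3, 4}, "b": {2, 4}}
upper_bounds = {"a": 1.0, "b": 2.0}
exact = {1: 0.5, 2: 2.5, 3: 0.8, 4: 3.0}
top, n_full = two_level_top_k(postings, upper_bounds, exact.get, k=1)
print(top, n_full)  # [(3.0, 4)] 3
```

In this toy run, document 3 is skipped without full scoring because its upper bound (1.0) is below the current threshold (2.5). With unit weights, a threshold equal to the number of terms makes `wand` behave like AND and a threshold of 1 makes it behave like OR, which is where the name Weak AND comes from.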
When faced with ambiguous sensory input, conscious awareness may alternate between the different percepts that are consistent with the input. Visual phenomena leading to such multistable perception, where constant sensory input evokes different conscious percepts, are particularly useful for investigating the processes underlying perceptual awareness [1]. Understanding the role that high-level brain regions outside early visual cortex play in perceptual alternations could elucidate how top-down processes modulate conscious perception [2]. In two studies [3,4] published recently in Current Biology, different combinations of the present authors used repetitive transcranial magnetic stimulation (rTMS) to disrupt activity in human superior parietal cortex, and reported seemingly contradictory results [5] concerning the effect of disrupting the normal function of this area on bistable perception. Here we join forces to resolve this discrepancy.
This work tries to answer the question of what makes a query difficult. It introduces a novel model that captures the main components of a topic and the relationship between those components and topic difficulty. The three components of a topic are the textual expression describing the information need (the query or queries), the set of documents relevant to the topic (the Qrels), and the entire collection of documents. We show experimentally that topic difficulty strongly depends on the distances between these components. When one of the model components is unknown, the model remains useful because the missing component can be approximated from the other components. We demonstrate the applicability of the difficulty model for several uses such as predicting query difficulty, predicting the number of topic aspects expected to be covered by the search results, and analyzing the findability of a specific domain.
Predicting query performance, that is, the effectiveness of a search performed in response to a query, is a highly important and challenging problem. Our novel approach to addressing this challenge is based on estimating the potential amount of query drift in the result list, i.e., the presence (and dominance) of aspects or topics not related to the query in top-retrieved documents. We argue that query-drift can potentially be estimated by measuring the diversity (e.g., standard deviation) of the retrieval scores of these documents. Empirical evaluation demonstrates the prediction effectiveness of our approach for several retrieval models. Specifically, the prediction success is better, over most tested TREC corpora, than that of state-of-the-art prediction methods.
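As a rough illustration of the score-diversity idea, the sketch below computes the standard deviation of the retrieval scores of the top-ranked documents. It is a minimal sketch under stated assumptions, not the predictor evaluated in the paper: the function name `score_std_predictor`, the cutoff `k`, and the choice of the population standard deviation are all hypothetical, and any normalization the authors may apply is omitted.

```python
import statistics


def score_std_predictor(scores, k=100):
    """Standard deviation of the k highest retrieval scores.

    scores: iterable of retrieval scores for the result list
    k:      number of top-ranked documents to consider (assumed cutoff)
    """
    top = sorted(scores, reverse=True)[:k]
    if len(top) < 2:
        return 0.0  # spread is undefined for fewer than two scores
    return statistics.pstdev(top)


print(score_std_predictor([1.0, 3.0], k=2))  # 1.0
```

Scores from different retrieval models live on different scales, so a value computed this way is only comparable across queries scored by the same model.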
This experiment was designed to examine the external validity of the standard mock-crime procedure used extensively to evaluate the validity of polygraph tests. The authors manipulated the type of mock-crime procedure (standard vs. a more realistic version) and the time of test (immediate vs. delayed) and examined their effects on the validity of the Guilty Knowledge Test (GKT) and the recall rate of the relevant items. The results indicated that only the type of mock-crime affected the 2 outcome variables. The realistic procedure was associated with a lower recall rate and weaker detection efficiency than the standard procedure. However, these effects were mediated by the type of GKT questions used. Practical implications of these results are discussed.