Abstract: An outer bound on the rate region of noise-free information networks is given. This outer bound combines properties of entropy with a strong information inequality derived from the structure of the network. This blend of information theoretic and graph theoretic arguments generates many interesting results. For example, the capacity of directed cycles is characterized. Also, a gap between the sparsity of an undirected graph and its capacity is shown. Extending this result, it is shown that multicommodity flow solutions achieve the capacity in an infinite class of undirected graphs, thereby making progress on a conjecture of Li and Li. This result is in sharp contrast to the situation with directed graphs, where a family of graphs is presented in which the gap between the capacity and the rate achievable using multicommodity flows is linear in the size of the graph.
We consider the problem of duplicate document detection for search evaluation. Given a query and a small number of web results for that query, we show how to detect duplicate web documents with precision ∼ 0.91 and recall ∼ 0.77. In contrast, Charikar's algorithm, designed for duplicate detection in an indexing pipeline, achieves precision ∼ 0.91 but with a recall of ∼ 0.58. Our improvement in recall while maintaining high precision comes from combining three ideas. First, because we are only concerned with duplicate detection among results for the same query, the number of pairwise comparisons is small. Therefore we can afford to compute multiple pairwise signals for each pair of documents. A model learned with standard machine-learning techniques improves recall to ∼ 0.68 with precision ∼ 0.90. Second, most duplicate detection has focused on text analysis of the HTML contents of a document. In some web pages the HTML is not a good indicator of the final contents of the page. We use extended fetching techniques to fill in frames and execute JavaScript. Including signals based on our richer fetches further improves the recall to ∼ 0.75 and the precision to ∼ 0.91. Finally, we also explore using signals based on the query. Comparing contextual snippets based on the richer fetches improves the recall to ∼ 0.77. We show that the overall accuracy of this final model approaches that of human judges.
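As a rough illustration of the baseline mentioned above, Charikar's algorithm is commonly implemented as a "simhash" fingerprint compared by Hamming distance. The sketch below is one such variant, not the pipeline evaluated in the abstract: the whitespace tokenization, the MD5-based token hash, the 64-bit width, and the `max_distance` threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import hashlib

BITS = 64  # fingerprint width (assumption; taken from the first 8 bytes of MD5)


def simhash(tokens):
    """Return a 64-bit simhash fingerprint for an iterable of string tokens."""
    counts = [0] * BITS
    for tok in tokens:
        # Hash each token to a 64-bit integer.
        h = int.from_bytes(hashlib.md5(tok.encode("utf-8")).digest()[:8], "big")
        # Each bit of the token hash votes +1 or -1 for that bit position.
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set when the votes for that position are positive.
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a, text_b, max_distance=3):
    """Flag two texts as near-duplicates if their fingerprints are close.

    max_distance is a hypothetical threshold, not a value from the abstract.
    """
    fp_a = simhash(text_a.lower().split())
    fp_b = simhash(text_b.lower().split())
    return hamming_distance(fp_a, fp_b) <= max_distance
```

A single fingerprint comparison like this is cheap enough for an indexing pipeline. The abstract's point is that, with only a few results per query, one can afford several richer pairwise signals instead of this one.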
In traditional game theory, players are typically endowed with exogenously given knowledge of the structure of the game: either full omniscient knowledge or partial but fixed information. In real life, however, people are often unaware of the utility of taking a particular action until they perform research into its consequences. In this paper, we model this phenomenon. We imagine a player engaged in a question-and-answer session, asking questions both about his or her own preferences and about the state of reality; thus we call this setting "Socratic" game theory. In a Socratic game, players begin with an a priori probability distribution over many possible worlds, with a different utility function for each world. Players can make queries, at some cost, to learn partial information about which of the possible worlds is the actual world, before choosing an action. We consider two query models: (1) an unobservable-query model, in which players learn only the response to their own queries, and (2) an observable-query model, in which players also learn which queries their opponents made. The results in this paper consider cases in which the underlying worlds of a two-player Socratic game are either constant-sum games or strategically zero-sum games, a class that generalizes constant-sum games to include all games in which the sum of payoffs depends linearly on the interaction between the players. When the underlying worlds are constant sum, we give polynomial-time algorithms to find Nash equilibria in both the observable- and unobservable-query models. When the worlds are strategically zero sum, we give efficient algorithms to find Nash equilibria in unobservable-query Socratic games and correlated equilibria in observable-query Socratic games.
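The trade-off between the cost of a query and the information it yields can be sketched in a deliberately simplified setting: a single decision maker with no opponent, which is not the two-player model analyzed in the paper. In this sketch a query is assumed to be a partition of the possible worlds, and its answer reveals which block contains the actual world. All names and the example numbers are hypothetical.

```python
def best_unnormalized_utility(prior, utility, worlds):
    """Max over actions of sum_w prior[w] * utility[w][a], for w in `worlds`.

    The sum is left unnormalized so that adding the results over the blocks
    of a partition gives the overall expected utility directly.
    """
    num_actions = len(next(iter(utility.values())))
    return max(
        sum(prior[w] * utility[w][a] for w in worlds)
        for a in range(num_actions)
    )


def value_without_query(prior, utility):
    """Expected utility of the best single action chosen under the prior."""
    return best_unnormalized_utility(prior, utility, list(prior))


def value_with_query(prior, utility, partition, cost):
    """Expected utility when the player first learns the block of the actual
    world, then picks the best action for that block, minus the query cost."""
    gained = sum(
        best_unnormalized_utility(prior, utility, block) for block in partition
    )
    return gained - cost


# Hypothetical example: two equally likely worlds, two actions.
prior = {"rain": 0.5, "sun": 0.5}
utility = {"rain": [1.0, 0.0], "sun": [0.0, 1.0]}  # utility[world][action]

no_query = value_without_query(prior, utility)
with_query = value_with_query(prior, utility, [["rain"], ["sun"]], cost=0.1)
```

Here the query is worth making whenever `with_query` exceeds `no_query`. In the game-theoretic setting of the abstract, the opponent's queries and actions also affect each player's payoff, which this sketch ignores.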