Recent work has demonstrated that Web search volume can "predict the present," meaning that it can be used to accurately track outcomes such as unemployment levels, auto and home sales, and disease prevalence in near real time. Here we show that what consumers are searching for online can also predict their collective future behavior days or even weeks in advance. Specifically, we use search query volume to forecast the opening weekend box-office revenue for feature films, first-month sales of video games, and the rank of songs on the Billboard Hot 100 chart, finding in all cases that search counts are highly predictive of future outcomes. We also find that search counts generally boost the performance of baseline models fit on other publicly available data, where the boost varies from modest to dramatic, depending on the application in question. Finally, we reexamine previous work on tracking flu trends and show that, perhaps surprisingly, the utility of search data relative to a simple autoregressive model is modest. We conclude that in the absence of other data sources, or where small improvements in predictive performance are material, search queries provide a useful guide to the near future.

As people increasingly turn to the Internet for news, information, and research purposes, it is tempting to view online activity at any moment in time as a snapshot of the collective consciousness, reflecting the instantaneous interests, concerns, and intentions of the global population (1, 2). From this perspective, it is a short step to conclude that what people are searching for today is predictive of what they will do in the near future. Consumers contemplating buying a new camera may search to compare models; moviegoers may search to determine the opening date of a new film, or to locate cinemas showing it; and individuals planning a vacation may search for places of interest, to find airline tickets, or to price hotel rooms.
If so, it follows that by appropriately aggregating counts of search queries related to retail activity, moviegoing, or travel, one might be able to predict collective behavior of economic, cultural, or political interest. Determining the nature of behavior that can be predicted using search, the accuracy of such predictions, and the time scale over which predictions can be usefully made are therefore all questions of interest.

Although previous work has considered the relation between search volume and offline outcomes, researchers have focused on the observation that search "predicts the present" (3, 4), meaning that search volume correlates with contemporaneous events. One study (8) showed that search volume for handpicked influenza-related queries was correlated with subsequently reported caseloads over the period 2004–2008, and Hulth et al. (9) found similar results in a study of search queries submitted on a Swedish medical Web site. An automated procedure for identifying informative queries is described in Ginsberg et al. (10), and based on that methodology, Google Flu Trend...
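The comparison between a simple autoregressive baseline and a model augmented with lagged search volume can be sketched on synthetic data. This is a minimal illustration of the modeling idea only, not the paper's data or methodology; the series, coefficients, and in-sample evaluation are all invented for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic illustration (not the paper's data): weekly outcome y_t follows an
# AR(1) process, and a noisy "search volume" series s_t leads it by one week.
T = 200
y = np.zeros(T)
s = np.zeros(T)
for t in range(1, T):
    s[t] = 0.5 * s[t - 1] + rng.normal()
    y[t] = 0.8 * y[t - 1] + 0.3 * s[t - 1] + 0.2 * rng.normal()

def one_step_mse(X, target):
    """In-sample one-step-ahead MSE of an OLS fit (illustrative only)."""
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.mean((target - X @ beta) ** 2)

target = y[1:]
ar_only = np.column_stack([np.ones(T - 1), y[:-1]])            # AR(1) baseline
ar_search = np.column_stack([np.ones(T - 1), y[:-1], s[:-1]])  # + lagged search

mse_ar = one_step_mse(ar_only, target)
mse_full = one_step_mse(ar_search, target)
print(f"AR baseline MSE: {mse_ar:.3f}, AR + search MSE: {mse_full:.3f}")
```

Because the two models are nested, the augmented fit can never do worse in sample; the substantive question the abstract raises is how large the out-of-sample improvement is in practice.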
Keyword auctions lie at the core of the business models of today's leading search engines. Advertisers bid for placement alongside search results, and are charged for clicks on their ads. Advertisers are typically ranked according to a score that takes into account their bids and potential click-through rates. We consider a family of ranking rules that contains those typically used to model Yahoo! and Google's auction designs as special cases. We find that in general neither of these is necessarily revenue-optimal in equilibrium, and that the choice of ranking rule can be guided by considering the correlation between bidders' values and click-through rates. We propose a simple approach to determine a revenue-optimal ranking rule within our family, taking into account effects on advertiser satisfaction and user experience. We illustrate the approach using Monte Carlo simulations based on distributions fitted to Yahoo! bid and click-through rate data for a high-volume keyword.
Billions of dollars are spent each year on sponsored search, a form of advertising where merchants pay for placement alongside web search results. Slots for ad listings are allocated via an auction-style mechanism where the higher a merchant bids, the more likely his ad is to appear above other ads on the page. In this paper we analyze the incentive, efficiency, and revenue properties of two slot auction designs: "rank by bid" (RBB) and "rank by revenue" (RBR), which correspond to stylized versions of the mechanisms currently used by Yahoo! and Google, respectively. We also consider first- and second-price payment rules together with each of these allocation rules, as both have been used historically. We consider both the "short-run" incomplete information setting and the "long-run" complete information setting. With incomplete information, neither RBB nor RBR is truthful with either first or second pricing. We find that the informational requirements of RBB are much weaker than those of RBR, but that RBR is efficient whereas RBB is not. We also show that no revenue ranking of RBB and RBR is possible given an arbitrary distribution over bidder values and relevance. With complete information, we find that no equilibrium exists with first pricing using either RBB or RBR. We show that there typically exists a multitude of equilibria with second pricing, and we bound the divergence of (economic) value in such equilibria from the value obtained assuming all merchants bid truthfully.
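The two allocation rules contrasted in the abstract can be stated in a few lines. The bidders and their (bid, estimated CTR) pairs below are hypothetical, chosen so that the two rules produce different slot orderings.

```python
# Hypothetical bidders: name -> (bid in $, estimated click-through rate).
bidders = {"a": (2.0, 0.10), "b": (1.5, 0.30), "c": (1.0, 0.20)}

def rank_by_bid(bids):
    # RBB: order slots purely by bid, ignoring relevance.
    return sorted(bids, key=lambda i: bids[i][0], reverse=True)

def rank_by_revenue(bids):
    # RBR: order slots by bid * estimated CTR, i.e. expected revenue per impression.
    return sorted(bids, key=lambda i: bids[i][0] * bids[i][1], reverse=True)

print(rank_by_bid(bidders))      # ['a', 'b', 'c']
print(rank_by_revenue(bidders))  # ['b', 'a', 'c']: b's score 0.45 beats a's 0.20
```

Bidder "a" tops the RBB ordering on bid alone, but RBR demotes it because its low CTR makes its expected revenue per impression smaller than "b"'s, which is exactly the informational difference the abstract analyzes: RBR needs CTR estimates that RBB does not.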
We study fair allocation of indivisible goods to agents with unequal entitlements. Fair allocation has been the subject of many studies in both divisible and indivisible settings. Our emphasis is on the case where the goods are indivisible and agents have unequal entitlements. This problem is a generalization of the work by Procaccia and Wang [20], wherein the agents are assumed to be symmetric with respect to their entitlements. Although Procaccia and Wang show an almost fair (constant approximation) allocation exists in their setting, our main result is in sharp contrast to their observation. We show that, in some cases with n agents, no allocation can guarantee better than 1/n approximation of a fair allocation when the entitlements are not necessarily equal. Furthermore, we devise a simple algorithm that ensures a 1/n approximation guarantee.

Our second result is for a restricted version of the problem where the valuation of every agent for each good is bounded by the total value he wishes to receive in a fair allocation. Although this assumption might seem to be without loss of generality, we show it enables us to find a 1/2 approximation fair allocation via a greedy algorithm. Finally, we run some experiments on real-world data and show that, in practice, a fair allocation is likely to exist. We also support our experiments by showing positive results for two stochastic variants of the problem, namely stochastic agents and stochastic items.
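The abstract does not spell out its algorithms, so the sketch below is a hypothetical greedy allocation illustrating the general shape of the setting, not the authors' 1/n or 1/2 approximation algorithms. The function name, the entitlement-weighted "target" for each agent, and the tie-breaking are all invented for illustration.

```python
# Hypothetical greedy sketch (not the paper's algorithm): consider goods in
# decreasing value and give each to the agent whose entitlement-weighted
# target is furthest from being met.
def greedy_allocate(values, entitlements):
    """values[i][g]: agent i's value for good g; entitlements sum to 1."""
    n, m = len(values), len(values[0])
    # Each agent's illustrative "fair target": her entitlement share of her
    # own total value for all goods.
    targets = [entitlements[i] * sum(values[i]) for i in range(n)]
    received = [0.0] * n
    bundles = [[] for _ in range(n)]
    goods = sorted(range(m),
                   key=lambda g: max(values[i][g] for i in range(n)),
                   reverse=True)
    for g in goods:
        # Pick the agent with the largest unmet fraction of her target.
        i = max(range(n),
                key=lambda i: (targets[i] - received[i]) / targets[i]
                if targets[i] else 0)
        bundles[i].append(g)
        received[i] += values[i][g]
    return bundles

bundles = greedy_allocate([[3.0, 2.0, 1.0], [1.0, 2.0, 3.0]], [0.5, 0.5])
print(bundles)  # each good assigned to exactly one agent
```

The point of the sketch is only the structure of the problem: indivisible goods, per-agent valuations, and entitlements that weight each agent's fair share; any worst-case guarantee depends on the specific rule proved in the paper.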
We consider the parallels between the preference elicitation problem in combinatorial auctions and the problem of learning an unknown function from learning theory. We show that learning algorithms can be used as a basis for preference elicitation algorithms. The resulting elicitation algorithms perform a polynomial number of queries. We also give conditions under which the resulting algorithms have polynomial communication. Our conversion procedure allows us to generate combinatorial auction protocols from learning algorithms for polynomials, monotone DNF, and linear-threshold functions. In particular, we obtain an algorithm that elicits XOR bids with polynomial communication.
In the standard model of sponsored search auctions, an ad is ranked according to the product of its bid and its estimated click-through rate (known as the quality score), where the estimates are taken as exact. This paper re-examines the form of the efficient ranking rule when uncertainty in click-through rates is taken into account. We provide a sufficient condition under which applying an exponent, strictly less than one, to the quality score improves expected efficiency. The condition holds for a large class of distributions known as natural exponential families, and for the lognormal distribution. An empirical analysis of Yahoo!'s sponsored search logs reveals that exponent settings substantially smaller than one can be efficient for both high and low volume keywords, implying substantial deviations from the traditional ranking rule.
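The effect of an exponent below one on the ranking can be seen with two hypothetical ads: one with a high bid and low estimated CTR, one with the reverse. The numbers below are invented; the code only shows how dampening the CTR term can flip the ordering relative to the standard bid-times-CTR rule.

```python
# Illustrative "squashed" ranking: score = bid * ctr**alpha. With alpha < 1
# the estimated click-through rate has less influence on the ordering.
def squashed_rank(ads, alpha):
    """ads: name -> (bid, estimated_ctr); returns names by score, descending."""
    return sorted(ads, key=lambda a: ads[a][0] * ads[a][1] ** alpha, reverse=True)

ads = {"x": (4.0, 0.05), "y": (1.0, 0.40)}
print(squashed_rank(ads, alpha=1.0))  # ['y', 'x']: standard rule, 0.40 vs 0.20
print(squashed_rank(ads, alpha=0.5))  # ['x', 'y']: squashed, ~0.894 vs ~0.632
```

With alpha = 1 this is the traditional quality-score rule; lowering alpha shifts weight toward the bid, which is why the efficient choice of exponent depends on how uncertain the CTR estimates are.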