Robots are deployed by a Web search engine for collecting information from different Web servers in order to maintain the currency of its data base of Web pages. In this paper, we investigate the number of robots to be used by a search engine so as to maximize the currency of the data base without putting an unnecessary load on the network. We adopt a finite-buffer queueing model to represent the system. The arrivals to the queuelng system are Web pages brought by the robots; service corresponds to the indexing of these pages. Good performance requires that the number of robots, and thus the arrival rate of the queueing system, be chosen so that the indexing queue is rarely starved or saturated. Thus, we formulate a multi-criteria stochastic optimization problem with the loss rate and empty-buffer probability being the criteria. We take the common approach of reducing the problem to one with a single objective that is a linear function of the given criteria. Both static and dynamic policies can be considered. In the static setting the number of robots is held fixed; in the dynamic setting robots may be re-activated/de-activated as a function of the state. Under the assumption that arrivals form a Poisson process and that service times are independent and exponentially distributed random variables, we determine an optimal decision rule for the dynamic setting, i.e., a rule that varies the number of robots in such a way as to minimize a given linear function of the loss rate and empty-buffer probability. Our results are compared with known results for the static case. A numerical study indicates that substantial gains can be achieved by dynamically controlling the activity of the robots.Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L'archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d'enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.