In the first part of the paper we consider the problem of dynamically apportioning resources among a set of options in a worst-case on-line framework. The model we study can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting. We show that the multiplicative weight-update rule of Littlestone and Warmuth can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems. We show how the resulting learning algorithm can be applied to a variety of problems, including gambling, multiple-outcome prediction, repeated games, and prediction of points in R^n. In the second part of the paper we apply the multiplicative weight-update technique to derive a new boosting algorithm. This boosting algorithm does not require any prior knowledge about the performance of the weak learning algorithm. We also study generalizations of the new boosting algorithm to the problem of learning functions whose range, rather than being binary, is an arbitrary finite set or a bounded segment of the real line. © 1997 Academic Press
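For concreteness, here is a minimal sketch of the multiplicative weight-update idea in the spirit of this paper's Hedge setting: each option's weight decays exponentially in its cumulative loss, so the allocation concentrates on the better options over time. The function name and the choice of beta are illustrative assumptions, not the paper's tuned constants.

```python
import numpy as np

def hedge(loss_stream, n_options, beta=0.9):
    """Multiplicative weight-update loop (Hedge-style sketch).

    loss_stream yields, each round, an array of per-option losses in [0, 1].
    beta in (0, 1) controls how aggressively weights shrink; the default
    here is illustrative. Yields the probability vector used to apportion
    resources each round.
    """
    weights = np.ones(n_options)
    for losses in loss_stream:
        yield weights / weights.sum()                    # current allocation
        weights = weights * beta ** np.asarray(losses)   # penalize lossy options
```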
Abstract. In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received much attention because of the simple model it provides of the trade-off between exploration (trying out each arm to find the best one) and exploitation (playing the arm believed to give the best payoff). Past solutions for the bandit problem have almost always relied on assumptions about the statistics of the slot machines. In this work, we make no statistical assumptions whatsoever about the nature of the process generating the payoffs of the slot machines. We give a solution to the bandit problem in which an adversary, rather than a well-behaved stochastic process, has complete control over the payoffs. In a sequence of T plays, we prove that the per-round payoff of our algorithm approaches that of the best arm at the rate O(T^{-1/2}). We show by a matching lower bound that this is the best possible. We also prove that our algorithm approaches the per-round payoff of any set of strategies at a similar rate: if the best strategy is chosen from a pool of N strategies, then our algorithm approaches the per-round payoff of that strategy at the rate O((log N)^{1/2} T^{-1/2}). Finally, we apply our results to the problem of playing an unknown repeated matrix game. We show that our algorithm approaches the minimax payoff of the unknown game at the rate O(T^{-1/2}).
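A sketch in the spirit of the Exp3 algorithm this abstract describes: mix exponential weights with uniform exploration, and use importance weighting so that only the observed payoff is needed. The function signature and the default gamma are illustrative assumptions; the paper tunes gamma as a function of T and K to obtain the O(T^{-1/2}) rate.

```python
import numpy as np

def exp3(pull, K, T, gamma=0.1, rng=None):
    """Exp3-style strategy for the adversarial bandit problem.

    pull(arm) must return a payoff in [0, 1]; gamma mixes in uniform
    exploration. Returns the total payoff collected over T plays.
    """
    rng = rng or np.random.default_rng()
    w = np.ones(K)
    total = 0.0
    for _ in range(T):
        p = (1 - gamma) * w / w.sum() + gamma / K   # exploit + explore
        arm = rng.choice(K, p=p)
        x = pull(arm)
        total += x
        w[arm] *= np.exp(gamma * (x / p[arm]) / K)  # importance-weighted update
    return total
```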
Abstract. One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero. In this paper, we show that this phenomenon is related to the distribution of margins of the training examples with respect to the generated voting classification rule, where the margin of an example is simply the difference between the number of correct votes and the maximum number of votes received by any incorrect label. We show that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error. We also show theoretically and experimentally that boosting is especially effective at increasing the margins of the training examples. Finally, we compare our explanation to those based on the bias-variance decomposition.
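The margin definition in this abstract is simple enough to state directly. A minimal sketch, assuming votes are tallied per label and margins are normalized by the total vote count (a common convention, assumed here rather than stated in the abstract):

```python
import numpy as np

def voting_margins(vote_counts, correct_labels):
    """Margins of training examples under a voting classifier.

    vote_counts: (n_examples, n_labels) array of votes cast for each label.
    correct_labels: length-n_examples array of true label indices.
    Margin = votes for the correct label minus the most votes received by
    any incorrect label, normalized by the total number of votes.
    """
    idx = np.arange(len(correct_labels))
    correct_votes = vote_counts[idx, correct_labels].astype(float)
    others = vote_counts.astype(float).copy()
    others[idx, correct_labels] = -np.inf          # mask out the correct label
    best_wrong = others.max(axis=1)                # strongest incorrect label
    return (correct_votes - best_wrong) / vote_counts.sum(axis=1)
```

A positive margin means the example is classified correctly, and larger margins indicate a more confident vote; the paper relates the distribution of these values on the training set to the test error.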
We introduce and analyze a new algorithm for linear classification which combines Rosenblatt's perceptron algorithm with Helmbold and Warmuth's leave-one-out method. Like Vapnik's maximal-margin classifier, our algorithm takes advantage of data that are linearly separable with large margins. Compared to Vapnik's algorithm, however, ours is much simpler to implement, and much more efficient in terms of computation time. We also show that our algorithm can be efficiently used in very high dimensional spaces using kernel functions. We performed some experiments using our algorithm, and some variants of it, for classifying images of handwritten digits. The performance of our algorithm is close to, but not as good as, the performance of maximal-margin classifiers on the same problem.
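A minimal sketch of the voted variant of the perceptron described above, assuming labels in {-1, +1}; the kernel version used for very high dimensional spaces would replace each dot product with kernel evaluations over the stored mistake examples.

```python
import numpy as np

def train_voted_perceptron(X, y, epochs=10):
    """Voted-perceptron-style trainer (labels y in {-1, +1}).

    Each intermediate weight vector is stored with a count of how many
    examples it classified correctly before the next mistake; prediction
    is a survival-weighted vote among these vectors.
    """
    w, count, voters = np.zeros(X.shape[1]), 1, []
    for _ in range(epochs):
        for x, label in zip(X, y):
            if label * (w @ x) <= 0:           # mistake: retire current vector
                voters.append((w.copy(), count))
                w, count = w + label * x, 1
            else:
                count += 1
    voters.append((w, count))
    return voters

def predict_voted(voters, x):
    """Sign of the survival-weighted vote; ties break to +1."""
    return 1 if sum(c * np.sign(w @ x) for w, c in voters) >= 0 else -1
```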
In the multi-armed bandit problem, a gambler must decide which arm of K non-identical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received much attention because of the simple model it provides of the trade-off between exploration (trying out each arm to find the best one) and exploitation (playing the arm believed to give the best payoff). Past solutions for the bandit problem have almost always relied on assumptions about the statistics of the slot machines. In this work, we make no statistical assumptions whatsoever about the nature of the process generating the payoffs of the slot machines. We give a solution to the bandit problem in which an adversary, rather than a well-behaved stochastic process, has complete control over the payoffs. In a sequence of T plays, we prove that the expected per-round payoff of our algorithm approaches that of the best arm at the rate O(T^{-1/2}), and we give an improved rate of convergence when the best arm has fairly low payoff. We also prove a general matching lower bound on the best possible performance of any algorithm in our setting. In addition, we consider a setting in which the player has a team of "experts" advising him on which arm to play; here, we give a strategy that will guarantee expected payoff close to that of the best expert. Finally, we apply our result to the problem of learning to play an unknown repeated matrix game against an all-powerful adversary.
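The expert-advice setting mentioned here admits a similar sketch to the bandit strategy above: maintain weights over experts rather than arms, and credit each expert by how much its advice would have earned under an importance-weighted payoff estimate. An Exp4-style sketch, with all names and parameter choices being illustrative assumptions:

```python
import numpy as np

def exp4(pull, advice, K, N, T, gamma=0.1, rng=None):
    """Expert-advice extension of the exponential-weights bandit strategy.

    advice(t) returns an (N, K) row-stochastic matrix: row j is expert j's
    recommended distribution over the K arms at round t. pull(arm) returns
    a payoff in [0, 1]. Returns the total payoff collected over T plays.
    """
    rng = rng or np.random.default_rng()
    w = np.ones(N)                                  # one weight per expert
    total = 0.0
    for t in range(T):
        xi = advice(t)
        p = (1 - gamma) * (w @ xi) / w.sum() + gamma / K
        arm = rng.choice(K, p=p)
        x = pull(arm)
        total += x
        xhat = np.zeros(K)
        xhat[arm] = x / p[arm]                      # unbiased payoff estimate
        w *= np.exp(gamma * (xi @ xhat) / K)        # reweight experts
    return total
```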
Cell motility proceeds by cycles of edge protrusion, adhesion, and retraction. Whether these functions are coordinated by biochemical or biomechanical processes is unknown. We find that myosin II pulls the rear of the lamellipodial actin network, causing upward bending, edge retraction, and initiation of new adhesion sites. The network then separates from the edge and condenses over the myosin. Protrusion resumes as lamellipodial actin regenerates from the front and extends rearward until it reaches newly assembled myosin, initiating the next cycle. Upward bending, observed by evanescence and electron microscopy, results in ruffle formation when adhesion strength is low. Correlative fluorescence and electron microscopy shows that the regenerating lamellipodium forms a cohesive, separable layer of actin above the lamellum. Thus, actin polymerization periodically builds a mechanical link, the lamellipodium, connecting myosin motors with the initiation of adhesion sites, suggesting that the major functions driving motility are coordinated by a biomechanical process.