We analyze reinforcement learning under so-called "dynamic reinforcement." In reinforcement learning, each agent repeatedly interacts with an unknown environment (i.e., the other agents), receives a reward, and updates the probabilities of its next action based on its own previous actions and received rewards. Unlike standard reinforcement learning, dynamic reinforcement uses a combination of long-term and recent rewards to construct myopically forward-looking action selection probabilities. We analyze the long-term stability of the learning dynamics for general games with pure-strategy Nash equilibria and specialize the results to coordination games and distributed network formation. In this class of problems, more than one stable equilibrium (i.e., coordination configuration) may exist. We demonstrate equilibrium selection under dynamic reinforcement. In particular, we show how a single agent can destabilize one equilibrium in favor of another by appropriately adjusting its dynamic reinforcement parameters. We contrast these conclusions with prior game-theoretic results according to which the risk-dominant equilibrium is the only robust equilibrium when agents' decisions are subject to small randomized perturbations. The analysis throughout is based on the ODE method for stochastic approximations, where a special form of perturbation in the learning dynamics allows us to analyze its behavior at the boundary points of the state space.
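To make the abstract's description concrete, here is a minimal Python sketch of the kind of update it describes: a slowly updated long-term reward estimate is blended with the most recent reward into an action propensity, and action probabilities carry a small uniform perturbation so every action stays playable (the kind of perturbation that permits analysis at the boundary of the state space). The function names, the blending weight, and the step size are illustrative assumptions, not the paper's exact scheme.

```python
import random

def select_action(propensities, eps=0.01):
    # Sample an action proportionally to its propensity, mixed with a
    # small uniform perturbation eps so no action probability hits zero.
    n = len(propensities)
    total = sum(propensities)
    probs = [(1 - eps) * p / total + eps / n for p in propensities]
    r, acc = random.random(), 0.0
    for a, p in enumerate(probs):
        acc += p
        if r < acc:
            return a
    return n - 1

def dynamic_reinforce(long_term, recent, action, reward, step=0.1, weight=0.5):
    # Blend a slowly tracked long-term reward estimate with the most
    # recent reward into one propensity per action (hypothetical weighting).
    long_term[action] += step * (reward - long_term[action])
    recent[action] = reward
    return [weight * lt + (1 - weight) * rc
            for lt, rc in zip(long_term, recent)]
```

A single step with reward 1.0 for action 0 nudges its long-term estimate up by `step` while the recent-reward term jumps immediately, so the blended propensity reacts faster than a pure long-term average would.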
We consider the problem of distributed convergence to efficient outcomes in coordination games through payoff-based learning dynamics, namely aspiration learning. The proposed learning scheme assumes that players reinforce well-performing actions by continuing to play them; otherwise, they randomize among alternative actions. Our first contribution is the characterization of the asymptotic behavior of the Markov chain induced by the iterated process through an equivalent finite-state Markov chain, which simplifies previously introduced analyses of aspiration learning. We then explicitly characterize the behavior of the proposed aspiration learning in a generalized version of so-called coordination games, an example of which is network formation games. In particular, we show that in coordination games the expected percentage of time that the efficient action profile is played can become arbitrarily large.
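The rule the abstract describes can be sketched in a few lines: keep the current action if its payoff met the player's aspiration level, otherwise randomize, while the aspiration itself tracks received payoffs. This is a minimal illustration under assumed parameters (the aspiration step size and the exact satisfaction test are hypothetical), not the paper's precise dynamics.

```python
import random

def aspiration_step(action, aspiration, payoff, n_actions, step=0.1):
    # One iteration of aspiration learning (illustrative parameterization):
    # a satisfied player repeats its action; a dissatisfied one explores.
    if payoff >= aspiration:
        next_action = action                       # reinforce the action
    else:
        next_action = random.randrange(n_actions)  # randomize
    # The aspiration level drifts toward the payoffs actually received.
    next_aspiration = aspiration + step * (payoff - aspiration)
    return next_action, next_aspiration
```

With a payoff above the aspiration the action is repeated and the aspiration rises; with a payoff below it the player switches to a uniformly random action and the aspiration falls, which is what drives the chain away from inefficient profiles.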
In this paper, we address distributed convergence to fair allocations of CPU resources for time-sensitive applications. We propose a novel resource management framework in which a centralized objective for fair allocations is decomposed into a pair of performance-driven recursive processes for updating: (a) the allocation of computing bandwidth to the applications (resource adaptation), executed by the resource manager, and (b) the service level of each application (service-level adaptation), executed by each application independently. We provide conditions under which the distributed recursive scheme converges to solutions of the centralized objective (i.e., fair allocations). Contrary to prior work on centralized optimization schemes, the proposed framework exhibits adaptivity and robustness to changes in both the number and the nature of applications, while assuming minimal information is available to both the applications and the resource manager. We finally validate our framework with simulations using the TrueTime toolbox in MATLAB/Simulink.
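The two coupled recursions can be illustrated with a toy Python sketch: the manager shifts normalized bandwidth shares toward applications whose reported performance lags the average, while each application independently nudges its service level toward a performance target. All function names, the performance measure, and the step sizes here are assumptions for illustration, not the paper's actual update laws.

```python
def resource_adaptation(alloc, perf, step=0.05):
    # Resource-manager recursion (hypothetical form): move bandwidth toward
    # below-average performers, then renormalize to the unit budget.
    avg = sum(perf) / len(perf)
    raw = [max(a + step * (avg - p), 1e-6) for a, p in zip(alloc, perf)]
    total = sum(raw)
    return [r / total for r in raw]

def service_level_adaptation(level, perf, target, step=0.05):
    # Per-application recursion: lower the service level when measured
    # performance falls short of the target, raise it otherwise.
    return max(0.0, level + step * (perf - target))
```

Running both updates in alternation mimics the decomposition in the abstract: neither side needs the other's internal model, only the exchanged performance measurements.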
The management of resources among competing QoS-aware applications is often handled by a resource manager (RM) that assigns both the resources and the application service levels. However, this approach requires all applications to inform the RM of their available service levels. The RM then has to maximize the 'overall quality' by comparing service levels of different applications, which are not necessarily comparable. In this paper we describe a Linux implementation of a game-theoretic framework that decouples the two distinct problems of resource assignment and quality setting, solving each in the domain to which it naturally belongs. With this approach the RM has linear time complexity in the number of applications. Our RM is built on top of the SCHED_DEADLINE Linux scheduling class.