Aleksander Czechowski scite author profile

Aleksander Czechowski

5 Publications

11 Citation Statements Received

183 Citation Statements Given

Affiliations

Delft University of Technology, Jagiellonian University

Publications

Decentralized MCTS via Learned Teammate Models

Czechowski

Oliehoek

2020

Decentralized online planning can be an attractive paradigm for cooperative multi-agent systems, due to improved scalability and robustness. A key difficulty of such an approach lies in making accurate predictions about the decisions of other agents. In this paper, we present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search, combined with models of teammates learned from previous episodic runs. By only allowing one agent to adapt its models at a time, under the assumption of ideal policy approximation, successive iterations of our method are guaranteed to improve joint policies, and eventually lead to convergence to a Nash equilibrium. We test the efficiency of the algorithm by performing experiments in several scenarios of the spatial task allocation environment introduced in [Claes et al., 2015]. We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators which exploit the spatial features of the problem, and that the proposed algorithm improves over the baseline planning performance for particularly challenging domain configurations.

Existence of Periodic Solutions of the FitzHugh--Nagumo Equations for an Explicit Range of the Small Parameter

Czechowski

Zgliczyński

2016

SIAM J. Appl. Dyn. Syst.

The FitzHugh-Nagumo model describing propagation of nerve impulses in axon is given by fast-slow reaction-diffusion equations, with dependence on a parameter ε representing the ratio of time scales. It is well known that for all sufficiently small ε > 0 the system possesses a periodic traveling wave. With aid of computer-assisted rigorous computations, we prove the existence of this periodic orbit in the traveling wave equation for an explicit range ε ∈ (0, 0.0015]. Our approach is based on a novel method of combination of topological techniques of covering relations and isolating segments, for which we provide a self-contained theory. We show that the range of existence is wide enough, so the upper bound can be reached by standard validated continuation procedures. In particular, for the range ε ∈ [1.5 × 10⁻⁴, 0.0015] we perform a rigorous continuation based on covering relations and not specifically tailored to the fast-slow setting. Moreover, we confirm that for ε = 0.0015 the classical interval Newton-Moore method applied to a sequence of Poincaré maps already succeeds. Techniques described in this paper can be adapted to other fast-slow systems of similar structure.

Non-Chaotic Limit Sets in Multi-Agent Learning

Czechowski

Piliouras

2022

Preprint

Non-convergence is an inherent aspect of adaptive multi-agent systems, and even basic learning models, such as the replicator dynamics, are not guaranteed to equilibrate. Limit cycles, and even more complicated chaotic sets, are in fact possible even in rather simple games, including variants of the Rock-Paper-Scissors game. A key challenge of multi-agent learning theory lies in characterization of these limit sets, based on qualitative features of the underlying game. Although chaotic behavior in learning dynamics can be precluded by the celebrated Poincaré-Bendixson theorem, it is only applicable directly to low-dimensional settings. In this work, we attempt to find other characteristics of a game that can force regularity in the limit sets of learning. We show that behavior consistent with the Poincaré-Bendixson theorem (limit cycles, but no chaotic attractor) follows purely from the topological structure of the interaction graph, even for high-dimensional settings with an arbitrary number of players, and arbitrary payoff matrices. We prove our result for a wide class of follow-the-regularized-leader (FoReL) dynamics, which generalize replicator dynamics, for binary games characterized by interaction graphs where the payoffs of each player are only affected by one other player (i.e., interaction graphs of indegree one). Moreover, we provide simple conditions under which such behavior translates into efficiency guarantees, implying that FoReL learning achieves a time-averaged sum of payoffs at least as good as that of a Nash equilibrium, thereby connecting the topology of the dynamics to social-welfare analysis.

Influence-aware memory architectures for deep reinforcement learning in POMDPs

Suau

Congeduti

et al. 2022

Neural Comput & Applic

Due to its perceptual limitations, an agent may have too little information about the environment to act optimally. In such cases, it is important to keep track of the action-observation history to uncover hidden state information. Recent deep reinforcement learning methods use recurrent neural networks (RNN) to memorize past observations. However, these models are expensive to train and have convergence difficulties, especially when dealing with high dimensional data. In this paper, we propose influence-aware memory, a theoretically inspired memory architecture that alleviates the training difficulties by restricting the input of the recurrent layers to those variables that influence the hidden state information. Moreover, as opposed to standard RNNs, in which every piece of information used for estimating Q values is inevitably fed back into the network for the next prediction, our model allows information to flow without being necessarily stored in the RNN’s internal memory. Results indicate that, by letting the recurrent layers focus on a small fraction of the observation variables while processing the rest of the information with a feedforward neural network, we can outperform standard recurrent architectures both in training speed and policy performance. This approach also reduces runtime and obtains better scores than methods that stack multiple observations to remove partial observability.

Influence-aware Memory Architectures for Deep Reinforcement Learning

Suau

He

Congeduti

et al. 2019

Preprint
