2022
DOI: 10.48550/arxiv.2203.10443
Preprint
Quantum Multi-Agent Reinforcement Learning via Variational Quantum Circuit Design

Cited by 3 publications (5 citation statements) | References: 0 publications
“…The small number of trainable parameters in the RL/MARL regime acts as a vulnerability for the classical neural network. In [30], [31], it is proven that QRL and QMARL can achieve performance similar to classical RL/MARL. In the results of this paper, the classical neural network with fewer parameters yields lower performance for two reasons: 1) index embedding on the observation, and 2) the parameter-shared policy.…”
Section: F. Discussion
confidence: 96%
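As context for the excerpt above, the setup it refers to pairs a single policy network shared by all agents with an agent-index embedding appended to each observation. The sketch below is a minimal illustration of that idea, assuming a PyTorch setup; the class name, dimensions, and layer sizes are hypothetical and not taken from the cited papers.

```python
import torch
import torch.nn as nn

class SharedIndexedPolicy(nn.Module):
    """Minimal sketch: one policy network shared across agents, with the
    agent index embedded and concatenated to the observation.
    Hypothetical names and sizes, not the authors' implementation."""

    def __init__(self, obs_dim: int, n_agents: int, n_actions: int,
                 idx_dim: int = 4, hidden: int = 32):
        super().__init__()
        self.idx_embed = nn.Embedding(n_agents, idx_dim)  # index embedding on observation
        self.net = nn.Sequential(                          # parameters shared by all agents
            nn.Linear(obs_dim + idx_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor, agent_idx: torch.Tensor) -> torch.Tensor:
        # obs: (batch, obs_dim); agent_idx: (batch,) long tensor of agent ids
        x = torch.cat([obs, self.idx_embed(agent_idx)], dim=-1)
        return torch.softmax(self.net(x), dim=-1)           # per-agent action distribution

# Usage: two agents drawing actions from the same small shared parameter budget
policy = SharedIndexedPolicy(obs_dim=8, n_agents=2, n_actions=4)
probs = policy(torch.randn(2, 8), torch.tensor([0, 1]))
```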
“…The quantum circuit uses a three-layer structure: single-qubit rotation data encoding, variational layers with entangling gates, and measurement. The paper applies QRL to multi-agent problems and extends the original proposal on quantum CTDE by Yun et al. [Yun+22]. An additional step is introduced in the training procedure, resulting in a meta-learning approach.…”
Section: Algorithmic
confidence: 94%
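For readers unfamiliar with that circuit layout, the sketch below illustrates the general pattern the excerpt describes: single-qubit rotation data encoding, trainable variational layers with entangling gates, and measurement of expectation values. It is a minimal sketch assuming PennyLane; the gate choices, qubit count, and layer count are assumptions, not the circuit from [Yun+22].

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits, n_layers = 4, 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def vqc(obs, weights):
    # 1) Data encoding: one single-qubit rotation per observation feature
    for i in range(n_qubits):
        qml.RY(obs[i], wires=i)
    # 2) Variational layers: trainable rotations plus an entangling CNOT ring
    for layer in range(n_layers):
        for i in range(n_qubits):
            qml.Rot(*weights[layer, i], wires=i)
        for i in range(n_qubits):
            qml.CNOT(wires=[i, (i + 1) % n_qubits])
    # 3) Measurement: Pauli-Z expectation values feeding the policy/value head
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

obs = np.random.uniform(0, np.pi, n_qubits)
weights = np.random.uniform(0, 2 * np.pi, (n_layers, n_qubits, 3), requires_grad=True)
print(vqc(obs, weights))
```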
“…It is not completely clear whether this two-step training procedure is beneficial in a general setup (see [Yun+22]).…”
Section: Algorithmic
confidence: 99%
“…Several improved quantum policy gradient algorithms have been proposed, including actor-critic [24], soft actor-critic (SAC) [9], [25], deep deterministic policy gradient (DDPG) [26], quantum asynchronous advantage actor-critic (A3C) [27], and generative adversarial RL [28], aiming to enhance the efficiency and effectiveness of QRL methods. QRL has found applications in quantum control [29] and has been extended to multi-agent settings with promising results [30]–[32]. Chen et al. [10] explored the use of evolutionary optimization in QRL, employing parallel training and selection of the best-performing agents as parents for the next generation.…”
Section: Related Work
confidence: 99%