2020
DOI: 10.1103/physrevresearch.2.033446
Reinforcement-learning-assisted quantum optimization

Abstract: We propose a reinforcement learning (RL) scheme for feedback quantum control within the quantum approximate optimization algorithm (QAOA). We reformulate the QAOA variational minimization as a learning task, where an RL agent chooses the control parameters for the unitaries, given partial information on the system. Such an RL scheme finds a policy converging to the optimal adiabatic solution of the quantum Ising chain that can also be successfully transferred between systems with different sizes, even in the p…
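As a rough illustration of the idea in the abstract, the sketch below runs a depth-2 QAOA circuit on a 3-spin Ising chain and tunes the angles with a simple Gaussian-policy REINFORCE loop. This is only a minimal sketch, not the paper's scheme: the paper's agent acts on partial observations of the system, and the qubit count, depth, learning rate, and helper names (`qaoa_energy`, `diag_zz`) are illustrative assumptions.

```python
import numpy as np

# --- tiny QAOA simulator for a 3-qubit Ising chain (illustrative only) ---
n = 3                                    # number of spins
rng = np.random.default_rng(0)

# Cost Hamiltonian H_C = -sum_i Z_i Z_{i+1} is diagonal in the computational basis.
z = np.array([1.0, -1.0])
def diag_zz(i, j):
    d = np.ones(1)
    for k in range(n):
        d = np.kron(d, z if k in (i, j) else np.ones(2))
    return d
hc_diag = -sum(diag_zz(i, i + 1) for i in range(n - 1))

def qaoa_energy(angles):
    """Apply p layers of exp(-i*beta*B) exp(-i*gamma*H_C) to |+>^n and return <H_C>."""
    gammas, betas = np.split(np.asarray(angles), 2)
    psi = np.ones(2 ** n, dtype=complex) / np.sqrt(2 ** n)
    for gamma, beta in zip(gammas, betas):
        psi = np.exp(-1j * gamma * hc_diag) * psi          # diagonal cost layer
        # mixer layer: product of single-qubit exp(-i*beta*X) rotations
        rx = np.array([[np.cos(beta), -1j * np.sin(beta)],
                       [-1j * np.sin(beta), np.cos(beta)]])
        U = np.array([[1.0]])
        for _ in range(n):
            U = np.kron(U, rx)
        psi = U @ psi
    return float(np.real(psi.conj() @ (hc_diag * psi)))

# --- stand-in "RL agent": REINFORCE on a Gaussian policy over the 2p angles ---
p = 2                                    # QAOA depth (hypothetical choice)
mean, sigma, lr = np.zeros(2 * p), 0.3, 0.05
for step in range(300):
    samples = mean + sigma * rng.standard_normal((16, 2 * p))   # 16 episodes
    rewards = np.array([-qaoa_energy(s) for s in samples])      # reward = -energy
    baseline = rewards.mean()
    grad = ((rewards - baseline)[:, None] * (samples - mean)).mean(0) / sigma ** 2
    mean += lr * grad                                            # policy-gradient ascent

print("final energy:", qaoa_energy(mean), "ground-state energy:", hc_diag.min())
```

The reward is taken as the negative energy, so ascending the reward drives the variational state toward the Ising ground state; any RL method over the angle sequence could be substituted for this basic policy-gradient update.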

Cited by 65 publications (43 citation statements)
References 31 publications
“…Recent work has improved the original QAOA, for instance, by aggregating only the best sampled candidate solutions [15] and carefully choosing the mixer operator to improve convergence [26][27][28][29], empirically. Reinforcement learning [30,31], multi-start methods [32], and local optimization [33] help navigate the QAOA optimization landscape. Algorithms such as the Hamiltonian Variational Ansatz produce optimization landscapes that are easier to navigate [34].…”
Section: Introduction (mentioning)
confidence: 99%
“…In the learning process, we use a reward setting that reflects the AQC performance on the most difficult factorization instances. Using a soft-actor-critic RL method, we find the learning process has an astonishing convergence speed: it converges within only a few hundred measurement steps, significantly faster as compared to previous studies using RL for quantum state preparation [22], for parameter configuration in quantum approximate optimization [23], and for adiabatic quantum algorithm design [24], which take about 10^4 to 10^6 measurement steps. The configured AQC algorithm produces an improved success probability, more evenly distributed over different factorization instances as compared with the un-configured algorithm.…”
(mentioning)
confidence: 77%
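The "reward setting that reflects the AQC performance on the most difficult factorization instances" quoted above suggests a worst-case objective. A minimal hedged sketch of that idea follows; the function names and signature are invented for illustration and are not taken from the cited work.

```python
# Illustrative worst-case reward: score a candidate annealing schedule by its
# success probability on the hardest instance in a batch, not on the average.
# `estimate_success_prob` is a hypothetical stand-in for running the configured
# AQC schedule on one factorization instance and estimating how often it succeeds.
def worst_case_reward(schedule_params, instances, estimate_success_prob):
    return min(estimate_success_prob(schedule_params, inst) for inst in instances)
```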
“…The variational parameters are optimized using a classical optimization routine. While there are many options for this procedure [34][35][36][37], all of them seek to obtain an approximate solution to the optimization problem by maximizing…”
Section: Quantum Approximate Optimization Algorithm (mentioning)
confidence: 99%
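The quantity left implicit by the truncated quotation above is, in the standard QAOA formulation, the expectation value of the cost operator in the variational state; written in standard notation (not copied from the cited paper):

```latex
F_p(\boldsymbol{\gamma}, \boldsymbol{\beta})
  = \langle \boldsymbol{\gamma}, \boldsymbol{\beta} \,|\, C \,|\, \boldsymbol{\gamma}, \boldsymbol{\beta} \rangle ,
\qquad
|\boldsymbol{\gamma}, \boldsymbol{\beta}\rangle
  = e^{-i\beta_p B}\, e^{-i\gamma_p C} \cdots e^{-i\beta_1 B}\, e^{-i\gamma_1 C}\, |{+}\rangle^{\otimes n},
```

where C is the cost operator, B is the mixer, and the 2p angles are tuned by the classical outer loop to maximize F_p.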