“…Generally, RL is used to address sequential decision-making problems, in which an abstract agent learns an optimal decision strategy by iteratively interacting with the environment without any prior knowledge. RL has been widely applied to QIP over the past several years, including quantum state preparation, quantum gate engineering, quantum metrology, quantum communication, quantum heat engines, quantum state transfer, quantum state ansatz, and quantum control. At present, deep RL, which combines RL with deep neural networks (DNNs), is more popular than classic RL. It takes full advantage of the powerful representation capability of DNNs and can efficiently deal with challenging high-dimensional optimization problems. In earlier work, deep RL was demonstrated to be superior to traditional optimal control methods for manipulating and controlling multilevel dissipative quantum systems.…”