Reinforcement learning involves decision-making in dynamic and uncertain environments and constitutes a crucial element of artificial intelligence. In our previous work, we experimentally demonstrated that the ultrafast chaotic oscillatory dynamics of lasers can be used to efficiently solve the two-armed bandit problem, which requires decision-making concerning a class of difficult trade-offs called the exploration–exploitation dilemma. However, only two selections were employed in that research; hence, the scalability of laser-chaos-based reinforcement learning remained to be clarified. In this study, we demonstrate a scalable, pipelined principle for resolving the multi-armed bandit problem by introducing time-division multiplexing of chaotically oscillating ultrafast time series. We present experimental demonstrations in which bandit problems with up to 64 arms were successfully solved and in which laser chaos time series significantly outperform quasiperiodic signals, computer-generated pseudorandom numbers, and coloured noise. Detailed analyses are also provided, including performance comparisons among laser chaos signals generated under different physical conditions, the results of which coincide with the diffusivity inherent in the time series. This study paves the way for ultrafast reinforcement learning by taking advantage of the ultrahigh bandwidths of light waves and practical enabling technologies.
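To make the time-division multiplexing idea above concrete, the following is a minimal sketch and not the authors' implementation. It assumes that an arm among 2^M candidates is identified by M bits, that each bit is decided by comparing one of M successive samples of a signal against a threshold attached to the bits chosen so far, and that the thresholds on the chosen path are nudged after each play. The function name `play_tdm_bandit`, the parameters `alpha`, `delta`, and `omega`, the update rule, and the use of Gaussian pseudorandom numbers as a stand-in for sampled laser chaos are all illustrative assumptions.

```python
import random


def play_tdm_bandit(probs, steps, alpha=0.99, delta=1.0, omega=1.0, seed=0):
    """Hypothetical sketch: pick one of 2**M arms with M successive samples.

    probs -- hit probability of each arm (length must be a power of two, >= 2)
    alpha -- assumed memory (forgetting) factor applied to every threshold update
    delta, omega -- assumed threshold increments after a win / after a loss
    Returns the fraction of plays that selected the best arm.
    """
    n_bits = (len(probs) - 1).bit_length()
    if len(probs) < 2 or len(probs) != 2 ** n_bits:
        raise ValueError("number of arms must be a power of two, at least 2")

    rng = random.Random(seed)
    thresholds = {}  # bit prefix chosen so far -> threshold for the next bit
    best = max(range(len(probs)), key=probs.__getitem__)
    hits = 0

    for _ in range(steps):
        # Decide the arm bit by bit, consuming one signal sample per bit.
        prefix = ()
        arm = 0
        for _ in range(n_bits):
            sample = rng.gauss(0.0, 1.0)  # stand-in for a chaotic laser sample
            bit = 0 if sample >= thresholds.get(prefix, 0.0) else 1
            prefix += (bit,)
            arm = 2 * arm + bit

        win = rng.random() < probs[arm]

        # Update only the thresholds on the path that produced this arm.
        # Lowering a threshold makes bit 0 more likely; raising it favors bit 1.
        magnitude = delta if win else omega
        for depth in range(n_bits):
            node, bit = prefix[:depth], prefix[depth]
            sign = -1.0 if (bit == 0) == win else 1.0
            thresholds[node] = alpha * thresholds.get(node, 0.0) + sign * magnitude

        hits += arm == best

    return hits / steps
```

For example, `play_tdm_bandit([0.3] * 5 + [0.9] + [0.3] * 2, 3000)` runs an 8-arm problem in which arm 5 is the best choice. With 64 arms the same code uses six samples per decision, which is the sense in which the scheme scales by multiplexing one time series rather than adding signal sources.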
We investigate the effect of a memory parameter on the performance of adaptive decision making using a tug-of-war method with the chaotic oscillatory dynamics of a semiconductor laser. We experimentally generate chaotic temporal waveforms of the semiconductor laser with optical feedback and apply them for adaptive decision making in solving a multiarmed bandit problem that aims at maximizing the total reward from slot machines whose hit probabilities are dynamically switched. We examine the dependence of making correct decisions on different values of the memory parameter. The degree of adaptivity is found to be enhanced with a smaller memory parameter, whereas the degree of convergence to the correct decision is higher for a larger memory parameter. The relations among the adaptivity, environmental changes, and the difficulties of the problem are also discussed considering the requirement of past decisions. This examination of ultrafast adaptive decision making highlights the importance of memorizing past events and paves the way for future photonic intelligence.
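The trade-off between adaptivity and convergence described above can be explored with a small simulation. This is a sketch under stated assumptions, not the procedure used in the paper: the decision compares a signal sample with a single threshold, the threshold is updated as `alpha * threshold + step` where `alpha` plays the role of the memory parameter, the increments `delta` and `omega` are fixed at 1, and Gaussian pseudorandom numbers stand in for the sampled chaotic laser waveform. The names `run_bandit` and `switching` are hypothetical.

```python
import random


def switching(period, p_hi=0.9, p_lo=0.1):
    """Return a schedule whose two hit probabilities swap every `period` plays."""
    def schedule(t):
        return (p_hi, p_lo) if (t // period) % 2 == 0 else (p_lo, p_hi)
    return schedule


def run_bandit(alpha, schedule, steps, delta=1.0, omega=1.0, seed=0):
    """Two-armed bandit with a threshold that forgets at rate `alpha`.

    schedule(t) must return the pair of hit probabilities at play t.
    Returns the fraction of plays that chose the currently better arm.
    """
    rng = random.Random(seed)
    threshold = 0.0
    correct = 0
    for t in range(steps):
        p = schedule(t)
        signal = rng.gauss(0.0, 1.0)  # stand-in for a chaotic laser sample
        arm = 0 if signal >= threshold else 1
        win = rng.random() < p[arm]

        # Lowering the threshold favors arm 0; raising it favors arm 1.
        if arm == 0:
            step = -delta if win else omega
        else:
            step = delta if win else -omega
        threshold = alpha * threshold + step  # alpha near 1 = long memory

        best = 0 if p[0] >= p[1] else 1
        correct += arm == best
    return correct / steps
```

Calling `run_bandit(0.9, switching(200), 2000)` and `run_bandit(0.999, switching(200), 2000)` compares a short and a long memory in an environment that swaps the better machine every 200 plays. In this toy model a long memory lets the threshold drift far from zero, so it takes many plays to recover after a swap, while a short memory keeps the threshold closer to zero and recovers sooner.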
Photonic technologies are promising for solving complex tasks in artificial intelligence. In this paper, we numerically investigate decision making for solving the multi-armed bandit problem using lag synchronization of chaos in a ring laser-network configuration. We construct a laser network consisting of unidirectionally coupled semiconductor lasers, in which spontaneous exchange of the leader-laggard relationship in the lag synchronization of chaos is observed. We succeed in solving the multi-armed bandit problem with three slot machines using lag synchronization of chaos by controlling the coupling strengths among the three lasers. Furthermore, we investigate the scalability of the proposed decision-making principle by increasing the number of slot machines and lasers. This study suggests a new direction in laser network-based decision making for future photonic intelligent functions.
Efficient and accurate decision making is gaining increased importance with the rapid expansion of information communication technologies including artificial intelligence. Here, we propose and experimentally demonstrate an on-chip, integrated photonic decision maker based on a ring laser. The ring laser exhibits spontaneous switching between clockwise and counter-clockwise oscillatory dynamics; we exploit this behavior to solve a multi-armed bandit problem. The spontaneous switching dynamics provide efficient exploration for finding the correct decision. On-line decision making is experimentally demonstrated, including autonomous adaptation to an uncertain environment. This study paves the way for directly utilizing the fluctuating physics inherent in ring lasers, or integrated photonics technologies in general, for achieving or accelerating intelligent functionality.