In-memory Reinforcement Learning with Moderately-Stochastic Conductance Switching of Ferroelectric Tunnel Junctions

Berdan, Radu; Marukame, Takao; Kabuyanagi, Shoichi; Ota, Ken‐ichiro; Saitoh, Masumi; Fujii, Shosuke; Deguchi, Jun; Nishi, Yoshifumi

doi:10.23919/vlsit.2019.8776500

Cited by 36 publications

(17 citation statements)

References 3 publications

“…[5][6][7] Alternatively, lower conductance nonlinear memristors, such ferroelectric based, have the potential for linear computation at ultralow currents towards 100 Gops/mW. [8] Optoelectronic memristors [9] have become promising candidates for artificial vision allowing temporary memory and real-time processing of visual information and sensory data. [6] However, challenges remain such as reliability, device-to-device variation, large-scale integration due to sophisticated fabrication and complex device architectures (rigid, costly), hindering memristive hardware from going mainstream.…”

Section: Introduction (mentioning)

confidence: 99%

Memristive perovskite solar cells towards parallel solar energy harvesting and processing-in-memory computing

et al. 2022

“…As important as the hardware, the exploration of algorithms accelerate the development of large scale arrays and applications. The algorithms with relative relaxation of the requirements on the conductance precision are more suitable for memristor-based neuromorphic computing, or even employing the conductance imprecision to avoid the over-fitting in ANNs [255,256] or optimize the reinforcement learning and represent complex parameters in Bayesian regularization neural networks. A better match between precision and speed requires the algorithm of ex situ-trained ANNs to make a good balance between the time and energy during the iterative…”

Section: Challenges Progress and Opportunities For Volatile And Nonvo... (mentioning)

confidence: 99%

Volatile and Nonvolatile Memristive Devices for Neuromorphic Computing

Zhou, Wang, Sun, et al. 2022

Adv Elect Materials

122

…state after operation an external stimulation (Figure 2f). As the increasing stimulation, the transition state entering to the metallic state leads to the Mott layer with low resistance state (LRS) (Figure 2g). The insulator to metal transition is in a timescale of femtosecond and picosecond, [69] while from the opposite transition from the metal state to the insulator state…

Figure 3. Second-order memristor for temporal information simulation. a) Conception of the second-order memristor. Adapted with permission. [51] Copyright 2015, American Chemical Society. b) Schematic of an artificial neuron consisting of dendrites, soma, and axon constructed by the second-order memristor circuit. c) The temporal summation of excitatory postsynaptic currents (EPSCs) for the frequency-dependency conductance evolution of 2nd memristor. Adapted with permission. [52] Copyright 2018, Wiley. d) The second-order memristor networks consist of 128 inputs and 7 outputs for temporal learning simulation, from up to bottom denotes before training state with random weights and different learned motion speeds. Adapted with permission. [49] Copyright 2017, IEEE. e) Transient temperature evolution with Δt = 1 µs and Δt = 100 ns. Adapted with permission.

“…Multiple DNNs are used to observe the policy training procedure in the RL system. Moreover, a memristor-based reinforcement learning system is proposed in [19]. In [20], a 55nm time-domain mixed-signal (TD-MS) neuromorphic accelerator is proposed to perform the Q-Learning.…”

Section: Background a Reinforcement Learning (mentioning)

confidence: 99%

FARANE-Q: Fast Parallel and Pipeline Q-Learning Accelerator for Configurable Reinforcement Learning SoC

et al. 2023

This paper proposes a FAst paRAllel and pipeliNE Q-learning accelerator (FARANE-Q) for a configurable Reinforcement Learning (RL) algorithm that is implemented in a System on Chip (SoC). In order to overcome the challenges of a dynamic environment and increasing complexity, the proposed work offers flexibility, configurability, and scalability while maintaining computation speed and accuracy. The proposed method includes a HW/SW design methodology for the SoC architecture to achieve flexibility. Moreover, we also propose joint optimizations on algorithm, architecture and implementation in order to obtain optimum (high efficiency) performance, specifically in energy and area efficiency. Furthermore, we implemented the proposed design in a real-time Zynq Ultra96-V2 FPGA platform to evaluate the functionality with real use case of the smart navigation. Experimental results confirm that the proposed accelerator FARANE-Q outperforms state of the art works by achieving throughput up to 148.55 MSps. It corresponds to the energy efficiency of 1632.42 MSps/W per agent for 32-bit and 2465.42 MSps/W per agent for 16-bit FARANE-Q. Moreover, the proposed 16-bit FARANE-Q outperforms others in energy efficiency up to more than 2000×. The designed system also maintains the error accuracy less than 0.4% with optimized bit precision for more than 8 fraction bits. The proposed FARANE-Q also offers a speed up of processing time up to 1795× compared to embedded SW computation executed on ARM Zynq processor and 280× of computation of full software executed on i7 processor. Hence, the proposed work has the potential to be used for smart navigation, robotic control, and predictive maintenance.
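The abstract above ties Q-learning accuracy to the number of fraction bits used for the stored values. As a rough illustration of that trade-off only, the sketch below runs tabular Q-learning on a toy chain task and optionally rounds each updated Q-value to a fixed-point grid. It is not the FARANE-Q architecture: the task, the function names (`quantize`, `q_update`, `train_chain`), and the parameters `alpha`, `gamma`, and `frac_bits` are all assumptions made for this example.

```python
import random

def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits (a fixed-point grid)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

def q_update(q, s, a, r, s_next, alpha=0.5, gamma=0.9, frac_bits=None):
    """One tabular step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = r + gamma * max(q[s_next])
    new = q[s][a] + alpha * (target - q[s][a])
    # Optionally store the result with a limited number of fraction bits.
    q[s][a] = new if frac_bits is None else quantize(new, frac_bits)
    return q[s][a]

def train_chain(n_states=5, episodes=200, frac_bits=None, seed=0):
    """Hypothetical toy task: a chain of states, reward 1 on reaching the last one."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            a = rng.randrange(2)  # uniformly random exploration
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == n_states - 1 else 0.0
            q_update(q, s, a, r, s_next, frac_bits=frac_bits)
            s = s_next
    return q

if __name__ == "__main__":
    q_float = train_chain()
    q_fixed = train_chain(frac_bits=8)
    # With gamma = 0.9, the exact value of moving right from state 0 is 0.9**3.
    print(q_float[0][1], q_fixed[0][1], 0.9 ** 3)
```

Comparing `q_float` with `q_fixed` for different `frac_bits` values gives a feel for how rounding error accumulates along the chain; the numbers from this toy task should not be read as a reproduction of the error figures reported in the abstract.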
