Recently, deep learning has achieved substantial breakthroughs in fields such as speech recognition, image and video classification, and natural language processing. [1-3] The explosive development of deep learning has promoted the convergence of this field with other disciplines. This progress has benefited from updates and improvements to models and theories in computer science, as well as from advances in contemporary semiconductor chip technology. However, the limited bandwidth and computing resources of traditional computer systems greatly restrict execution speed as the scale of deep neural networks (DNNs) continues to grow. The traditional von Neumann architecture separates data storage from computing. Frequent and inefficient movement of data between the processor and memory or off-chip storage introduces latency and energy-consumption overheads, and the mismatch between data transmission and data processing becomes a bottleneck in the hardware implementation of deep learning.

Owing to the high-bandwidth and high-parallelism requirements of deep learning, data-intensive artificial intelligence (AI) applications have been dominated by cloud computing; that is, edge devices act as data-collecting interfaces and pass data to clustered cloud computing centers, where the deep learning computation is performed. [4] Such AI applications place high demands on network bandwidth and latency, and expose users to privacy-leakage risks. [5] For example, in areas with poor network coverage, Tesla's AI-based autonomous driving can become unreliable and even life-threatening. With the popularization of deep learning, efficient AI applications in daily life are becoming an urgent need.

Edge intelligence is a concept defined relative to cloud intelligence. [6] Edge computing requires real-time intelligence on devices with strict energy and area budgets, such as smart watches and drones.
It pushes cloud services from the network core to the network edge, closer to Internet-of-Things (IoT) devices and data sources, thereby building an end-to-end network. Physical proximity to the information-generation sources is the most crucial characteristic emphasized by edge computing; consequently, high energy efficiency, small size, low latency, and strong privacy protection become valued attributes of edge intelligence. [7-9] With the combination of hardware and AI, devices dedicated to deep learning have emerged; these devices are called neural network accelerators. The combination of traditional complementary metal-oxide-semiconductor (CMOS) technology and emerging nonvolatile memory provides a wealth of possibilities for AI accelerators. [10-13] The use of memory technology as a synaptic weight-matrix storage unit has laid a foundation for the hardware implementation of neuromorphic computing systems. In some prominent AI chips, traditional memories have been utilized; for example,