Deep learning and artificial intelligence (AI) are now leading the way in improving both computing technology and human life. [1] The rapid advances in neural network topologies and learning algorithms in recent years have been astonishing. [2][3][4][5] However, traditional computing hardware has not kept pace with the high computational demands of AI processing, which has inspired the development of dedicated hardware such as the general-purpose graphics processing unit [6] and the tensor processing unit for AI acceleration. [7] Traditional von Neumann architectures are not ideal for data-intensive AI processing because the frequent data transfer between the central processing unit and off-chip memory, e.g., dynamic random-access memory (DRAM), significantly increases both energy consumption and latency. [8] In contrast, neuromorphic computing leverages fundamental knowledge from neuroscience to design better computing hardware that mimics both the synapse and neuron operations of the inherently low-power human brain. [9] An artificial synapse should provide an adjustable and persistent weight value. Popular candidates are therefore various emerging memories, such as resistive-switching memory, phase-change memory (PCM), and magnetoresistive random-access memory (MRAM), arranged in a crossbar array configuration; their implementations in neuromorphic computing systems have been reported widely. [10][11][12][13][14][15][16][17][18] The primary function of the artificial neuron is to integrate and process signals from the synaptic array and then convey excitatory spike signals to the next neural layer as inputs. [19] Many complementary metal-oxide-semiconductor (CMOS)-based neuron circuits have been demonstrated previously. [19][20][21][22][23] Integrating memory-based synaptic arrays with CMOS neuron circuits enables a low-power, highly parallel analog neuromorphic system. 
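The division of labor described above can be illustrated with a minimal numerical sketch. The following is not from the present work; it is a hypothetical model in which the crossbar array is reduced to a conductance matrix `G` performing an analog vector-matrix multiply, and the neurons are modeled as leaky integrate-and-fire (LIF) units. All parameter values (`V_TH`, `LEAK`, the entries of `G`) are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of the two building blocks: a memristive crossbar,
# modeled as a conductance matrix G, performs an analog vector-matrix
# multiply, and leaky integrate-and-fire (LIF) neurons integrate the
# resulting column currents and emit spikes.

G = np.array([[0.5, 0.2],      # synaptic conductances (weights):
              [0.3, 0.4],      # 3 input rows x 2 output columns
              [0.2, 0.6]])

v = np.zeros(2)                # membrane potentials of the 2 neurons
V_TH = 2.0                     # firing threshold (arbitrary units)
LEAK = 0.9                     # leak factor per time step

def step(x):
    """One time step: integrate crossbar currents, fire, and reset."""
    global v
    i_syn = x @ G              # column currents: Kirchhoff sum of V_in * G
    v = LEAK * v + i_syn       # leaky integration
    spikes = v >= V_TH         # threshold comparison
    v[spikes] = 0.0            # reset the fired neurons
    return spikes.astype(int)

x = np.ones(3)                 # constant binary input spikes
outputs = [step(x) for _ in range(3)]
# outputs -> [[0, 0], [0, 1], [1, 0]]: each neuron fires once its
# integrated potential crosses V_TH, then resets
```

The integration, threshold comparison, and reset in `step` are exactly the functions that, in hardware, consume the transistor count and capacitor area discussed next.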
[24,25] However, CMOS neuron circuits account for a large fraction of the total chip area because the integration and reset functions that generate spiking signals require a large number of transistors and large-area capacitors. [25] To improve the area efficiency, and thus the density, of next-generation neuromorphic hardware, several spiking neuron devices have been proposed using a variety of technologies, such as MRAM, [26][27][28] the ferroelectric field-effect transistor (FeFET), [29][30][31][32] the threshold-switching (TS) device, [33][34][35][36][37][38][39] the silicon-on-insulator MOSFET (SOI-MOSFET), [40,41] and PCM. [42] Several functional blocks in CMOS neuron circuits could be replaced by a single device with equivalent functionality by leveraging intrinsic device characteristics; as a result, the area of the overall spiking neuron circuit could be reduced effectively. In addition to biologically inspired analog neuromorphic systems, a mixed analog-digital hardware implementation, the so-called in-memory computing (IMC), for accelerating AI based on the fundamental concepts of both neuromorphic computing (e.g., the synap...