It has been almost a decade since FinFET devices were introduced to full production; they allowed scaling below 20 nm, thus helping to extend Moore's law by a precious decade with another decade likely in the future when scaling to 5 nm and below. Due to superior electrical parameters and unique structure, these 3-D transistors offer significant performance improvements and power reduction compared to planar CMOS devices. As we are entering into the sub-10 nm era, FinFETs have become dominant in most of the high-end products; as the transition from planar to FinFET technologies is still ongoing, it is important for digital circuit designers to understand the challenges and opportunities brought in by the new technology characteristics. In this paper, we study these aspects from the device to the circuit level, and we make detailed comparisons across multiple technology nodes ranging from conventional bulk to advanced planar technology nodes such as Fully Depleted Silicon-on-Insulator (FDSOI), to FinFETs. In the simulations we used both state-of-art industry-standard models for current nodes, and also predictive models for future nodes. Our study shows that besides the performance and power benefits, FinFET devices show significant reduction of short-channel effects and extremely low leakage, and many of the electrical characteristics are close to ideal as in old long-channel technology nodes; FinFETs seem to have put scaling back on track! However, the combination of the new device structures, double/multi-patterning, many more complex rules, and unique thermal/reliability behaviors are creating new technical challenges. Moving forward, FinFETs still offer a bright future and are an indispensable technology for a wide range of applications from high-end performance-critical computing to energy-constraint mobile applications and smart Internet-of-Things (IoT) devices.
We propose a novel architecture to enable low-power, complex on-node data processing, for the next generation of sensors for the internet of things (IoT), smartdust, or edge intelligence. Our architecture combines near-analog-memory-computing (NAM) and asynchronous-computing-with-streams (ACS), eliminating the need for ADCs. ACS enables ultra-low power, massive computational resources required to execute on-node complex Machine Learning (ML) algorithms; while NAM addresses the memory-wall that represents a common bottleneck for ML and other complex functions. In ACS an analog value is mapped to an asynchronous stream that can take one of two logic levels ( v h , v l ). This stream-based data representation enables area/power-efficient computing units such as a multiplier implemented as an AND gate yielding savings in power of ∼90% compared to digital approaches. The generation of streams for NAM and ACS in a brute force manner, using analog-to-digital-converters (ADCs) and digital-to-streams-converters, would sky-rocket the power-latency-energy cost making the approach impractical. Our NAM-ACS architecture eliminates expensive conversions, enabling an end-to-end processing on asynchronous streams data-path. We tailor the NAM-ACS architecture for random forest (RaF), an ML algorithm, chosen for its ability to classify using a reduced number of features. Simulations show that our NAM-ACS architecture enables 75% of savings in power compared with a single ADC, obtaining a classification accuracy of 85% using an RaF-inspired algorithm.
Superconducting technology is a prime candidate for the future of computing. However, current superconducting prototypes are limited to small-scale examples due to stringent area constraints and complex architectures inspired from voltage-level encoding in CMOS; this is at odds with the 𝑝𝑠-wide Single Quantum Flux (SFQ) pulses used in superconductors to carry information. In this work, we propose a wave-pipelined Unary SFQ (U-SFQ) architecture that leverages the advantages of two data representations: pulse-streams and Race Logic (RL). We introduce novel building blocks such as multipliers, adders, and memory cells, which leverage the natural properties of SFQ pulses to mitigate area constraints. We then design and simulate three popular hardware accelerators: i) a Processing Element (PE), typically used in spatial architectures; ii) A dot-product-unit (DPU), one of the most popular accelerators in artificial neural networks and digital signal processing (DSP); and iii) A Finite Impulse Response (FIR) filter, a popular and computationally demanding DSP accelerator. The proposed U-SFQ building blocks require up to 200× fewer JJs compared to their SFQ binary counterparts, exposing an area-delay trade-off. This work mitigates the stringent area constraints of superconducting technology. CCS CONCEPTS• Hardware → Emerging technologies; • Theory of computation → Modal and temporal logics; • Computer systems organization → Architectures.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.