Convolutional neural networks are paramount in image and signal processing and account for the majority of image-recognition power consumption today, concentrated mainly in convolution computations. Because convolution operations are computationally intensive, next-generation hardware accelerators must offer parallelization and high efficiency. Diffractive optics offers the promise of low-latency, highly parallel convolution operations. Thus far, however, this parallelism has been only partially harvested, leaving such systems significantly short of their throughput potential. Here, a high-throughput Fourier-optic convolutional accelerator with parallelized operation is demonstrated. For the first time, simultaneous processing of multiple kernels in the Fourier domain, enabled by optical diffraction orders, is achieved alongside input parallelism. The proposed approach can offer a ≈100× speedup over the previous generation of optical diffraction-based processors and a 10× speedup over other state-of-the-art optical Fourier classifiers.
Decision-making through artificial neural networks with minimal latency is critical for numerous applications such as navigation, tracking, and real-time machine-action systems. This requires machine learning hardware to process multidimensional data at high throughput. Unfortunately, convolution operations, the primary computational tool for data classification tasks, obey challenging runtime-complexity scaling laws. However, homomorphically implementing the convolution theorem in a Fourier-optics display light processor can achieve a non-iterative O(1) runtime complexity for input matrices larger than 1,000 × 1,000. Following this approach, here we demonstrate data-streaming multi-kernel image batching using a Fourier Convolutional Neural Network (FCNN) accelerator. We show image batch processing of large-scale matrices comprising 2 million dot-product multiplications performed by a digital light processing module in the Fourier domain. We further parallelize this optical FCNN system by exploiting multiple spatially parallel diffraction orders, achieving a 98× throughput improvement over state-of-the-art FCNN accelerators. A comprehensive discussion of the practical challenges associated with working at the edge of system capabilities highlights the problem of crosstalk and resolution scaling laws in the Fourier domain. Accelerating convolution by exploiting the massive parallelism of display technology enables non-von Neumann machine learning acceleration.
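The convolution-theorem principle underlying these accelerators can be sketched digitally: convolution in the spatial domain equals element-wise multiplication in the Fourier domain. The minimal NumPy example below is illustrative only (function and variable names are our own, not from the papers); the optical systems perform the Fourier transforms with lenses in effectively constant time, whereas the digital FFT shown here costs O(N² log N).

```python
import numpy as np

def fft_convolve2d(image, kernel):
    """Circular 2D convolution via the convolution theorem (illustrative sketch)."""
    # Zero-pad the kernel to the image size so the two spectra align.
    padded = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Element-wise multiplication of spectra, then an inverse transform,
    # replaces the sliding-window sum of direct convolution.
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

rng = np.random.default_rng(0)
img = rng.random((8, 8))
ker = rng.random((3, 3))

# Verify one output pixel against the direct circular-convolution sum.
direct = sum(
    img[(3 - i) % 8, (4 - j) % 8] * ker[i, j]
    for i in range(3) for j in range(3)
)
assert np.isclose(fft_convolve2d(img, ker)[3, 4], direct)
```

In the optical implementation, the element-wise product in the Fourier plane is what the display light processor computes, and the multiple diffraction orders allow several such kernel multiplications to proceed in parallel.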
Neural networks have proven successful in many fields. Optical systems show potential for high-speed, low-power neural networks. However, optical alignment is very demanding for wavelength-scale coherent systems. Here we present Training-on-System methods that learn the characteristics of the imperfectly aligned system to improve its performance.