This paper describes a 3D computer architecture designed to achieve the lowest possible power consumption for "embedded applications" like radar and signal processing. It introduces several unique concepts including a low-power SIMD tile, low-power 3D memories, and 3D and 2.5D interconnect that is circuit switched so it can be tuned at run-time for a specific application. When conservatively projected to the 7 nm node, simulations of the architecture show potential for exceeding 75 GFLOPS/W, about 20x better than today's CPUs and GPUs. This translates to 13 pJ/FLOP. This paper will focus on the 3D specific aspects of the design. This architecture is highly suited to DSP and multimedia workflows.