This paper proposes a FAst paRAllel and pipeliNE Q-learning accelerator (FARANE-Q) for a configurable Reinforcement Learning (RL) algorithm implemented in a System on Chip (SoC). To cope with dynamic environments and increasing complexity, the proposed work offers flexibility, configurability, and scalability while maintaining computation speed and accuracy. Flexibility is achieved through a HW/SW design methodology for the SoC architecture. We also propose joint optimizations of the algorithm, architecture, and implementation to obtain high efficiency, specifically in energy and area. Furthermore, we implemented the proposed design on a Zynq Ultra96-V2 FPGA platform and evaluated its functionality in real time with a smart navigation use case. Experimental results confirm that FARANE-Q outperforms state-of-the-art works, achieving a throughput of up to 148.55 MSps. This corresponds to an energy efficiency of 1632.42 MSps/W per agent for the 32-bit FARANE-Q and 2465.42 MSps/W per agent for the 16-bit FARANE-Q. In the best case, the 16-bit FARANE-Q improves energy efficiency over other works by more than 2000×. The designed system also keeps the error below 0.4% when the optimized bit precision uses more than 8 fraction bits. In processing time, FARANE-Q achieves a speedup of up to 1795× over embedded SW computation executed on the ARM processor of the Zynq device and 280× over a full-software implementation executed on an i7 processor. Hence, the proposed work has the potential to be used for smart navigation, robotic control, and predictive maintenance.