This study proposes a highly effective data analytics approach to prevent cyber fraud on automatic speaker verification systems by classifying histograms of genuine and spoofed voice recordings. Our deep learning-based lightweight architecture advances the application of fake voice detection on embedded systems. It sets a new benchmark with a balanced accuracy of 95.64% and an equal error rate of 4.43%, contributing to adopting artificial intelligence technologies in organizational systems and technologies. As fake voice-related fraud causes monetary damage and serious privacy concerns for various applications, our approach improves the security of such services, being of high practical relevance. Furthermore, the post-hoc analysis of our results reveals that our model confirms image texture analysis-related findings of prior studies and discovers further voice signal features (i.e., textural and contextual) that can advance future work in this field.