Vortex lattice and panel methods belong to a broad family of aerodynamic codes based on potential flow theory. They are used for preliminary aerodynamic studies in the early stages of aircraft design, where hundreds of thousands of candidate configurations are analyzed. In this paper, we describe their efficient implementation on modern multi- and many-core architectures. We show how to bridge the 'ninja gap', defined as the performance gap between unoptimized C/C++ code and the best-optimized CPU code. We port the Vortex Lattice Method to a Graphics Processing Unit using the OpenACC standard. We also present an elegant solution for implementing data movements for C++ classes.
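
To make the last point concrete, the following is a minimal sketch of one common OpenACC idiom for moving a C++ class to the device. It is an illustration under stated assumptions, not necessarily the solution presented in this paper. The class name `DeviceVector` and its members are hypothetical. The constructor copies the object itself and then its dynamically allocated payload to the device, the destructor releases both, and explicit `update` calls synchronize the two copies. When compiled without OpenACC support, the pragmas are ignored and the code runs on the host.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container used only for illustration; not the paper's class.
class DeviceVector {
public:
    explicit DeviceVector(std::size_t n) : n_(n), data_(new double[n]) {
        for (std::size_t i = 0; i < n_; ++i) data_[i] = 0.0;
        // Copy the object (size and pointer) first, then the payload, so the
        // runtime can attach the device array to the device copy of data_.
        #pragma acc enter data copyin(this[0:1])
        #pragma acc enter data copyin(data_[0:n_])
    }

    ~DeviceVector() {
        // Release device memory in reverse order of creation.
        #pragma acc exit data delete(data_[0:n_])
        #pragma acc exit data delete(this[0:1])
        delete[] data_;
    }

    // Copying would duplicate the raw pointer, so it is disabled here.
    DeviceVector(const DeviceVector&) = delete;
    DeviceVector& operator=(const DeviceVector&) = delete;

    // Refresh the host copy from the device.
    void update_host() {
        #pragma acc update self(data_[0:n_])
    }

    // Refresh the device copy from the host.
    void update_device() {
        #pragma acc update device(data_[0:n_])
    }

    // Multiply every element by a; runs on the device when OpenACC is enabled.
    void scale(double a) {
        #pragma acc parallel loop present(data_[0:n_])
        for (std::size_t i = 0; i < n_; ++i) data_[i] *= a;
    }

    double* data() { return data_; }
    std::size_t size() const { return n_; }

private:
    std::size_t n_;
    double* data_;
};
```

A caller would fill the host array through `data()`, call `update_device()`, run device-side methods such as `scale()`, and call `update_host()` before reading results. Keeping the data directives inside the class means calling code never has to name the raw pointer in a pragma.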