This paper presents an efficient architecture for computing fast Fourier transforms (FFTs) of 128 points to 1M points, based on a mixed radix-2/4/8 butterfly unit. The proposed architecture reduces computation cost by exploiting the radix-8 FFT algorithm while remaining compatible with any sequence whose length is an integer power of 2. Two further optimizations target a reconfigurable application-specific processor. First, we propose a separated radix-2/4/8 butterfly unit, which is more flexible than a single integrated radix-2/4/8 butterfly unit. Second, for sequences longer than 256K points, we realize an efficient 2-epoch FFT solution. The architecture is implemented in a reconfigurable application-specific processor. Its computation time is 676 µs for a 128K-point FFT and 14.8 ms for a 1M-point FFT.
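
To illustrate how a mixed radix-2/4/8 scheme can cover every power-of-2 length, the following Python sketch factors a length N = 2^k into as many radix-8 stages as possible, plus at most one radix-2 or radix-4 stage for the remainder. This is only an assumed stage schedule used for illustration; the abstract does not specify the actual stage ordering, the hardware butterfly design, or the 2-epoch scheme, and the names `plan_stages` and `mixed_radix_fft` are hypothetical.

```python
import cmath


def plan_stages(n):
    """Factor n = 2**k into radix-8 stages plus at most one radix-2/4 stage.

    Assumed schedule for illustration only: radix-8 stages first, then the
    leftover radix-2 or radix-4 stage (if k is not a multiple of 3).
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of 2 and at least 2")
    k = n.bit_length() - 1            # k = log2(n)
    q, r = divmod(k, 3)               # q radix-8 stages, r leftover bits
    stages = [8] * q
    if r:
        stages.append(1 << r)         # r == 1 -> radix 2, r == 2 -> radix 4
    return stages


def mixed_radix_fft(x):
    """Recursive decimation-in-time FFT using the radices from plan_stages.

    A software reference model, not a description of the hardware datapath.
    """
    n = len(x)
    if n == 1:
        return list(x)
    r = plan_stages(n)[0]             # 8 when n >= 8, otherwise 2 or 4
    m = n // r
    # r interleaved sub-sequences, each transformed recursively.
    subs = [mixed_radix_fft(x[j::r]) for j in range(r)]
    out = [0j] * n
    for k in range(m):
        # Twiddle step: scale sub-transform j by W_n^(j*k).
        t = [subs[j][k] * cmath.exp(-2j * cmath.pi * j * k / n)
             for j in range(r)]
        # r-point butterfly, written as a plain r-point DFT for clarity.
        for p in range(r):
            out[k + m * p] = sum(
                t[j] * cmath.exp(-2j * cmath.pi * j * p / r)
                for j in range(r)
            )
    return out
```

Under this assumed schedule, `plan_stages(128)` gives `[8, 8, 2]` and `plan_stages(1 << 20)` gives six radix-8 stages followed by one radix-4 stage, which shows why a butterfly unit that supports all three radices can handle any power-of-2 length.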