As a universal language, English is receiving more and more attention, and oral English is an especially important part of learning it. This paper studies two key technologies, pronunciation error detection and pronunciation quality evaluation, and integrates them to build a model for evaluating the English pronunciation quality of L2 learners. Two different methods of pronunciation error detection are studied. Within a speech recognition framework, the standard score is compared with a threshold to judge whether each phoneme is pronounced correctly, and setting phoneme-dependent thresholds raises the maximum precision to 0.44. By discriminating between the correct pronunciation and confusable phonemes, the accuracy of pronunciation error detection is raised to 81.26%. For quality evaluation, the paper proposes a fusion algorithm that combines multiple dimensions, namely speech fluency and intonation, together with a newly designed feature called the word duration ratio, which significantly improves the correlation of pronunciation quality evaluation to 0.746.
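The threshold-based detection step summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the score convention (a higher score means a better match to the expected phoneme), the phoneme labels, the threshold values, and the function names `detect_errors` and `precision` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical values: the abstract gives no actual thresholds.
DEFAULT_THRESHOLD = -2.0
PHONE_THRESHOLDS: Dict[str, float] = {"AE": -1.5, "TH": -3.0}


def detect_errors(
    scored_phones: List[Tuple[str, float]],
    thresholds: Dict[str, float],
    default: float = DEFAULT_THRESHOLD,
) -> List[Tuple[str, bool]]:
    """Flag a phoneme as mispronounced when its score is below its threshold.

    Assumes higher scores mean better pronunciation. Phonemes without a
    dedicated threshold fall back to the shared default.
    """
    results = []
    for phone, score in scored_phones:
        threshold = thresholds.get(phone, default)
        results.append((phone, score < threshold))
    return results


def precision(predicted: List[bool], reference: List[bool]) -> float:
    """Fraction of flagged phonemes that are true mispronunciations."""
    true_positives = sum(1 for p, r in zip(predicted, reference) if p and r)
    flagged = sum(1 for p in predicted if p)
    return true_positives / flagged if flagged else 0.0


if __name__ == "__main__":
    scored = [("AE", -1.0), ("TH", -3.5), ("S", -2.5)]
    flags = detect_errors(scored, PHONE_THRESHOLDS)
    print(flags)
    print(precision([f for _, f in flags], [False, True, False]))
```

Using a per-phoneme lookup with a shared fallback reflects the idea that a single global threshold fits some phonemes poorly; how the thresholds are actually tuned is not specified in the abstract.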