“…Subjective evaluation was commonly applied to TTS systems, and the most frequently used method was the Mean Opinion Score (MOS) test [88], [92], [93], [94], [99], [103], [104], [105], [61], [106], [108], [110], [112], [114], [115], [117], [118], [119]. Categorical estimation (CE) tests, preference tests [102], [108], [117], the DMOS test [108], and DRT tests [87], [96], [116] have also been used for subjective evaluation in some studies. Most studies measured intelligibility [91], [94], [95], [96], [99], [103], [104], [105], [61], [106], [109], [110], [112], [115], [116], [118], [120] and naturalness [94], [95], [99], [101],…”