“…The goal of speaker recognition is to recognize a speaker from the characteristics of their voice (Bai, Zhang, & Chen, 0000; Poddar, Sahidullah, & Saha, 2017). Representing speaker properties in a low-dimensional feature space benefits many downstream tasks, and such compact representations used to distinguish speakers (speaker embeddings) have become an attractive research topic. They are widely used in speaker identification (Park, Cho, Park, Kim, & Park, 2018), verification (Le & Odobez, 2018; Novoselov, Shulipa, Kremnev, Kozlov, & Shchemelinin, 2018; Snyder, Garcia-Romero, Povey, & Khudanpur, 2017), detection (McLaren, Castan, Nandwana, Ferrer, & Yilmaz, 2018), segmentation (Garcia-Romero, Snyder, Sell, Povey, & McCree, 2017; Wang, Downey, Wan, Mansfield, & Moreno, 2018), and speaker-dependent speech enhancement (Chuang, Wang, Hung, Tsao, & Fang, 2019; Gao et al., 2015).…”