“…Our first dataset consists of a collection of scientific publication datasets, namely KP20K, INSPEC, KRAPIVIN, NUS, and SEMEVAL, that have been widely used in existing literature (Meng et al., 2017; Chen et al., 2018a; Ye and Wang, 2018; Chen et al., 2018b; Chan et al., 2019; Zhao and Zhang, 2019; Chen et al., 2019a; Sun et al., 2019). KP20K, for example, was introduced by Meng et al. (2017) and comprises more than half a million scientific publications.…”