“…With the development of large-scale summarization datasets such as CNN/DailyMail (Hermann et al., 2015), NYT (Sandhaus, 2008), Newsroom (Grusky et al., 2018) and XSum (Narayan et al., 2018a), along with advances in deep neural architectures, supervised neural methods that employ an encoder-decoder framework have become increasingly popular. These models have been proposed with extractive strategies (Cheng and Lapata, 2016; Nallapati et al., 2017; Wu and Hu, 2018; Dong et al., 2018; Zhou et al., 2018; Narayan et al., 2018b); abstractive strategies (See et al., 2017; Chen and Bansal, 2018; Gehrmann et al., 2018; Zhang et al., 2019a; Lewis et al., 2019); and hybrid strategies (Hsu et al., 2018; Bae et al., 2019; Moroshko et al., 2019).…”