“…Prior work focuses on extracting temporal relations between event pairs (a.k.a. TLINKS) present in the same sentence (Intra-sentence TLINKS) or adjacent sentences (Inter-sentence TLINKS), mostly ignoring document-level pairs (Cross-document TLINKS) (Reimers et al., 2016). Past work has used RNNs (Cheng and Miyao, 2017; Meng et al., 2017; Goyal and Durrett, 2019; Ning et al., 2019; Han et al., 2019a,c,b, 2020b) and Transformer networks (Ballesteros et al., 2020; Zhao et al., 2020b) to encode a few sentences or a short paragraph, but these models do not capture long-range dependencies and multi-hop reasoning at the document level. This shortcoming is shown in the TDDiscourse dataset (Naik et al., 2019), which was designed to highlight global discourse-level challenges, e.g., multi-hop chain reasoning, future or hypothetical events, and reasoning requiring world knowledge.…”