“…This is largely due to the more complex nature of event mentions (i.e., a trigger and arguments) and their syntactic diversity (e.g., both verb phrases and noun phrases). Prior work on event coreference typically involves pairwise scoring between mentions followed by a standard clustering algorithm to predict coreference links (Pandian et al., 2018; Choubey and Huang, 2017; Cremisini and Finlayson, 2020; Meged et al., 2020; Yu et al., 2020b; Cattan et al., 2020), classification over a fixed number of clusters (Kenyon-Dean et al., 2018), and template-based methods (Cybulska and Vossen, 2015b,a). While pairwise scoring (e.g., graph-based models, see §3.7) with clustering is effective, it requires tuned thresholds (for the clustering algorithm) and cannot use already predicted scores to inform later ones, since all scores are predicted independently.…”
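
The pairwise-scoring-then-clustering pipeline described in the passage can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `jaccard` token-overlap scorer is a toy stand-in for the learned pairwise scorers used in the cited work, `cluster_by_threshold` is a hypothetical helper that links every pair scoring at or above a cutoff and returns the connected components (one simple choice among many clustering algorithms), and the example mentions are invented. The point is only to show that each pair is scored independently and that the resulting clusters depend on the chosen threshold.

```python
from itertools import combinations


def jaccard(a, b):
    """Toy pairwise scorer (assumption): token overlap between two mention strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cluster_by_threshold(mentions, scorer, threshold):
    """Score every pair independently, then link pairs scoring >= threshold.

    Clusters are the connected components of the resulting link graph.
    No score is informed by any other score or by earlier linking decisions.
    """
    parent = list(range(len(mentions)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(mentions)), 2):
        if scorer(mentions[i], mentions[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i, mention in enumerate(mentions):
        clusters.setdefault(find(i), []).append(mention)
    return sorted(clusters.values())


# Invented example mentions, for illustration only.
mentions = [
    "earthquake struck Haiti",
    "the Haiti earthquake",
    "company acquired startup",
    "acquisition of the startup",
]

strict = cluster_by_threshold(mentions, jaccard, threshold=0.4)
loose = cluster_by_threshold(mentions, jaccard, threshold=0.15)
print(len(strict), "clusters at threshold 0.4")
print(len(loose), "clusters at threshold 0.15")
```

With these toy inputs, the stricter cutoff links only the two earthquake mentions, while the looser cutoff chains all four mentions together through the shared token "the". This is the kind of threshold sensitivity the passage refers to.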