“…Document-level Tagging Document-level tagging introduced more contextual features to improve the performance of tagging. Some early works introduced non-local information (Finkel et al, 2005;Krishnan and Manning, 2006) (Hu et al, 2020(Hu et al, , 2019, chemical NER (Luo et al, 2018), disease NER , and Chinese patent Xue, 2014, 2016). Compared with these works, instead of proposing a novel model, we focus on investigating when and why the larger-context training, as a general strategy, can work.…”