“…These datasets are based on news corpora and Wikipedia, which are naturally coherent, well-structured, and rich in context. Global disambiguation models (Guo and Barbosa, 2014; Pershina et al., 2015; Globerson et al., 2016) leverage this coherence by jointly disambiguating all the mentions in a single document. However, texts from domains such as webpage fragments, social media, or search queries are often short, noisy, and less coherent; these domains lack the contextual information that global methods need to pay off, and present a more challenging setting in general.…”