“…There are various methods for constructing TSA datasets across a range of domains, from very specific (e.g., OMD (Shamma et al, 2009)) to general (e.g., SemEval 2013-2014 (Nakov et al, 2016)). Although the popular Stanford Twitter corpus was constructed with noisy labellings (Go et al, 2009), the more common method relies on manual annotation (usually crowd-sourced) of tweet sentiment to establish gold-standard labellings according to a pre-defined set of label categories (often POSITIVE, NEGATIVE, and NEUTRAL) (Shamma et al, 2009; Speriosu et al, 2011; Thelwall et al, 2012; Saif et al, 2013; Nakov et al, 2016; Rosenthal et al, 2017).…”