“…Relevant works. Existing approaches to word sense disambiguation include the following: 1) use word semantic information together with the word itself to better cover similar words when the word context alone may lead to data sparsity [4,25,11,35,36,17]; 2) introduce syntactic information from the context to capture collocation correlations more efficiently [24,33,36]; 3) adopt neural network models that preserve word order instead of unordered models such as bag-of-words [34,9,28,27,10]; 4) apply various embedding methods such as word embeddings [7], word sense embeddings [3,32], and context vectors [18]; 5) utilize topic models that identify the contextual topic so that ambiguous words and inappropriate senses can be filtered out early [2,12,30]; 6) employ graph-based methods with random walks [19,30,26]; 7) use probability-weighted voting with dynamic self-adaptation [13]; 8) use gloss-augmented neural networks (GAS) [14]; 9) apply the co-attention model (CAN) and hierarchical co-attention model (HCAN) [15]; 10) utilize generative adversarial networks (WSD-GAN) to combine supervised and knowledge-based approaches [5]; and 11) use SyntagNet, a large-scale manually disambiguated lexical-semantic combination resource [16].…”
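To make the knowledge-based family concrete, the idea behind gloss-based disambiguation (used, e.g., by gloss-augmented approaches above) can be sketched as a simplified Lesk-style overlap: pick the sense whose dictionary gloss shares the most words with the ambiguous word's context. The tiny sense inventory below is hypothetical and for illustration only, not from any of the cited systems.

```python
# Toy sketch of knowledge-based WSD via gloss overlap (simplified Lesk).
# The two-sense inventory for "bank" is hypothetical illustration data.

SENSES = {
    "bank": {
        "bank.n.01": "sloping land beside a body of water such as a river",
        "bank.n.02": "a financial institution that accepts deposits and lends money",
    }
}

def disambiguate(word: str, context: str) -> str:
    """Return the sense whose gloss shares the most tokens with the context."""
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context_words & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(disambiguate("bank", "she deposited money at the bank"))  # bank.n.02
```

Real systems replace the bag-of-words overlap with embeddings or attention over the gloss, but the selection principle (score each candidate sense against the context) is the same.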