Purpose
The purpose of this paper is to classify Chinese word semantic relations, which are synonyms, antonyms, hyponyms and meronymys.
Design/methodology/approach
Basically, four simple methods are applied, ontology-based, dictionary-based, pattern-based and morpho-syntactic method. The authors make good use of search engine to build lexical and semantic resources for dictionary-based and pattern-based methods. To improve classification performance with more external resources, they also classify the given word pairs in Chinese and in English at the same time by using machine translation.
Findings
Experimental results show that the approach achieved an average F1 score of 50.87 per cent, an average accuracy of 70.36 per cent and an average recall of 40.05 per cent over all classification tasks. Synonym and antonym classification achieved high accuracy, i.e. above 90 per cent. Moreover, dictionary-based and pattern-based approaches work effectively on final data set.
Originality/value
For many natural language processing (NLP) tasks, the step of distinguishing word semantic relation can help to improve system performance, such as information extraction and knowledge graph generation. Currently, common methods for this task rely on large corpora for training or dictionaries and thesauri for inference, where limitation lies in freely data access and keeping built lexical resources up-date. This paper builds a primary system for classifying Chinese word semantic relations by seeking new ways to obtain the external resources efficiently.
This paper studies automatic keyphrase extraction on social media. Previous works have achieved promising results on it, but they neglect human reading behavior during keyphrase annotating. The human attention is a crucial element of human reading behavior. It reveals the relevance of words to the main topics of the target text. Thus, this paper aims to integrate human attention into keyphrase extraction models. First, human attention is represented by the reading duration estimated from eye-tracking corpus. Then, we merge human attention with neural network models by an attention mechanism. In addition, we also integrate human attention into unsupervised models. To the best of our knowledge, we are the first to utilize human attention on keyphrase extraction tasks. The experimental results show that our models have significant improvements on two Twitter datasets.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.