Synonym extraction is difficult both to carry out and to evaluate. Some studies have tried to exploit general dictionaries for that purpose, treating them as graphs in which words are linked through the definitions they appear in, forming a complex network of an arguably semantic nature. The advantage of a general dictionary lies in its coverage and in the availability of such resources, both in general and in specialised domains. We present here a method that exploits this graph structure to compute a distance between words. This distance is used to isolate candidate synonyms for a given word. We present an evaluation of the relevance of the candidates on a sample of the lexicon.
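As a rough illustration of the idea, the sketch below builds a small graph from a toy dictionary, linking each headword to the words in its definition, and ranks candidate synonyms by graph distance. The dictionary entries, the function names, and the use of shortest-path length as the distance are all assumptions made for this example; the abstract does not say which distance the method computes.

```python
from collections import deque

# Hypothetical toy dictionary: headword -> words appearing in its definition.
TOY_DICTIONARY = {
    "big": ["large", "size"],
    "large": ["big", "size"],
    "huge": ["very", "large"],
    "small": ["little", "size"],
    "little": ["small"],
}


def build_graph(dictionary):
    """Build an undirected graph linking headwords to their definition words."""
    graph = {}
    for headword, definition in dictionary.items():
        graph.setdefault(headword, set())
        for word in definition:
            graph.setdefault(word, set())
            graph[headword].add(word)
            graph[word].add(headword)
    return graph


def distances_from(graph, source):
    """Breadth-first search: number of edges from source to each reachable word."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist


def candidate_synonyms(graph, word, max_distance=1):
    """Return words within max_distance of word, closest first, then alphabetical."""
    dist = distances_from(graph, word)
    candidates = [(d, w) for w, d in dist.items() if 0 < d <= max_distance]
    return [w for d, w in sorted(candidates)]


graph = build_graph(TOY_DICTIONARY)
print(candidate_synonyms(graph, "big"))
```

In this toy graph the candidates for "big" include "size" as well as "large", which shows why the relevance of the candidates still has to be evaluated.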
The paper presents a computational model that aims to make the morphological structure of the lexicon emerge from the formal and semantic regularities of the words it contains. The model is purely lexeme-based. The proposed morphological structure consists of (1) binary relations that connect each headword with morphologically related words, especially the members of its morphological family and its derivational series, and (2) the analogies that hold between the words. The model has been tested on the lexicon of French using the TLFi machine-readable dictionary.
Bernard Fradin & Nabil Hathout: -ET suffixation and the question of productivity. This article first sketches an analysis of the system of -et suffixation in French. The analysis shows that -ET suffixation involves approximately fifteen patterns of lexeme formation organized around two poles (the Referent pole and the Speaker pole). Building on this analysis, the article examines the productivity of the derivational suffixes -et and -ette in the French newspaper Libération over a five-year period. Formal tools provided by H. Baayen are used to gauge the degree of productivity of both affixes. Two partial results are worth mentioning. First, -ette's degree of productivity exceeds that of -et; this result supports the idea that these suffixes must not be considered exponents of one and the same morphological process. Second, not all derivational patterns are productive to the same degree, which means that it is meaningless to speak of the global degree of productivity of -et suffixation, except as the sum of the degrees of productivity obtained for each pattern.
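One measure commonly associated with Baayen's work is the productivity index P = n1 / N, where n1 is the number of hapax legomena formed with an affix and N is the total number of tokens formed with it. The abstract does not say which of Baayen's measures were used, so the sketch below is only an illustration under that assumption; the token lists are invented and are not data from Libération.

```python
from collections import Counter


def productivity(tokens):
    """Return the hapax count divided by the token count (P = n1 / N)."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    counts = Counter(tokens)
    # A hapax legomenon is a word type that occurs exactly once in the sample.
    hapaxes = sum(1 for count in counts.values() if count == 1)
    return hapaxes / len(tokens)


# Invented token samples (hypothetical, for illustration only).
ette_tokens = ["maisonnette", "fillette", "fillette", "tablette", "camionnette"]
et_tokens = ["jardinet", "livret", "livret", "livret", "livret"]

p_ette = productivity(ette_tokens)
p_et = productivity(et_tokens)
print(p_ette, p_et)
```

Computed per derivational pattern rather than per suffix, the same function would give the pattern-level degrees of productivity that the abstract argues are the meaningful ones.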
Selectional preference is defined as the tendency of a predicate to favour particular arguments within a certain linguistic context and, likewise, to reject others that result in conflicting or implausible meanings. The stellar success of contextual word embedding models such as BERT in NLP tasks has led many to question whether these models have learned linguistic information, but until now most research has focused on syntactic information. We investigate whether BERT contains information on the selectional preferences of words by examining the probability it assigns to the dependent word given the presence of a head word in a sentence. We use pairs of head and dependent words in five different syntactic relations from the SP-10K corpus of selectional preference (Zhang et al., 2019b), in sentences from the ukWaC corpus, and we calculate the correlation between the plausibility score (from SP-10K) and the model probabilities. Our results show that overall there is no strong positive or negative correlation in any syntactic relation, but we do find that certain head words have a strong correlation, and that masking all words but the head word yields the most positive correlations in most scenarios, which indicates that the semantics of the predicate is indeed an integral and influential factor for the selection of the argument.
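The correlation step can be sketched roughly as follows: group head-dependent pairs by head word and correlate the plausibility scores with the model probabilities within each group. The abstract does not name the correlation coefficient, so Pearson's r is an assumption here, and every record below is invented; in a real experiment the probabilities would come from a masked language model rather than being typed in by hand.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical records: (head, dependent, plausibility score, model probability).
RECORDS = [
    ("eat", "apple", 9.0, 0.30),
    ("eat", "soup", 7.0, 0.20),
    ("eat", "rock", 1.0, 0.01),
    ("see", "movie", 8.0, 0.05),
    ("see", "idea", 3.0, 0.10),
    ("see", "sound", 2.0, 0.08),
]


def correlation_by_head(records):
    """Correlate plausibility with model probability separately per head word."""
    grouped = defaultdict(lambda: ([], []))
    for head, _dependent, plausibility, probability in records:
        grouped[head][0].append(plausibility)
        grouped[head][1].append(probability)
    return {head: pearson(xs, ys) for head, (xs, ys) in grouped.items()}


per_head = correlation_by_head(RECORDS)
for head, r in sorted(per_head.items()):
    print(head, round(r, 3))
```

With these invented numbers one head word correlates strongly and positively while the other does not, which mirrors the kind of per-head variation the abstract reports without reproducing any of its results.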