This paper presents the first attempt to use word embeddings to predict the compositionality of multiword expressions. We consider both single- and multi-prototype word embeddings. Experimental results show that, in combination with a back-off method based on string similarity, word embeddings outperform a method using count-based distributional similarity. Our best results are competitive with, or superior to, state-of-the-art methods over three standard compositionality datasets, which include two types of multiword expressions and two languages.
We predict the compositionality of multiword expressions using distributional similarity between each component word and the overall expression, based on translations into multiple languages. We evaluate the method over English noun compounds, English verb particle constructions and German noun compounds. We show that the estimation of compositionality is improved when using translations into multiple languages, as compared to simply using distributional similarity in the source language. We further find that string similarity complements distributional similarity.
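The core idea in the abstract above, comparing each component word with the overall expression, can be sketched as follows. This is a minimal illustration only: the function names, the use of cosine similarity, and the unweighted average are assumptions rather than the paper's reported setup, and the translation step and the string-similarity component are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compositionality(expr_vec, component_vecs):
    """Hypothetical score: mean similarity between the expression and its components.

    A higher value suggests that the expression's distribution resembles that
    of its parts, i.e. that it is more compositional.
    """
    sims = [cosine(expr_vec, c) for c in component_vecs]
    return sum(sims) / len(sims)
```

Given a distributional vector for a noun compound and one for each of its two component nouns, `compositionality(compound_vec, [noun1_vec, noun2_vec])` returns a single score that could then be compared against human compositionality judgements.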
The quality of a document is affected by various factors, including grammaticality, readability, stylistics, and expertise depth, making the task of document quality assessment a complex one. In this paper, we explore this task in the context of assessing the quality of Wikipedia articles and academic papers. Observing that the visual rendering of a document can capture implicit quality indicators that are not present in the document text (such as images, font choices, and visual layout), we propose a joint model that combines the text content with a visual rendering of the document for document quality assessment. Experimental results over two datasets reveal that textual and visual features are complementary and that combining them achieves state-of-the-art results.
In this paper, we apply various embedding methods to multiword expressions to study how well they capture the nuances of noncompositional data. Our results from a range of word-, character-, and document-level embeddings suggest that word2vec performs the best, followed by fastText and InferSent. Moreover, we find that recently proposed contextualised embedding models such as BERT and ELMo are not adept at handling noncompositionality in multiword expressions.
Leishmaniasis is a neglected parasitic protozoal disease that affects approximately 12 million people and represents a public health problem in Iran. The objectives of this study were to obtain the essential oil (EO) from Pulicaria vulgaris Gaertn. growing in Iran and to carry out in-vitro antileishmanial screening of the EO against promastigotes of Leishmania major and Leishmania infantum. The EO from the aerial parts of P. vulgaris was extracted by hydrodistillation. Serial dilutions of the EO were screened for in-vitro antileishmanial activity using 96-well microtiter plates. The P. vulgaris EO was active against the promastigote forms of L. major and L. infantum, with IC50 values of 244.70 and 233.65 µg/mL, respectively. Pulicaria vulgaris EO may serve as an alternative or complementary treatment for leishmaniasis.