Scientific paper recommendation aims to help researchers find relevant articles in a large pool of unseen articles. When writing a paper, a researcher focuses on topics related to her/his scientific domain and uses a technical language. The core idea of this paper is to exploit the topics related to the researcher's scientific production (authored articles) to formally define her/his profile; in particular, we propose to employ topic modeling to formally represent the user profile, and language modeling to formally represent each unseen paper. The recommendation technique we propose relies on assessing the closeness between the language used in the researcher's papers and the language employed in the unseen papers. The proposed approach exploits a reliable knowledge source for building the user profile, and it alleviates the cold-start problem typical of collaborative filtering techniques. We also present a preliminary evaluation of our approach on DBLP.
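The profile-versus-paper comparison described above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: the profile is a plain word-frequency distribution over the authored papers (a stand-in for a real topic model), each unseen paper is a unigram language model with additive smoothing, and closeness is the negative cross-entropy of the profile under the paper model. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_profile(authored_papers):
    """Relative word frequencies over the researcher's own papers.

    Assumption: this single distribution stands in for a topic model.
    """
    counts = Counter()
    for text in authored_papers:
        counts.update(tokenize(text))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def build_language_model(text, vocabulary, alpha=1.0):
    """Unigram language model of one paper with additive smoothing."""
    counts = Counter(tokenize(text))
    denominator = sum(counts[w] for w in vocabulary) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / denominator for w in vocabulary}


def closeness(profile, language_model):
    """Negative cross-entropy of the profile under the paper model.

    Higher values mean the paper's language is closer to the profile.
    """
    return sum(p * math.log(language_model[w]) for w, p in profile.items())


def rank_papers(authored_papers, unseen_papers):
    """Return (paper_id, score) pairs sorted from closest to farthest."""
    profile = build_profile(authored_papers)
    vocabulary = set(profile)
    for text in unseen_papers.values():
        vocabulary.update(tokenize(text))
    scores = {
        paper_id: closeness(profile, build_language_model(text, vocabulary))
        for paper_id, text in unseen_papers.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because the profile is built only from the researcher's own papers, the sketch needs no ratings from other users, which is the sense in which such an approach sidesteps the cold-start problem.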
Machine learning methods such as support vector machines, hidden Markov models, and conditional random fields are the most widely used methods for implementing natural language processing systems. In this paper, we propose a machine learning approach that can be used for sequential labeling tasks such as biological event extraction. Our biological event extraction approach uses Support Vector Machines (SVM) and a composite kernel function to identify triggers and to assign the corresponding arguments. We also use a number of features based on both syntactic and contextual information, which were automatically learned from the training data.
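As an illustration of what a composite kernel can look like, the sketch below combines two base kernels as a weighted sum, one over syntactic features and one over contextual features. The abstract does not say which kernels are combined or how they are weighted, so the linear and RBF kernels, the `weight` and `gamma` parameters, and all function names are assumptions made for this example.

```python
import math


def linear_kernel(x, y):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel on two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def composite_kernel(a, b, weight=0.5):
    """Weighted sum of a linear kernel and an RBF kernel.

    Each sample is a (syntactic_features, contextual_features) pair.
    With 0 <= weight <= 1 the sum of two valid kernels is a valid kernel.
    """
    syntactic = linear_kernel(a[0], b[0])
    contextual = rbf_kernel(a[1], b[1])
    return weight * syntactic + (1.0 - weight) * contextual


def gram_matrix(samples, weight=0.5):
    """Pairwise composite kernel values for a list of samples."""
    return [[composite_kernel(a, b, weight) for b in samples] for a in samples]
```

The resulting Gram matrix could then be handed to any SVM implementation that accepts a precomputed kernel, for instance scikit-learn's `SVC(kernel="precomputed")`.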
Moving beyond the bias of a purely quantitative criterion: this is the challenge facing research evaluation today. The impact of a researcher's work is still too often measured by the number of articles that cite her/his own articles. Yet when writing, an author casts a critical eye on each article she/he cites: the author may cite an article only to place it chronologically with respect to the state of the art, expressing a neutral opinion about it; may present the limitations of the methods described in the cited article or criticize its experimental procedure; or may value the cited article's work and use it to build her/his own approach. To account for these nuances and for the context of citations (which is in itself already a form of evaluation…), the MyScienceWork data science team trained an open-source algorithm and developed three analysis models. The information extracted from publications with this tool opens new perspectives for the contextual analysis of citations by attaching opinion, type, and sentiment labels to them.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.