Complex conjunctions and determiners are often considered as pretokenized units in parsing. This is not always realistic, since they can be ambiguous. We propose a model for joint dependency parsing and multiword expressions identification, in which complex function words are represented as individual tokens linked with morphological dependencies. Our graphbased parser includes standard secondorder features and verbal subcategorization features derived from a syntactic lexicon.We train it on a modified version of the French Treebank enriched with morphological dependencies. It recognizes 81.79% of ADV+que conjunctions with 91.57% precision, and 82.74% of de+DET determiners with 86.70% precision.
Partition et topicalisation : il y en a « stabilisateur » de sujets et de topiques indéfinis 7Partition et topicalisation : il y en a « stabilisateur » de sujets et de topiques indéfinis 35
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.