This paper presents a method that assists in maintaining a rule-based named-entity recognition and classification system. The underlying idea is to use a separate system, constructed using machine learning, to monitor the performance of the rule-based system. The training data for the second system is generated by the rule-based system itself, thus avoiding the need for manual tagging. Disagreement between the two systems acts as a signal for updating the rule-based system. The generality of the approach is illustrated by applying it to large corpora in two different languages: Greek and French. The results are very encouraging, showing that this alternative use of machine learning can assist significantly in the maintenance of rule-based systems.
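The monitoring idea in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy gazetteer in `rule_based_tag`, the context features, and the majority-vote lookup that stands in for the learned system are hypothetical and are not the components used in the paper. The sketch only shows the overall loop: label text with the rules, train a second model on those labels, then flag tokens where the two disagree.

```python
from collections import Counter, defaultdict


def rule_based_tag(token):
    """Hypothetical stand-in for the rule-based system: a tiny gazetteer."""
    gazetteer = {"Athens": "LOC", "Paris": "LOC"}
    return gazetteer.get(token, "O")


def features(tokens, i):
    """Context features that deliberately ignore the token's identity,
    so the learned model can generalise beyond the gazetteer."""
    previous = tokens[i - 1].lower() if i > 0 else "<s>"
    return (previous, tokens[i][0].isupper())


def train_monitor(sentences):
    """Train on labels produced by the rules (no manual tagging).
    Assumption: a majority vote per feature tuple replaces a real learner."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, token in enumerate(tokens):
            counts[features(tokens, i)][rule_based_tag(token)] += 1
    return {f: c.most_common(1)[0][0] for f, c in counts.items()}


def disagreements(model, tokens):
    """Return (token, rule_label, learned_label) where the systems differ."""
    flagged = []
    for i, token in enumerate(tokens):
        learned = model.get(features(tokens, i))
        rule = rule_based_tag(token)
        if learned is not None and learned != rule:
            flagged.append((token, rule, learned))
    return flagged


training = [
    ["He", "lives", "in", "Athens"],
    ["She", "works", "in", "Paris"],
    ["They", "met", "in", "Athens"],
]
model = train_monitor(training)

# "Lyon" is missing from the gazetteer, but its context looks like a location.
print(disagreements(model, ["He", "lives", "in", "Lyon"]))
```

In this toy run the flagged token is one the rules miss but whose context resembles known locations, which is the kind of signal a maintainer could use to decide whether the rules or gazetteer need updating.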
This paper first shows the problems raised by proper names in natural language processing. Second, it introduces the knowledge representation structure we use, which is based on conceptual graphs. It then explains the techniques used to process known and unknown proper names. Finally, it reports the performance of the system and outlines the future work we intend to pursue.
This paper presents a project, named VIDAR-19, that automatically extracts diseases from the CORD-19 dataset, including diseases that might be considered risk factors. The project relies on the ICD-11 classification of diseases maintained by the WHO. This nomenclature is used both as a data source for the extraction mechanism and as the repository for the results. Developed for COVID-19, the project can extract diseases at risk and calculate relevant indicators. The outcome of the project is presented in a dashboard that lets the user graphically explore diseases at risk, which are placed back in the classification hierarchy. Beyond COVID-19, VIDAR has much broader applications and might be directly used for any corpus dealing with other pathologies.