We propose a new shared task of semantic retrieval from legal texts, in which a so-called contract discovery is to be performed-where legal clauses are extracted from documents, given a few examples of similar clauses from other legal acts. The task differs substantially from conventional NLI and shared tasks on legal information extraction (e.g., one has to identify text span instead of a single document, page, or paragraph). The specification of the proposed task is followed by an evaluation of multiple solutions within the unified framework proposed for this branch of methods. It is shown that state-of-the-art pretrained encoders fail to provide satisfactory results on the task proposed. In contrast, Language Model-based solutions perform better, especially when unsupervised fine-tuning is applied. Besides the ablation studies, we addressed questions regarding detection accuracy for relevant text fragments depending on the number of examples available. In addition to the dataset and reference results, LMs specialized in the legal domain were made publicly available.
The process of ontology authoring is inseparably connected with the quality assurance phase. One can verify the maturity and correctness of a given ontology by evaluating how many competency questions give correct answers. Competency questions are defined as a set of questions expressed in natural language that the finished ontology should be able to answer to correctly. Although this method can easily indicate what is the development status of an ontology, one has to translate competency questions from natural language into an ontology query language. This task is very hard and time consuming. To overcome this problem, my PhD thesis focuses on methods for automatically checking answerability of competency questions for a given ontology and proposing SPARQL-OWL query (OWL-aware SPARQL query) for each question where it is possible to create the query. Because the task of automatic translation from competency questions to SPARQL-OWL queries is a novel one, besides a method, we have proposed a new benchmark to evaluate such translation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.