LIBKGE is an open-source PyTorch-based library for training, hyperparameter optimization, and evaluation of knowledge graph embedding models for link prediction. The key goals of LIBKGE are to enable reproducible research, to provide a framework for comprehensive experimental studies, and to facilitate analyzing the contributions of individual components of training methods, model architectures, and evaluation methods. LIBKGE is highly configurable, and every experiment can be fully reproduced from a single configuration file. Individual components are decoupled to the extent possible so that they can be mixed and matched with each other. Implementations in LIBKGE aim to be as efficient as possible without leaving the scope of Python/NumPy/PyTorch. A comprehensive logging mechanism and accompanying tooling facilitate in-depth analysis. LIBKGE provides implementations of common knowledge graph embedding models and training methods, and new ones can be easily added. A comparative study (Ruffinelli et al., 2020) showed that, with a modest amount of automatic hyperparameter tuning, LIBKGE achieves performance competitive with the state of the art for many models.
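The link-prediction evaluation that LIBKGE automates centers on ranking metrics such as mean reciprocal rank (MRR) and Hits@k. As a toy illustration of how such metrics are computed from model scores (this is a generic sketch, not LIBKGE's actual API):

```python
import numpy as np

def ranking_metrics(scores, true_idx, k=3):
    """Compute the rank of the true entity and derived ranking metrics.

    scores: 1-D array of model scores for all candidate entities.
    true_idx: index of the correct entity among the candidates.
    """
    # Rank = 1 + number of candidates scored strictly higher than the truth.
    rank = 1 + int(np.sum(scores > scores[true_idx]))
    return {"rank": rank, "rr": 1.0 / rank, "hits_at_k": float(rank <= k)}

# Toy example: 5 candidate entities, the correct one is index 2.
scores = np.array([0.1, 0.4, 0.35, 0.9, 0.05])
m = ranking_metrics(scores, true_idx=2)
# Two candidates (0.4 and 0.9) outscore the truth -> rank 3, RR = 1/3, Hits@3 = 1
```

Averaging `rr` and `hits_at_k` over all test queries yields MRR and Hits@k; in practice, evaluation also filters out other known true entities before ranking.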
Knowledge bases contribute to many web search and mining tasks, yet they are often incomplete. To add missing facts to a given knowledge base, various embedding models have been proposed in the recent literature. Perhaps surprisingly, relatively simple models with limited expressiveness often performed remarkably well under today's most commonly used evaluation protocols. In this paper, we explore whether recent models work well for knowledge base completion and argue that the current evaluation protocols are better suited to question answering than to knowledge base completion. We show that when focusing on a different prediction task for evaluating knowledge base completion, the performance of current embedding models is unsatisfactory even on datasets previously thought to be too easy. This is especially true when embedding models are compared against a simple rule-based baseline. This work indicates the need for more research into embedding models and evaluation protocols for knowledge base completion.
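The abstract does not specify the rule-based baseline; as an illustration of how simple such a reference point can be, a frequency rule that predicts the most common training object for a relation is a classic example (a hypothetical sketch, not the paper's baseline):

```python
from collections import Counter, defaultdict

def train_frequency_baseline(triples):
    """For each relation, count how often each object entity appeared."""
    counts = defaultdict(Counter)
    for s, r, o in triples:
        counts[r][o] += 1
    return counts

def predict(counts, relation, top_n=1):
    """Answer (?, relation, ?) with the most frequent training objects."""
    return [o for o, _ in counts[relation].most_common(top_n)]

# Toy training data (hand-made for illustration).
train = [("berlin", "capital_of", "germany"),
         ("paris", "capital_of", "france"),
         ("lyon", "located_in", "france"),
         ("nice", "located_in", "france")]
counts = train_frequency_baseline(train)
predict(counts, "located_in")  # -> ["france"]
```

Because many benchmark relations have skewed object distributions, such a rule can score surprisingly well under ranking protocols, which is why it is a useful sanity check for learned models.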
Financial risk, defined as the chance of deviating from return expectations, is most commonly measured with volatility. Due to its value for investment decision making, volatility prediction is probably among the most important tasks in finance and risk management. Although evidence exists that enriching purely financial models with natural language information can improve predictions of volatility, this task is still comparably underexplored. We introduce PRoFET, the first neural model for volatility prediction jointly exploiting both semantic language representations and a comprehensive set of financial features. As language data, we use transcripts from quarterly recurring events, so-called "earnings calls"; in these calls, the performance of publicly traded companies is summarized and prognosticated by their management. We show that our proposed architecture, which models verbal context with an attention mechanism, significantly outperforms the previous state-of-the-art and other strong baselines. Finally, we visualize this attention mechanism at the token level, thus aiding interpretability and providing a use case of PRoFET as a tool for investment decision support.
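PRoFET's architecture is only summarized above; the general idea of modeling verbal context with token-level attention can be sketched as attention-weighted pooling over token representations, where the weights double as the interpretability signal mentioned in the abstract (a generic toy illustration, not the paper's exact formulation):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(token_vecs, query):
    """Pool a sequence of token vectors into one context vector.

    Each token receives a weight proportional to its dot-product
    similarity with a query vector (learned in a real model); the
    weights can be visualized per token for interpretability.
    """
    weights = softmax(token_vecs @ query)   # shape (seq_len,)
    context = weights @ token_vecs          # shape (dim,)
    return context, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 4))   # 5 tokens, 4-dim embeddings
query = rng.normal(size=4)
context, weights = attention_pool(tokens, query)
# weights sum to 1 and highlight the tokens most similar to the query
```

In a full model, the pooled context vector would be concatenated with the financial features and fed to a regression head predicting volatility.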
Open Information Extraction systems extract ("subject text", "relation text", "object text") triples from raw text. Some triples are textual versions of facts, i.e., non-canonicalized mentions of entities and relations. In this paper, we investigate whether it is possible to infer new facts directly from the open knowledge graph without any canonicalization or any supervision from curated knowledge. For this purpose, we propose the open link prediction task, i.e., predicting test facts by completing ("subject text", "relation text", ?) questions. An evaluation in such a setup raises the question of whether a correct prediction is actually a new fact induced by reasoning over the open knowledge graph, or whether it can be trivially explained. For example, facts can appear in different paraphrased textual variants, which can lead to test leakage. To this end, we propose an evaluation protocol and a methodology for creating the open link prediction benchmark OLPBENCH. We performed experiments with a prototypical knowledge graph embedding model for open link prediction. While the task is very challenging, our results suggest that it is possible to predict genuinely new facts that cannot be trivially explained.
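The test-leakage problem described above can be made concrete: a test fact may repeat a training fact under a paraphrased relation text, in which case a correct prediction proves nothing. A hypothetical leakage check (the paraphrase mapping here is hand-made for the toy example; OLPBENCH's actual methodology differs):

```python
def leaks(test_triple, train_triples, paraphrases):
    """Return True if a test fact is trivially explained by training data.

    `paraphrases` maps a relation text to the set of relation texts
    treated as equivalent (a hypothetical, hand-made mapping here).
    """
    s, r, o = test_triple
    equivalent = paraphrases.get(r, {r})
    return any(ts == s and to == o and tr in equivalent
               for ts, tr, to in train_triples)

train = [("marie curie", "was awarded", "nobel prize")]
paraphrases = {"received": {"received", "was awarded"}}

# The test fact repeats a training fact under a paraphrased relation,
# so predicting it correctly would not demonstrate new inference.
leaks(("marie curie", "received", "nobel prize"), train, paraphrases)  # -> True
```

Filtering such triples out of the test set is what makes the remaining correct predictions evidence of genuinely new facts.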