We present SetExpander, a corpus-based term set expansion system built on NLP Architect by Intel AI Lab. SetExpander expands a seed set of terms into a more complete set of terms that belong to the same semantic class. It implements an iterative end-to-end workflow that lets users select a seed set of terms, expand it, view the expanded set, validate it, re-expand the validated set, and store it, thus simplifying the extraction of domain-specific fine-grained semantic classes. SetExpander has been used successfully in real-life use cases, including integration into an automated recruitment system and an issues-and-defects resolution system.
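As a rough illustration of the iterative workflow described above, the following Python sketch loops expansion and validation until no new terms survive a round; the arguments expand_terms and user_validates are hypothetical stand-ins, not the actual NLP Architect API.

def expand_iteratively(seed_terms, expand_terms, user_validates, max_rounds=3):
    # Repeatedly expand a term set and keep only user-approved candidates.
    current = set(seed_terms)
    for _ in range(max_rounds):
        candidates = expand_terms(current)        # corpus-based expansion step
        approved = {t for t in candidates if user_validates(t)}
        if approved <= current:                   # nothing new passed validation
            break
        current |= approved                       # re-expand the validated set
    return current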
In recent years, large pre-trained models have demonstrated state-of-the-art performance in many NLP tasks. However, deploying these models on resource-constrained devices is challenging due to their heavy computational and memory requirements. Moreover, the need for a considerable amount of labeled training data also hinders real-world deployment. Model distillation has shown promising results for reducing model size and computational load while improving data efficiency. In this paper we test the boundaries of BERT model distillation in terms of model compression, inference efficiency and data scarcity. We show that classification tasks that require capturing general lexical semantics can be successfully distilled into very simple and efficient models using relatively small amounts of labeled training data. We also show that distillation of large pre-trained models is more effective in real-life scenarios where only limited amounts of labeled training data are available.
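For concreteness, a common form of the distillation objective alluded to above combines a soft-target term (matching the teacher's tempered output distribution) with ordinary cross-entropy on the gold labels. The PyTorch sketch below shows this standard Hinton-style loss; the temperature T and mixing weight alpha are illustrative hyperparameters, not values from the paper.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-target term: KL divergence between tempered teacher and student distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale to keep gradient magnitudes comparable across temperatures
    # Hard-target term: standard cross-entropy on the labeled examples.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard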
While large language models à la BERT are used ubiquitously in NLP, pretraining them is considered a luxury that only a few well-funded industry labs can afford. How can one train such models with a more modest budget? We present a recipe for pretraining a masked language model in 24 hours using a single low-end deep learning server (8 low-range 12GB GPUs). We demonstrate that through a combination of software optimizations, design choices, and hyperparameter tuning, it is possible to produce models that are competitive with BERT-Base on GLUE tasks at a fraction of the original pretraining cost.
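To make the recipe's ingredients concrete, the sketch below sets up from-scratch masked language model pretraining with Hugging Face transformers, using mixed precision and gradient accumulation to reach a large effective batch on small GPUs. The hyperparameters and the tiny placeholder corpus are illustrative assumptions, not the paper's actual configuration.

from transformers import (BertConfig, BertForMaskedLM, BertTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM(BertConfig())  # randomly initialized, trained from scratch

texts = ["pretraining on a budget"] * 64  # tiny placeholder corpus for illustration
enc = tokenizer(texts, truncation=True, padding="max_length", max_length=128)
train_dataset = [{"input_ids": i, "attention_mask": m}
                 for i, m in zip(enc["input_ids"], enc["attention_mask"])]

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(
    output_dir="mlm-budget",
    max_steps=100,                     # illustrative; real pretraining runs far longer
    per_device_train_batch_size=8,
    gradient_accumulation_steps=64,    # large effective batch despite 12GB GPUs
    learning_rate=1e-3,
    warmup_ratio=0.06,
    fp16=True,                         # mixed precision, a key software optimization
)
Trainer(model=model, args=args, data_collator=collator,
        train_dataset=train_dataset).train()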
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.