Predictive Maintenance (PdM) is the newest strategy for maintenance management in industrial contexts. It aims to predict the occurrence of a failure to minimize unexpected downtimes and maximize the useful life of components. In data-driven approaches, PdM makes use of Machine Learning (ML) algorithms to extract relevant features from signals, identify and classify possible faults (diagnostics), and predict the components’ remaining useful life (prognostics). The major challenge lies in the high complexity of industrial plants, where both operational conditions change over time and a large number of unknown modes occur. A solution to this problem is offered by novelty detection, where a representation of the machinery normal operating state is learned and compared with online measurements to identify new operating conditions. In this paper, a systematic study of autoencoder-based methods for novelty detection is conducted. We introduce an architecture template, which includes a classification layer to detect and separate the operative conditions, and a localizer for identifying the most influencing signals. Four implementations, with different deep learning models, are described and used to evaluate the approach on data collected from a test rig. The evaluation shows the effectiveness of the architecture and that the autoencoders outperform the current baselines.
A rule-based entity matching task requires the definition of an effective set of rules, which is a time-consuming and error-prone process. The typical approach adopted for its resolution is a trial and error method, where the rules are incrementally added and modified until satisfactory results are obtained. This approach requires significant human intervention, since a typical dataset needs the definition of a large number of rules and possible interconnections that cannot be manually managed. In this paper, we propose TuneR, a software library supporting developers (i.e., coders, scientists, and domain experts) in tuning sets of matching rules. It aims to reduce human intervention by offering a tool for the optimization of rule sets based on user-defined criteria (such as effectiveness, interpretability, etc.). Our goal is to integrate the framework in the Magellan ecosystem, thus completing the functionalities required by the developers for performing Entity Matching tasks.
State-of-the-art Entity Matching (EM) approaches rely on transformer architectures, such as BERT , for generating highly contex-tualized embeddings of terms. The embeddings are then used to predict whether pairs of entity descriptions refer to the same real-world entity. BERT-based EM models demonstrated to be effective, but act as black-boxes for the users, who have limited insight into the motivations behind their decisions. In this paper, we perform a multi-facet analysis of the components of pre-trained and fine-tuned BERT architectures applied to an EM task. The main findings resulting from our extensive experimental evaluation are (1) the fine-tuning process applied to the EM task mainly modifies the last layers of the BERT components, but in a different way on tokens belonging to descriptions of matching / non-matching entities; (2) the special structure of the EM datasets, where records are pairs of entity descriptions is recognized by BERT; (3) the pair-wise semantic similarity of tokens is not a key knowledge exploited by BERT-based EM models.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.