Factory machinery is prone to failure or breakdown, resulting in significant expenses for companies. Hence, there is a rising interest in machine monitoring using different sensors, including microphones. In the scientific community, the emergence of public datasets has led to advancements in acoustic detection and classification of scenes and events, but there are no public datasets that focus on the sound of industrial machines under normal and anomalous operating conditions in real factory environments. In this paper, we present a new dataset of industrial machine sounds that we call a sound dataset for malfunctioning industrial machine investigation and inspection (MIMII dataset). Normal sounds were recorded for different types of industrial machines (i.e., valves, pumps, fans, and slide rails), and to resemble a real-life scenario, various anomalous sounds were recorded (e.g., contamination, leakage, rotating unbalance, and rail damage). The purpose of releasing the MIMII dataset is to assist the machine-learning and signal-processing community with their development of automated facility maintenance.
A dereverberation technique has been developed that optimally combines multichannel inverse filtering (MIF), beamforming (BF), and non-linear reverberation suppression (NRS). It is robust against acoustic transfer function (ATF) fluctuations and creates less distortion than the NRS alone. The three components are optimally combined from a probabilistic perspective using a unified likelihood function incorporating two probabilistic models. A multichannel probabilistic source model based on a recently proposed local Gaussian model (LGM) provides robustness against ATF fluctuations of the early reflection. A probabilistic reverberant transfer function model (PRTFM) provides robustness against ATF fluctuations of the late reverberation. The MIF and multichannel under-determined source separation (MUSS) are optimized in an iterative manner. The MIF is designed to reduce the time-invariant part of the late reverberation by using optimal time-weighting with reference to the PRTFM and the LGM. The MUSS separates the dereverberated speech signal and the residual reverberation after the MIF, which can be interpreted as an optimized combination of the BF and the NRS. The parameters of the PRTFM and the LGM are optimized based on the MUSS output. Experimental results show that the proposed method is robust against the ATF fluctuations under both single and multiple source conditions.
As the labor force shrinks, demand has grown for labor-saving automatic anomalous sound detection technology for maintaining industrial equipment. Conventional approaches detect anomalies based on the reconstruction error of an autoencoder. However, when the target machine sound is non-stationary, the reconstruction error tends to be large regardless of whether an anomaly is present, and its variation increases because the edge frames are difficult to predict. To solve this issue, we propose an approach to anomalous sound detection in which the model takes as input multiple frames of a spectrogram with the center frame removed and outputs an interpolation of the removed frame. Because it avoids predicting the edge frames, the proposed approach makes the reconstruction error consistent with the anomaly. Experimental results showed that the proposed approach achieved a 27% improvement in the standard AUC score, especially for non-stationary machinery sounds.
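The input/output arrangement described in this abstract can be illustrated with a minimal sketch. It only builds the training pairs (context frames with the center frame removed, and the removed frame as the target) and computes a per-frame error; it is not the authors' implementation. The function names, the `context` size of 2 frames per side, and the use of mean squared error as the anomaly score are all assumptions made for illustration, and the predictive model itself is left out.

```python
import numpy as np


def make_interpolation_pairs(spec, context=2):
    """Build (input, target) pairs from a spectrogram of shape (n_frames, n_bins).

    Each input is the 2*context frames surrounding a center frame, flattened;
    each target is the removed center frame. `context` is a hypothetical
    parameter, not a value taken from the abstract.
    """
    n_frames, n_bins = spec.shape
    width = 2 * context + 1
    if n_frames < width:
        raise ValueError("spectrogram is shorter than one context window")
    # Sliding windows over time: shape (n_windows, width, n_bins)
    windows = np.stack([spec[i:i + width] for i in range(n_frames - width + 1)])
    # The center frame of each window is what the model should interpolate
    targets = windows[:, context, :]
    # Remove the center frame and flatten the remaining context frames
    inputs = np.delete(windows, context, axis=1).reshape(len(windows), -1)
    return inputs, targets


def anomaly_scores(predicted, targets):
    """Per-window mean squared interpolation error (assumed scoring rule)."""
    return np.mean((predicted - targets) ** 2, axis=1)


if __name__ == "__main__":
    # Toy spectrogram: 7 frames, 3 frequency bins
    spec = np.arange(21, dtype=float).reshape(7, 3)
    x, y = make_interpolation_pairs(spec, context=2)
    print(x.shape, y.shape)
```

In a full system, a model trained on normal sounds would map each row of `x` to an estimate of the matching row of `y`, and a large score from `anomaly_scores` would flag a window as potentially anomalous.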