“…Based on data type, we divide the inputs of these approaches into two categories: time series data and image data. Of the two, 1D time series were the most common input type, including raw or preprocessed vibration signals [68], [73], [74], [84], [90], [92], [94], [97]-[99], [102], [109], [110], [120], [122], [127], [129] and frequency spectra [63]-[66], [76], [78], [82], [86], [88], [89], [93], [112], [114], [124], [126]. Other approaches used 2D images as inputs; these images were mostly generated by signal segment stacking [70], [87], [100], [105], [115], [121] or by time-frequency representations (short-time Fourier transform [116], wavelet transform [104], [106], and S-transform [69], [117], [118]).…”
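
To make the input categories concrete, the following is a minimal sketch of how a 1D vibration signal could be turned into a frequency spectrum, a segment-stacked image, and a short-time Fourier transform image. It is illustrative only and is not taken from any of the cited works: the function names, the segment width, the frame length, the hop size, and the synthetic 50 Hz test signal are all assumptions.

```python
import numpy as np

def frequency_spectrum(signal):
    """1D input: single-sided magnitude spectrum of the signal."""
    return np.abs(np.fft.rfft(signal))

def segment_stack_image(signal, width):
    """2D input: cut the signal into consecutive segments of length
    `width` and stack them as image rows (leftover samples are dropped)."""
    signal = np.asarray(signal)
    n_rows = len(signal) // width
    return signal[: n_rows * width].reshape(n_rows, width)

def stft_image(signal, frame_len, hop):
    """2D input: magnitude of a short-time Fourier transform
    computed on Hann-windowed, overlapping frames."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s : s + frame_len] * window for s in starts])
    # Transpose so rows are frequency bins and columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Hypothetical stand-in for a vibration signal:
# a 50 Hz tone sampled at 1024 Hz for one second.
fs = 1024
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 50 * t)

spectrum = frequency_spectrum(x)       # 1D array, 513 frequency bins
stacked = segment_stack_image(x, 32)   # 32 x 32 image
tf_image = stft_image(x, 128, 64)      # 65 frequency bins x 15 time frames
```

The first function yields a 1D input of the kind fed to sequence models, while the other two yield 2D arrays that can be passed to image-based networks. Wavelet and S-transform images would follow the same pattern with a different transform in place of the windowed FFT.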