A health diagnosis mechanism for rolling element bearings is necessary because the most frequent faults in rotating electrical machines occur in the bearings. Recently, convolutional neural networks (CNNs) have redefined the state-of-the-art accuracy for bearing fault detection and identification, extracting location-invariant feature mappings without the need for prior expert knowledge. With convolution operations at the core of the process, CNNs exploit the local spatial coherence of the input. However, one major drawback of convolutional models is their weakness in capturing global information about the input vector and in deriving knowledge about its statistical properties. The authors propose a deep learning (DL) model that concatenates the features produced by two neural streams. Each stream consists of an attention mechanism intended to learn a different representation of the input vector, so that the final feature mapping contains both global and local spatial information. Simulation results on two well-known rolling element bearing fault detection benchmarks show the effectiveness of the method. In particular, the proposed DL model achieves 99.60% accuracy on the Case Western Reserve University bearing data set and 99.10% on the Paderborn University bearing data set. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
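The two-stream idea summarised above can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the abstract does not specify layer types or sizes, so everything here is an assumption made for illustration. That includes the function names (`local_stream`, `global_stream`, `two_stream_features`), the use of attention pooling over a convolutional feature map for the local stream, scaled dot-product self-attention over signal segments for the global stream, and the random, untrained weights. The sketch only shows how a local and a global representation of one vibration signal might be concatenated into a single feature vector.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_stream(signal, kernels):
    # Assumed local stream: 1D convolution (cross-correlation) + ReLU,
    # then attention pooling over time positions.
    # signal: (T,), kernels: (C, K) -> returns (C,)
    windows = np.lib.stride_tricks.sliding_window_view(signal, kernels.shape[1])
    fmap = np.maximum(windows @ kernels.T, 0.0)   # (T-K+1, C)
    weights = softmax(fmap.sum(axis=1))           # one weight per position
    return weights @ fmap

def global_stream(signal, seg_len, w_q, w_k, w_v):
    # Assumed global stream: scaled dot-product self-attention across
    # non-overlapping segments, so every segment can attend to all others.
    # w_q, w_k, w_v: (seg_len, D) -> returns (D,)
    n_seg = len(signal) // seg_len
    segments = signal[: n_seg * seg_len].reshape(n_seg, seg_len)
    q, k, v = segments @ w_q, segments @ w_k, segments @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=-1)  # (n_seg, n_seg)
    return (attn @ v).mean(axis=0)

def two_stream_features(signal, kernels, seg_len, w_q, w_k, w_v):
    # Concatenate the local and global representations.
    return np.concatenate([
        local_stream(signal, kernels),
        global_stream(signal, seg_len, w_q, w_k, w_v),
    ])

# Hypothetical sizes and random (untrained) weights, for shape checking only.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)                # stand-in for a vibration signal
kernels = rng.standard_normal((8, 16))            # 8 filters of width 16
w_q, w_k, w_v = (rng.standard_normal((64, 4)) for _ in range(3))
features = two_stream_features(signal, kernels, 64, w_q, w_k, w_v)
```

In a trained model the concatenated vector would be passed to a classifier that predicts the fault class. Here `features` has 8 local and 4 global components.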