Spectral analysis using wavelets has proven useful for analyzing electroencephalographic (EEG) signals and identifying biomarkers in a clinical context. Over the past decade, Riemannian geometry has crystallized as a theoretical framework providing robust methods for modeling biomedical outcomes and brain function from multi-channel EEG recordings. Combining both approaches yields applications with higher interpretability and efficiency, yet these approaches rely on handcrafted rules and sequential optimization. In contrast, the growing deep learning (DL) literature provides end-to-end trainable models for processing raw EEG. Despite state-of-the-art performance on various prediction tasks, the clinical impact of DL models remains limited, at least in part due to their lack of interpretability and interoperability with established neuroscience concepts. In this work, we introduce a new lightweight neural network for processing raw EEG data that builds on wavelet transforms and Riemannian geometry. The proposed architecture, termed GREEN (Gabor Riemann EEGNet), is benchmarked on five prediction tasks (age, sex, eyes-closed detection, dementia diagnosis, EEG pathology) across three datasets (TUAB, CAUEEG, TDBRAIN) encompassing data from more than 5000 human participants. This new architecture significantly outperformed state-of-the-art non-deep models on several of the benchmarks and performed favorably compared to large DL models on the CAUEEG benchmark while using orders of magnitude fewer parameters. Computational experiments revealed how GREEN facilitates learning sparse representations without compromising performance. We further demonstrate that the modularity of the GREEN architecture allows computing classical measures of phase synchrony, such as pairwise phase-locking values, which are found to convey information relevant for predicting dementia diagnosis.
Furthermore, the learned wavelets can readily be interpreted as bandpass filters, making this architecture explainable by design. We illustrate this with a classical example, the Berger effect, i.e., the modulation of 8-10 Hz power when closing the eyes. By integrating domain knowledge into its architectural choices, the proposed model benefits from a desirable complexity-performance trade-off and learns interpretable representations of EEG. The source code is made publicly available.
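To make the two key ingredients concrete, the following is a minimal sketch (not the GREEN implementation itself; function names and parameters are illustrative) of how a complex Gabor/Morlet wavelet acts as a bandpass filter and how a pairwise phase-locking value (PLV) can be computed from the resulting analytic signals, using only NumPy:

```python
import numpy as np

def morlet_kernel(freq, sfreq, n_cycles=7.0):
    """Complex Morlet (Gabor) wavelet: a bandpass filter centered at `freq` Hz."""
    sigma_t = n_cycles / (2.0 * np.pi * freq)          # temporal width of the Gaussian envelope
    t = np.arange(-5 * sigma_t, 5 * sigma_t, 1.0 / sfreq)
    kernel = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2 * sigma_t**2))
    return kernel / np.abs(kernel).sum()               # unit-gain normalization

def pairwise_plv(x, y, freq, sfreq):
    """Phase-locking value between two signals at a given frequency (0 = no locking, 1 = perfect)."""
    k = morlet_kernel(freq, sfreq)
    ax = np.convolve(x, k, mode="same")                # complex (analytic-like) narrowband signal
    ay = np.convolve(y, k, mode="same")
    dphi = np.angle(ax) - np.angle(ay)                 # instantaneous phase difference
    return np.abs(np.exp(1j * dphi).mean())            # consistency of the phase difference

# Toy usage: two 10 Hz sinusoids with a fixed phase lag and mild noise
sfreq = 250.0
t = np.arange(0, 4, 1 / sfreq)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 10 * t + 0.5) + 0.1 * rng.standard_normal(t.size)
print(pairwise_plv(x, y, freq=10.0, sfreq=sfreq))      # close to 1 for phase-locked signals
```

In GREEN the wavelet parameters are learned end-to-end rather than fixed as here; the sketch only illustrates why the learned kernels admit a bandpass-filter reading and how classical synchrony measures fall out of the same representation.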