Geometric deep learning (GDL) generalizes convolutional neural networks (CNNs) to non-Euclidean domains. In this work, a GDL technique, allowing the application of CNN on graphs, is examined. It defines convolutional filters with the use of the Gaussian mixture model (GMM). As those filters are defined in continuous space, they can be easily rotated without the need for some additional interpolation. This, in turn, allows constructing systems having rotation equivariance property. The characteristic of the proposed approach is illustrated with the problem of ear detection, which is of great importance in biometric systems enabling image based, discrete human identification. The analyzed graphs were constructed taking into account superpixels representing image content. This kind of representation has several advantages. On the one hand, it significantly reduces the amount of processed data, allowing building simpler and more effective models. On the other hand, it seems to be closer to the conscious process of human image understanding as it does not operate on millions of pixels. The contributions of the paper lie both in GDL application area extension (semantic segmentation of the images) and in the novel concept of trained filter transformations. We show that even significantly reduced information about image content and a relatively simple, in comparison with classic CNN, model (smaller number of parameters and significantly faster processing) allows obtaining detection results on the quality level similar to those reported in the literature on the UBEAR dataset. Moreover, we show experimentally that the proposed approach possesses in fact the rotation equivariance property allowing detecting rotated structures without the need for labor consuming training on all rotated and non-rotated images.In this work, ear images are considered [2,3]. There are several factors that make the research and application of ear recognition important and attractive: people can be identified on that basis; this biometric does not change in time; and its gathering does not create a great deal of controversy. Furthermore, technology enables acquisition of ear images from a distance, which may be of great importance for police investigations and in security systems. It has, however, at least two consequences. Firstly, before the individual can be identified [4,5], the localization of the ear must be precisely detected. Secondly, since the acquisition process is not controlled, the orientation of the head can significantly vary. As a result, the need for detection methods arises, which will be able to cope with ear transformations.Convolutional neural networks (CNNs) became a state-of-the-art solution of many image analysis problems [6,7]. Their main component is convolutional layers designed to apply the same, trainable filters (represented by a rectangular mask) locally to every part of the image. It allows extracting the spatial distribution of image characteristic features (so-called feature maps). These kinds of layers together...