The amount of omics data in the public domain is increasing every year [1, 2]. Public availability of datasets is growing in all disciplines, because it is considered good scientific practice (e.g. to enable reproducibility) and/or because it is mandated by funding agencies, scientific journals, etc. Science is now a data-intensive discipline, and new and innovative approaches to data management, data sharing, and the discovery of novel datasets are therefore increasingly required [3, 4]. As data volumes grow, quantifying the impact of the data also becomes increasingly important. In this context, the FAIR (Findable, Accessible, Interoperable, Reusable) principles have been developed to promote good scientific practices for scientific data and data resources [5]. These principles put a specific emphasis on enhancing the ability of both individuals and software to discover and re-use digital objects in an automated fashion throughout their entire life cycle [5]. In fact, several resources [1, 2, 6] have recently been created to facilitate the Findability (F) and Accessibility (A) of biomedical datasets. However, while data resources typically assign equal relevance to all datasets (e.g. in the results of a query), the usage patterns of the data can vary enormously, as they do for other "research products" such as publications. How do we know which datasets are getting more attention? More generally, how can we quantify the scientific impact of datasets?

Recently, several authors [7, 8, 9] and resources [10] have pointed out the importance of evaluating the impact of each research product, including datasets.