Web tracking has been extensively studied over the last decade. To detect tracking, previous studies and user tools rely on filter lists. However, it has been shown that filter lists miss trackers. In this paper, we propose an alternative method to detect trackers inspired by analyzing behavior of invisible pixels. By crawling 84,658 webpages from 8,744 domains, we detect that third-party invisible pixels are widely deployed: they are present on more than 94.51% of domains and constitute 35.66% of all third-party images. We propose a fine-grained behavioral classification of tracking based on the analysis of invisible pixels. We use this classification to detect new categories of tracking and uncover new collaborations between domains on the full dataset of 4, 216, 454 third-party requests. We demonstrate that two popular methods to detect tracking, based on EasyList&EasyPrivacy and on Disconnect lists respectively miss 25.22% and 30.34% of the trackers that we detect. Moreover, we find that if we combine all three lists, 379, 245 requests originated from 8,744 domains still track users on 68.70% of websites.
The enforcement of the General Data Protection Regulation and the ePrivacy Directive relies upon auditing legal compliance of websites. Data controllers, as part of their accountability and transparency obligations, need to declare the purposes of cookies that they use in their websites. This leads to relevant questions such as: How should purposes be described according to the purpose specification principle? And how to ensure a scalable auditing, enabled by automated means, for legal compliance of cookie purposes?In this paper, we investigate the legal compliance of purposes for 20,218 third-party cookies. Surprisingly, only 12.85% of third-party cookies have a corresponding cookie policy where a cookie is even mentioned. Overall, we find out that purposes declared in cookie policies do not comply with the purpose specification principle in 95% of cases in our automatized audit. Finally, we provide recommendations on standardized specification of purposes following the recent draft recommendation of the French Data Protection Authority (CNIL) on cookies.
Stateful and stateless web tracking gathered much attention in the last decade, however they were always measured separately. To the best of our knowledge, our study is the first to detect and measure cookie respawning with browser and machine fingerprinting. We develop a detection methodology that allows us to detect cookies dependency on browser and machine features. Our results show that 1, 150 out of the top 30, 000 Alexa websites deploy this tracking mechanism. We find out that this technique can be used to track users across websites even when third-party cookies are deprecated. Together with a legal scholar, we conclude that cookie respawning with browser fingerprinting lacks legal interpretation under the GDPR and the ePrivacy directive, but its use in practice may breach them, thus subjecting it to fines up to 20 million e.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.