Abstract. There has been encouraging progress on information flow control for programs in increasingly complex programming languages, tracking the propagation of information from input sources to output sinks. Yet, programs are typically deployed in an environment with rich APIs and powerful libraries, posing challenges for information flow control when the code for these APIs and libraries is either unavailable or written in a different language. This paper presents a principled approach to tracking information flow in the presence of libraries. With the goal to strike the balance between security and precision, we present a framework that explores the middle ground between the "shallow", signature-based modeling of libraries and the "deep", stateful approach, where library models need to be supplied manually. We formalize our approach for a core language, extend it with lists and higher-order functions, and establish soundness results with respect to the security condition of noninterference.
Filter lists play a large and growing role in protecting and assisting web users. The vast majority of popular filter lists are crowd-sourced, where a large number of people manually label resources related to undesirable web resources (e.g. ads, trackers, paywall libraries), so that they can be blocked by browsers and extensions.Because only a small percentage of web users participate in the generation of filter lists, a crowd-sourcing strategy works well for blocking either uncommon resources that appear on "popular" websites, or resources that appear on a large number of "unpopular" websites. A crowd-sourcing strategy will perform poorly for parts of the web with small "crowds", such as regions of the web serving languages with (relatively) few speakers.This work addresses this problem through the combination of two novel techniques: (i) deep browser instrumentation that allows for the accurate generation of request chains, in a way that is robust in situations that confuse existing measurement techniques, and (ii) an ad classifier that uniquely combines perceptual and page-context features to remain accurate across multiple languages.We apply our unique two-step filter list generation pipeline to three regions of the web that currently have poorly maintained filter lists: Sri Lanka, Hungary, and Albania. We generate new filter lists that complement existing filter lists. Our complementary lists block an additional 3,349 of ad and ad-related resources (1,771 unique) when applied to 6,475 pages targeting these three regions.We hope that this work can be part of an increased effort at ensuring that the security, privacy, and performance benefits of web resource blocking can be shared with all users, and not only those in dominant linguistic or economic regions. CCS CONCEPTS• Security and privacy → Human and societal aspects of security and privacy; • Information systems → Web applications.
Browser extensions enable rich experience for the users of today's web. Being deployed with elevated privileges, extensions are given the power to overrule web pages. As a result, web pages often seek to detect the installed extensions, sometimes for benign adoption of their behavior but sometimes as part of privacy-violating user fingerprinting. Researchers have studied a class of attacks that allow detecting extensions by probing for Web Accessible Resources (WARs) via URLs that include public extension IDs. Realizing privacy risks associated with WARs, Firefox has recently moved to randomize a browser extension's ID, prompting the Chrome team to plan for following the same path. However, rather than mitigating the issue, the randomized IDs can in fact exacerbate the extension detection problem, enabling attackers to use a randomized ID as a reliable fingerprint of a user. We study a class of extension revelation attacks, where extensions reveal themselves by injecting their code on web pages. We demonstrate how a combination of revelation and probing can uniquely identify 90% out of all extensions injecting content, in spite of a randomization scheme. We perform a series of large-scale studies to estimate possible implications of both classes of attacks. As a countermeasure, we propose a browser-based mechanism that enables control over which extensions are loaded on which web pages and present a proof of concept implementation which blocks both classes of attacks. Probing attack: Previous works [55], [53] have focused on non-behavioral detection, based on a browser extension's listed WARs. The WARs are public resources which can be fetched from the context of a web page using a predefined URL, consisting of a public extension ID (or Universally Unique Identifier (UUID)) and the path to that resource. With the predefined URL to fetch a WAR from an extension, a web page can mount a probing attack, designed to detect an extension by probing for WARs, since a response with the probed WAR indicates the corresponding extension is installed. This attack can be seen in Figure 1a where 1 denotes the requests made by the attacker to probe for an installed browser extension. If the browser extension is in the browser context, the attacker will get a response consisting of the requested WAR (denoted by 2). This attack can be magnified by probing for a set of browser extensions' resources, thereby enumerating
Dynamic information flow control is a promising technique for ensuring confidentiality and integrity of applications that manipulate sensitive information. While much progress has been made on increasingly powerful programming languages ranging from low-level machine languages to high-level languages for distributed systems, surprisingly little attention has been devoted to libraries and APIs. The state of the art is largely an all-or-nothing choice: either a shallow or deep library modeling approach. Seeking to break out of this restrictive choice, we formalize a general mechanism that tracks information flow for a language that includes higher-order functions, structured data types and references.A key feature of our approach is the model heap, a part of the memory, where security information is kept to enable the interaction between the labeled program and the unlabeled library. We provide a proof-of-concept implementation and report on experiments with a file system library. The system has been proved correct using Coq.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.