The most widely used technique for IP geolocation consists in building a database that maps IP blocks to geographic locations. Several databases are available and are frequently used by many services and web sites in the Internet. Contrary to widespread belief, geolocation databases are far from being as reliable as they claim. In this paper, we conduct a comparison of several current geolocation databases, both commercial and free, to gain insight into the limitations in their usability. First, the vast majority of entries in the databases refer only to a few popular countries (e.g., U.S.). This creates an imbalance in the representation of countries across the IP blocks of the databases. Second, these entries do not reflect the original allocation of IP blocks, nor BGP announcements. In addition, we quantify the accuracy of geolocation databases on a large European ISP based on ground truth information. This is the first study using a ground truth showing that the overly fine granularity of database entries makes their accuracy worse, not better. Geolocation databases can claim country-level accuracy, but certainly not city-level.
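The database technique described in this abstract can be illustrated with a minimal sketch: a table of IP blocks mapped to locations, queried by picking the most specific block that contains an address. The table contents, the `build_index` and `locate` names, and the linear scan are all assumptions made for illustration; they are not taken from any of the databases the study compares.

```python
import ipaddress

# Hypothetical toy database: (IP block, country, city). The blocks are
# documentation ranges and the locations are made up for illustration.
TOY_DB = [
    ("192.0.2.0/24", "DE", "Berlin"),
    ("192.0.2.128/25", "DE", "Munich"),
    ("198.51.100.0/24", "US", None),  # country-level entry only
]


def build_index(rows):
    """Parse the blocks and sort them so more specific prefixes come first."""
    index = [
        (ipaddress.ip_network(block), country, city)
        for block, country, city in rows
    ]
    index.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return index


def locate(index, address):
    """Return (country, city) for the most specific matching block, or None."""
    ip = ipaddress.ip_address(address)
    for network, country, city in index:
        if ip in network:
            return country, city
    return None


INDEX = build_index(TOY_DB)
```

In this sketch, `locate(INDEX, "192.0.2.200")` resolves through the finer /25 entry rather than the enclosing /24, which is the kind of fine-grained entry whose accuracy the abstract calls into question.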
For the first time since the establishment of TCP and UDP, the Internet transport layer is subject to a major change by the introduction of QUIC. Initiated by Google in 2012, QUIC provides a reliable, connection-oriented, low-latency, and fully encrypted transport. In this paper, we provide the first broad assessment of QUIC usage in the wild. First, we monitor the entire IPv4 address space since August 2016 and about 46% of the DNS namespace to detect QUIC-capable infrastructures. Our scans show that the number of QUIC-capable IPs has more than tripled since then to over 617.59K. We find around 161K domains hosted on QUIC-enabled infrastructure, but only 15K of them present valid certificates over QUIC. Second, we analyze one year of traffic traces provided by MAWI, one day of traffic from a major European tier-1 ISP, and traffic from a large IXP to understand the dominance of QUIC in the Internet traffic mix. We find QUIC to account for 2.6% to 9.1% of the current Internet traffic, depending on the vantage point. This share is dominated by Google pushing up to 42.1% of its traffic via QUIC.
Today a spectrum of solutions is available for distributing content over the Internet, ranging from commercial CDNs to ISP-operated CDNs to content-provider-operated CDNs to peer-to-peer CDNs. Some deploy servers in just a few large data centers while others deploy in thousands of locations or even on millions of desktops. Recently, major CDNs have formed strategic alliances with large ISPs to provide content delivery network solutions. Such alliances show the natural evolution of content delivery today, driven by the need to address scalability issues and to take advantage of new technology and business opportunities. In this paper we revisit the design and operating space of CDN-ISP collaboration in light of recent ISP and CDN alliances. We identify two key enablers for supporting collaboration and improving content delivery performance: informed end-user to server assignment and in-network server allocation. We report on the design and evaluation of a prototype system, NetPaaS, that materializes them. Relying on traces from the largest commercial CDN and a large tier-1 ISP, we show that NetPaaS is able to increase CDN capacity on-demand, enable coordination, reduce download time, and achieve multiple traffic engineering goals leading to a win-win situation for both ISP and CDN.
Content delivery systems constitute a major portion of today's Internet traffic. While they are a good source of revenue for Internet Service Providers (ISPs), the huge volume of content delivery traffic also poses a significant burden and traffic engineering challenge for the ISP. The burden is due to the immense volume of transfers, while the traffic engineering challenge stems from the fact that most content delivery systems themselves utilize a distributed infrastructure. They perform their own traffic flow optimization and realize this using the DNS. While content delivery systems may, to some extent, consider the user's performance within their optimization criteria, they currently have no incentive to consider any of the ISP's constraints. As a consequence, the ISP has "lost control" over a major part of its traffic. To overcome this impairment, we propose a solution where the ISP offers a Provider-aided Distance Information System (PaDIS). PaDIS uses information available only to the ISP to rank any client-host pair based on distance information, such as delay, bandwidth or number of hops. In this paper we show that the applicability of the system is significant. More than 70% of the HTTP traffic of a major European ISP can be accessed via multiple different locations. Moreover, we show that deploying PaDIS is not only beneficial to ISPs, but also to users. Experiments with different content providers show that improvements in download times of up to a factor of four are possible. Furthermore, we describe a high performance implementation of PaDIS and show how it can be deployed within an ISP.
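The ranking idea in this abstract (ordering candidate hosts for a client by ISP-side distance information such as delay, bandwidth, or hop count) can be sketched as follows. This is only an illustration under stated assumptions: the `PathInfo` record, the `PATHS` table, the `rank_hosts` function, and the tie-breaking order are hypothetical and do not describe the actual PaDIS implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathInfo:
    """Hypothetical per-pair metrics that only the ISP would know."""
    delay_ms: float
    bandwidth_mbps: float
    hops: int


# Made-up ISP-internal view of client-host pairs, for illustration only.
PATHS = {
    ("client-a", "server-1"): PathInfo(30.0, 100.0, 6),
    ("client-a", "server-2"): PathInfo(12.0, 1000.0, 3),
    ("client-a", "server-3"): PathInfo(12.0, 400.0, 4),
}


def rank_hosts(client, candidates, paths=PATHS):
    """Order candidates by lower delay, then higher bandwidth, then fewer hops.

    Hosts with no known path to the client are placed last.
    """
    def sort_key(host):
        info = paths.get((client, host))
        if info is None:
            return (1, 0.0, 0.0, 0)
        return (0, info.delay_ms, -info.bandwidth_mbps, info.hops)

    return sorted(candidates, key=sort_key)
```

A content delivery system could hand such a function the list of hosts that are able to serve a request and receive it back in the ISP's preferred order, without the ISP exposing the underlying metrics.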
Today, a large fraction of Internet traffic is originated by Content Delivery Networks (CDNs). To cope with increasing demand for content, CDNs have deployed massively distributed infrastructures. These deployments pose challenges for CDNs as they have to dynamically map end-users to appropriate servers without being fully aware of the network conditions within an Internet Service Provider (ISP) or the end-user location. On the other hand, ISPs struggle to cope with rapid traffic shifts caused by the dynamic server selection policies of the CDNs. The challenges that CDNs and ISPs face separately can be turned into an opportunity for collaboration. We argue that it is sufficient for CDNs and ISPs to coordinate only in server selection, not routing, in order to perform traffic engineering. To this end, we propose Content-aware Traffic Engineering (CaTE), which dynamically adapts server selection for content hosted by CDNs using ISP recommendations on small time scales. CaTE relies on the observation that by selecting an appropriate server among those available to deliver the content, the path of the traffic in the network can be influenced in a desired way. We present the design and implementation of a prototype to realize CaTE, and show how CDNs and ISPs can jointly take advantage of the already deployed distributed hosting infrastructures and path diversity, as well as the ISP's detailed view of the network status, without revealing sensitive operational information. By relying on tier-1 ISP traces, we show that CaTE allows CDNs to enhance the end-user experience while enabling an ISP to achieve several traffic engineering goals.
A tracking flow is a flow between an end user and a Web tracking service. We develop an extensive measurement methodology for quantifying at scale the amount of tracking flows that cross data protection borders, be it national or international, such as the EU28 border within which the General Data Protection Regulation (GDPR) applies. Our methodology uses a browser extension to fully render advertising and tracking code, various lists and heuristics to extract well-known trackers, passive DNS replication to get all the IP ranges of trackers, and state-of-the-art geolocation. We employ our methodology on a dataset from 350 real users of the browser extension over a period of more than four months, and then generalize our results by analyzing billions of web tracking flows from more than 60 million broadband and mobile users from 4 large European ISPs. We show that the majority of tracking flows cross national borders in Europe but, contrary to popular belief, are pretty well confined within the larger GDPR jurisdiction. Simple DNS redirection and PoP mirroring can increase national confinement while sealing almost all tracking flows within Europe. Last, we show that cross-border tracking is prevalent even in sensitive and hence protected data categories and groups including health, sexual orientation, minors, and others.
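The final step of such a methodology, deciding whether a geolocated tracking flow stays within a country, stays within the EU28, or leaves it, can be sketched as below. The country codes are the 28 member states at the time the EU28 term applied; the `classify_flow` and `summarize` names and the three labels are assumptions for illustration, not the authors' code.

```python
from collections import Counter

# ISO 3166-1 alpha-2 codes of the 28 EU member states (EU28).
EU28 = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR",
    "DE", "GR", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL",
    "PL", "PT", "RO", "SK", "SI", "ES", "SE", "GB",
})


def classify_flow(user_country, tracker_country):
    """Label a flow by the narrowest border that contains both endpoints."""
    if user_country == tracker_country:
        return "national"
    if user_country in EU28 and tracker_country in EU28:
        return "eu28"
    # At least one endpoint lies outside the EU28.
    return "outside-eu28"


def summarize(flows):
    """Count labels over an iterable of (user_country, tracker_country) pairs."""
    return Counter(classify_flow(user, tracker) for user, tracker in flows)
```

Both endpoints are assumed to be already geolocated to a country; in practice the tracker side is the hard part, which is why the abstract pairs passive DNS replication with geolocation.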
Recent studies show that a large fraction of Internet traffic is originated by Content Providers (CPs) such as content distribution networks and hyper-giants. To cope with the increasing demand for content, CPs deploy massively distributed server infrastructures. Thus, content is available in many network locations and can be downloaded by traversing different paths in a network. Despite the prominent server location and path diversity, the decisions on how to map users to servers by CPs and how to perform traffic engineering by ISPs are independent. This leads to a lose-lose situation as CPs are aware of neither the network bottlenecks nor the location of end-users, and the ISPs struggle to cope with rapid traffic shifts caused by the dynamic CP server selection process. In this paper we propose and evaluate Content-aware Traffic Engineering (CaTE), which dynamically adapts the traffic demand for content hosted on CPs by utilizing ISP network information and end-user location during the server selection process. This leads to a win-win situation because CPs are able to enhance their end-user to server mapping and ISPs gain the ability to partially influence the traffic demands in their networks. Indeed, our results using traces from a Tier-1 ISP show that a number of network metrics can be improved when utilizing CaTE.