Abstract: Internet censorship artificially changes the dynamics of resource production and consumption, affecting a range of stakeholders that include end users, service providers, and content providers. We analyze two large-scale censorship events in Pakistan: blocking of pornographic content in 2011 and of YouTube in 2012. Using traffic datasets collected at home and SOHO networks before and after the censorship events, we: a) quantify the demand for blocked content, b) illuminate challenges encountered by service pro…
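Although the abstract is truncated here, the before-and-after methodology it describes can be made concrete with a minimal sketch. The Python below splits hypothetical request logs around a block date and compares the share of requests for blocked domains in each period; the file name, column layout, date, and domain list are illustrative assumptions, not the paper's actual datasets.

```python
# A sketch (not the paper's actual pipeline) of quantifying demand for
# blocked content: split request logs around the block date and compare
# the share of requests for blocked domains in each period. File name,
# columns, date, and domain list are all illustrative assumptions.
import csv
from datetime import datetime

BLOCK_DATE = datetime(2012, 9, 17)            # YouTube block in Pakistan
BLOCKED = {"youtube.com", "www.youtube.com"}  # assumed blocked-domain list

counts = {"before": [0, 0], "after": [0, 0]}  # [blocked, total] per period
with open("requests.csv") as f:
    for row in csv.DictReader(f):             # assumed columns: time,domain
        when = datetime.fromisoformat(row["time"])
        period = "before" if when < BLOCK_DATE else "after"
        counts[period][1] += 1
        if row["domain"] in BLOCKED:
            counts[period][0] += 1

for period, (hits, total) in counts.items():
    share = hits / max(total, 1)
    print(f"{period}: {share:.1%} of requests targeted blocked domains")
```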
“…Social media in particular stands out, consistent with external evidence that this is increasingly seen as a threat by censors [23]. Blocking access to video-sharing and other entertainment sites may also be meant to suppress copyright infringement and/or support local businesses over global incumbents [37].…”
Section: Topic Correlations (citation type: mentioning, confidence: 59%)
“…In some cases, it has been possible to identify the specific "filter" in use [20,34]. Another line aims to understand what is censored and why [1], how that changes over time [3,28], how the degree of censorship might vary within a country [68], and how people react to censorship [37,38].…”
Section: Previous Work (citation type: mentioning, confidence: 99%)
“…A few studies have dug into "leaks" of inside information, which allow researchers to see what actually is censored and perhaps why. Recently this has occurred for backbone filters in Syria [15] and Pakistan [37], and the TOM-Skype chat client's internal keyword filter [38]. However, all of these studies still took the list as a means to an end, not an object of research in itself.…”
Studies of Internet censorship rely on an experimental technique called probing. From a client within each country under investigation, the experimenter attempts to access network resources that are suspected to be censored, and records what happens. The set of resources to be probed is a crucial, but often neglected, element of the experimental design. We analyze the content and longevity of 758,191 webpages drawn from 22 different probe lists, of which 15 are alleged to be actual blacklists of censored webpages in particular countries, three were compiled using a priori criteria for selecting pages with an elevated chance of being censored, and four are controls. We find that the lists have very little overlap in terms of specific pages. Mechanically assigning a topic to each page, however, reveals common themes, and suggests that hand-curated probe lists may be neglecting certain frequently censored topics. We also find that pages on controversial topics tend to have much shorter lifetimes than pages on uncontroversial topics. Hence, probe lists need to be continuously updated to be useful. To carry out this analysis, we have developed automated infrastructure for collecting snapshots of webpages, weeding out irrelevant material (e.g., site "boilerplate" and parked domains), translating text, assigning topics, and detecting topic changes. The system scales to hundreds of thousands of pages collected.
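A minimal sketch of the probing technique described above: from a client inside the country under study, attempt each suspect URL and record the observable outcome. The list file, timeout value, and outcome categories below are illustrative assumptions, not the authors' tooling.

```python
# Probing sketch: fetch each URL from a probe list and log what happens.
# "probe_list.txt", the timeout, and the outcome labels are assumptions.
import csv, time, urllib.request, urllib.error

def probe(url, timeout=15):
    """Fetch one URL and classify the observable outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return ("ok", resp.status)
    except urllib.error.HTTPError as e:   # server answered with an error code
        return ("http_error", e.code)
    except urllib.error.URLError as e:    # DNS failure, reset, unreachable, ...
        return ("network_error", str(e.reason))
    except TimeoutError:
        return ("timeout", None)

with open("probe_list.txt") as f, open("results.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["timestamp", "url", "outcome", "detail"])
    for url in (line.strip() for line in f if line.strip()):
        outcome, detail = probe(url)
        writer.writerow([time.time(), url, outcome, detail])
```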
“…In another instance, a hacktivist group leaked 600 gigabytes of log files of Internet filtering devices used in Syria, allowing researchers to gain insights into censorship in that country [27]. Similarly, an anonymous ISP in Pakistan provided researchers access to a trove of data that enabled analysis of Pakistani censorship [28].…”
“…The results showed that in both events a significant increase in encrypted traffic occurred shortly after the imposition of the censorship. Additionally, a notable drop in the use of the local DNS resolvers arose, indicating that different anti-censorship techniques were employed [19].…”
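The quoted finding suggests a simple measurement: the share of encrypted traffic and of DNS queries sent to the local ISP resolver, split around the censorship date. The sketch below assumes a hypothetical flow-record CSV, event date, and resolver address; the port-based notion of "encrypted" is a crude stand-in for whatever classifier the study actually used.

```python
# Before/after comparison of encrypted-traffic share and local-resolver
# usage. The flow-record format (flows.csv with these columns), the event
# date, and the resolver address are all assumptions for illustration.
import csv
from datetime import datetime

EVENT = datetime(2012, 9, 17)          # assumed censorship date
LOCAL_RESOLVERS = {"10.0.0.53"}        # assumed ISP resolver address

stats = {"before": {"bytes": 0, "enc_bytes": 0, "dns": 0, "local_dns": 0},
         "after":  {"bytes": 0, "enc_bytes": 0, "dns": 0, "local_dns": 0}}

with open("flows.csv") as f:
    for row in csv.DictReader(f):      # columns: time,dst_ip,dst_port,bytes
        period = "before" if datetime.fromisoformat(row["time"]) < EVENT else "after"
        s, nbytes = stats[period], int(row["bytes"])
        s["bytes"] += nbytes
        if row["dst_port"] in ("443", "22"):   # crude proxy for "encrypted"
            s["enc_bytes"] += nbytes
        if row["dst_port"] == "53":
            s["dns"] += 1
            if row["dst_ip"] in LOCAL_RESOLVERS:
                s["local_dns"] += 1

for period, s in stats.items():
    print(period,
          f"encrypted: {s['enc_bytes'] / max(s['bytes'], 1):.1%}",
          f"local DNS: {s['local_dns'] / max(s['dns'], 1):.1%}")
```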
Anti-censorship applications are an increasingly popular means of circumventing Internet censorship, whether it is imposed by governments seeking to control the flow of information available to their citizens, by parental figures wishing to shield their children from the dangers of the Internet, or by organizations trying to restrict Internet usage within their networks. Numerous applications are readily available to help the average user bypass Internet censorship. These applications are built on a variety of technologies and techniques, and each deploys its own mechanism to circumvent censorship. Using anti-censorship applications in the work environment can have a negative impact on the network, causing severe bandwidth degradation, heavy consumption of Internet data allowances, and possibly opening the door to security breaches. Defeating anti-censorship applications at the network level has become harder because of their rapid updates and the new technologies they adopt to circumvent censorship. In this study, we provide a comprehensive overview of Internet censorship and anti-censorship applications by analyzing Ultrasurf's behavior, classifying its traffic patterns, and proposing a behavior-based solution capable of detecting and preventing Ultrasurf traffic at the network level.
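As a rough illustration of what a behavior-based detector might look like, the sketch below flags internal hosts whose port-443 traffic fans out to many distinct destinations in mostly short flows. The features and thresholds are assumptions chosen for demonstration; they are not Ultrasurf's documented signature or the paper's actual classifier.

```python
# Illustrative rule-based detection of circumvention-tool-like fan-out.
# FANOUT_THRESHOLD, SHORT_FLOW_BYTES, and the 80% short-flow ratio are
# assumed parameters, not values taken from the study.
from collections import defaultdict

FANOUT_THRESHOLD = 20      # distinct destination IPs per window (assumed)
SHORT_FLOW_BYTES = 4096    # "short" flow cutoff in bytes (assumed)

def flag_hosts(flows):
    """flows: iterable of dicts with src_ip, dst_ip, dst_port, bytes.
    Returns internal hosts whose port-443 behavior looks like
    circumvention-tool fan-out rather than ordinary browsing."""
    per_host = defaultdict(lambda: {"dsts": set(), "short": 0, "total": 0})
    for fl in flows:
        if fl["dst_port"] != 443:
            continue
        h = per_host[fl["src_ip"]]
        h["dsts"].add(fl["dst_ip"])
        h["total"] += 1
        if fl["bytes"] < SHORT_FLOW_BYTES:
            h["short"] += 1
    return {host for host, h in per_host.items()
            if len(h["dsts"]) >= FANOUT_THRESHOLD
            and h["short"] / h["total"] > 0.8}

# Example: one host opening many short connections to many distinct IPs
flows = [{"src_ip": "192.168.1.5", "dst_ip": f"203.0.113.{i}",
          "dst_port": 443, "bytes": 1200} for i in range(25)]
print(flag_hosts(flows))   # {'192.168.1.5'}
```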