Online misogyny is a pernicious social problem that risks making online platforms toxic and unwelcoming to women. We present a new hierarchical taxonomy for online misogyny, as well as an expert-labelled dataset to enable automatic classification of misogynistic content. The dataset consists of 6,567 labels for Reddit posts and comments. As previous research has found that untrained crowdsourced annotators struggle to identify misogyny, we hired and trained annotators and provided them with robust annotation guidelines. We report baseline classification performance on the binary classification task, achieving accuracy of 0.93 and F1 of 0.43. The codebook and datasets are made freely available for future researchers.
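The gap between the reported accuracy and F1 is typical of binary tasks where the positive class is rare. The following minimal sketch shows how the two metrics are computed from a confusion matrix. The counts are invented for illustration and are not taken from the dataset; the function name `binary_metrics` is likewise hypothetical.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Hypothetical counts: 80 positive items out of 1,000 (not the paper's data).
acc, prec, rec, f1 = binary_metrics(tp=30, fp=20, fn=50, tn=900)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
```

Because true negatives dominate the total, accuracy stays high even when most positive items are missed, whereas F1 ignores true negatives and reflects only performance on the positive class.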
Progress in genomics has enabled the emergence of a booming market for “direct-to-consumer” genetic testing. Nowadays, companies like 23andMe and AncestryDNA provide affordable health, genealogy, and ancestry reports, and have already tested tens of millions of customers. At the same time, alt- and far-right groups have also taken an interest in genetic testing, using it to attack minorities and prove their genetic “purity.” In this paper, we present a measurement study shedding light on how genetic testing is being discussed in Web communities on Reddit and 4chan. We collect 1.3M comments posted over 27 months on the two platforms, using a set of 280 keywords related to genetic testing. We then use NLP and computer vision tools to identify trends, themes, and topics of discussion. Our analysis shows that genetic testing attracts a lot of attention on Reddit and 4chan, with discussions often including highly toxic language expressed through hateful, racist, and misogynistic comments. In particular, on 4chan's politically incorrect board (/pol/), content from genetic testing conversations involves several alt-right personalities and openly antisemitic rhetoric, often conveyed through memes. Finally, we find that discussions build around user groups, from technology enthusiasts to communities promoting fringe political views.
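Keyword-based collection of the kind described above can be sketched as a case-insensitive, whole-term match over comment text. This is an illustration under stated assumptions, not the authors' pipeline: the three keywords are a hypothetical stand-in for the 280-keyword set, and the helper names `build_keyword_pattern` and `filter_comments` are invented here.

```python
import re


def build_keyword_pattern(keywords):
    """Compile one case-insensitive regex that matches any keyword as a whole term."""
    # Longer keywords first, so multi-word phrases win over their prefixes.
    escaped = sorted((re.escape(k) for k in keywords), key=len, reverse=True)
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE)


def filter_comments(comments, keywords):
    """Keep only the comments that mention at least one keyword."""
    pattern = build_keyword_pattern(keywords)
    return [c for c in comments if pattern.search(c)]


# Hypothetical keyword subset and sample comments, for illustration only.
KEYWORDS = ["23andMe", "AncestryDNA", "genetic testing"]
sample = [
    "I got my 23andme results",
    "nothing here",
    "Genetic testing is cheap now",
]
print(filter_comments(sample, KEYWORDS))
```

The word-boundary lookarounds avoid matching a keyword embedded inside a longer token, which trades some recall for fewer false positives.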
Rapid advances in human genomics are enabling researchers to gain a better understanding of the role of the genome in our health and well-being, stimulating hope for more effective and cost-efficient healthcare. However, this also prompts a number of security and privacy concerns stemming from the distinctive characteristics of genomic data. To address them, a new research community has emerged and produced a large number of publications and initiatives. In this paper, we rely on a structured methodology to contextualize and provide a critical analysis of the current knowledge on privacy-enhancing technologies used for testing, storing, and sharing genomic data, using a representative sample of the work published in the past decade. We identify and discuss limitations, technical challenges, and issues faced by the community, focusing in particular on those that are inherently tied to the nature of the problem and are harder for the community alone to address. Finally, we report on the importance and difficulty of the identified challenges based on an online survey of genome data privacy experts.
Recent progress in genomics has enabled the emergence of a flourishing market for direct-to-consumer (DTC) genetic testing. Companies like 23andMe and AncestryDNA provide affordable health, genealogy, and ancestry reports, and have already tested tens of millions of customers. Consequently, news, experiences, and views on genetic testing are increasingly shared and discussed on social media. At the same time, far-right groups have also taken an interest in genetic testing, using it to attack minorities and prove their genetic "purity." In this paper, we set out to study the genetic testing discourse on a number of mainstream and fringe Web communities. We do so in two steps. First, we conduct an exploratory, large-scale analysis of the genetic testing discourse on a mainstream social network, Twitter. We find that the genetic testing discourse is fueled by accounts that appear to be interested in digital health and technology. However, we also identify tweets with highly racist connotations. This motivates us to explore the connection between genetic testing and racism on platforms with a reputation for toxicity, namely, Reddit and 4chan, where we find that discussions around genetic testing often include highly toxic language expressed through hateful and racist comments. In particular, on 4chan's politically incorrect board (/pol/), content from genetic testing conversations involves several alt-right personalities and openly anti-semitic rhetoric, often conveyed through memes.
Recent progress in genomics is bringing genetic testing to the masses. Participatory public initiatives are underway to sequence the genomes of millions of volunteers, and a new market is booming with a number of companies like 23andMe and AncestryDNA offering affordable tests directly to consumers. Consequently, news, experiences, and views on genetic testing are increasingly shared and discussed online and on social networks like Twitter. In this paper, we present a large-scale analysis of Twitter discourse on genetic testing. We collect 302K tweets from 113K users, posted over 2.5 years, using thirteen search keywords related to genetic testing companies and public initiatives. We study both the tweets and the users posting them along several axes, aiming to understand who tweets about genetic testing, what they talk about, and how they use Twitter for that. Among other things, we find that tweets about genetic testing originate from accounts that overall appear to be interested in digital health and technology. Also, marketing efforts as well as announcements, such as the FDA's suspension of 23andMe's health reports, influence the type and the nature of user engagement. Finally, we report on users who share screenshots of their results, and raise a few ethical and societal questions as we find evidence of groups associating genetic testing with racist ideologies.