This paper shows that online political discussion networks are, on average, wider and deeper than the networks generated by other types of discussions: they engage a larger number of participants and cascade through more levels of nested comments. Using data collected from the Slashdot forum, this paper reconstructs the discussion threads as hierarchical networks and proposes a model for their comparison and classification. In addition to the substantive topic of discussion, which corresponds to the different sections of the forum (such as Developers, Games, or Politics), we classify the threads according to structural features like the maximum number of comments at any level of the network (i.e. the width) and the number of nested layers in the network (i.e. the depth). We find that political discussion networks display a tendency to cluster around the area that corresponds to wider and deeper structures, showing a significant departure from the structure exhibited by other types of discussions. We propose using this model to create a framework that allows the analysis and comparison of different internet technologies for the promotion of political deliberation.
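The width and depth measures described above can be sketched for a nested comment tree. This is an illustrative sketch, not the paper's actual data format: it assumes each comment is a dict holding a `"replies"` list.

```python
# Sketch: computing the width and depth of a discussion thread.
# Assumes each comment is a dict with a "replies" list of child
# comments (an illustrative representation, not Slashdot's format).

def thread_stats(comments):
    """Return (width, depth) for a thread given its top-level comments.

    width = maximum number of comments at any single nesting level
    depth = number of nested comment layers
    """
    width, depth = 0, 0
    level = comments
    while level:
        depth += 1
        width = max(width, len(level))
        # Descend one level: collect all replies of the current level.
        level = [reply for c in level for reply in c.get("replies", [])]
    return width, depth

# Example: 3 top-level comments, one of which has 2 replies.
thread = [
    {"replies": [{"replies": []}, {"replies": []}]},
    {"replies": []},
    {"replies": []},
]
print(thread_stats(thread))  # (3, 2)
```

With these two numbers per thread, discussions from different forum sections can be placed in a common width-depth plane and compared.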
This article describes in detail an n-gram approach to statistical machine translation. The approach consists of a log-linear combination of a translation model based on n-grams of bilingual units, referred to as tuples, along with four specific feature functions. State-of-the-art translation performance is demonstrated with Spanish-to-English and English-to-Spanish translations of the European Parliament Plenary Sessions (EPPS).
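The log-linear combination mentioned above can be sketched as a weighted sum of log-domain feature scores. The feature names and weights below are illustrative placeholders, not the paper's actual models or tuned values.

```python
import math

# Sketch of a log-linear model: score(h) = sum_i lambda_i * h_i(h),
# where each h_i is a log-domain feature score and lambda_i its weight.
# Feature names and values here are hypothetical, for illustration only.

def log_linear_score(features, weights):
    """Combine feature scores with their weights."""
    return sum(weights[name] * value for name, value in features.items())

# A translation hypothesis scored by a bilingual tuple n-gram model
# plus additional feature functions.
features = {
    "tuple_ngram": math.log(0.02),   # bilingual tuple n-gram model
    "target_lm": math.log(0.05),     # target language model
    "word_bonus": 7.0,               # length bonus (word count)
    "lexicon": math.log(0.10),       # lexical translation score
}
weights = {"tuple_ngram": 1.0, "target_lm": 0.6,
           "word_bonus": 0.1, "lexicon": 0.4}

score = log_linear_score(features, weights)
```

In decoding, the hypothesis with the highest combined score is selected; the weights are typically tuned on a development set.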
The Automated Evaluation of Scientific Writing, or AESW, is the task of identifying sentences in need of correction to ensure their appropriateness in scientific prose. The data set comes from a professional editing company, VTeX, with two aligned versions of the same text (before and after editing) and covers a variety of textual infelicities that proofreaders have edited. While previous shared tasks focused solely on grammatical errors (Dale and Kilgarriff, 2011; Dale et al., 2012; Ng et al., 2013; Ng et al., 2014), this time the edits cover other types of linguistic misfits as well, including those that could almost certainly be interpreted as style issues and similar "matters of opinion". The latter arise because of different language editing traditions, varying experience, and the absence of uniform agreement on what "good" scientific language should look like. In initiating this task, we expected the participating teams to help identify the characteristics of "good" scientific language and to help build a consensus on which language improvements are acceptable (or necessary). Six participating teams took on the challenge.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.