Encephaloclastic porencephaly emerged as a problem at a time when the use of chest physiotherapy had decreased. The cluster of cases seen between 1992 and 1994, although associated with the number of chest physiotherapy treatments given, began to appear because of some other factor.
This paper reports on the construction of CANELC: the Cambridge and Nottingham e-language Corpus. CANELC is a one-million-word corpus of digital communication in English, taken from online discussion boards, blogs, tweets, emails and SMS messages. The paper outlines the approaches used when planning the corpus: obtaining consent, collecting the data and compiling the corpus database. This is followed by a detailed analysis of some of the patterns of language used in the corpus. The analysis includes a discussion of the key words and phrases used as well as the common themes and semantic associations connected with the data. These discussions form the basis of an investigation of how e-language operates in both similar and different ways to spoken and written records of communication (as evidenced by the BNC, the British National Corpus).
This paper takes stock of the current state of the art in multimodal corpus linguistics and proposes some projections of future developments in this field. It provides a critical overview of key multimodal corpora that have been constructed over the past decade and presents a wish-list of future technological and methodological advancements that may help to increase the availability, utility and functionality of such corpora for linguistic research.
CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes, the National Corpus of Contemporary Welsh) is the first comprehensive corpus of Welsh designed to be reflective of language use across communication types, genres, speakers, language varieties (regional and social) and contexts. This article focuses on the computational infrastructure that we have designed to support data collection for CorCenCC, and the subsequent uses of the corpus, which include lexicography, pedagogical research and corpus analysis. A grassroots approach to design has been adopted, one that has adapted and extended previous corpus-building practice and introduced new features as required for this specific context and language. The key pillars of the infrastructure include a framework that supports metadata collection, an innovative mobile application designed to collect spoken data (utilising a crowdsourcing approach), a backend database that stores curated data and a web-based interface that allows users to query the data online. A usability study was conducted to evaluate the user-facing tools and to suggest directions for future improvements. Though the infrastructure was developed for Welsh language collection, its design can be re-used to support corpus development in other minority or major language contexts, broadening the potential utility and impact of this work.
Keywords: Language resources · Natural language processing · Data modelling · Information retrieval · Web interfaces · Usability testing
This chapter provides a corpus-based analysis of formality in e-language. It examines how levels of formality differ from one 'mode' of e-language to the next, and how these collectively compare to spoken and written discourse, providing the foundations for enhancing our descriptions and understanding of e-language use. The chapter focuses on common indicators of formality in discourse with particular reference to the use of hedging. It profiles the use of specific varieties of this phenomenon, paying particular attention to how the frequency and use of hedges compare from one mode of e-language and text topic to the next, and, more generally, how they compare to one-million-word samples of data taken from the written and spoken BNC. The analyses are based on the newly constructed one-million-word CANELC corpus of digital English. CANELC stands for the Cambridge and Nottingham e-language Corpus. It contains data from online discussion boards, blogs, tweets, emails and SMS messages. The data covers a range of different discursive topics, from the more public concerns of 'news, media and current affairs', through to 'teaching, academia and education', 'hobbies and pastimes', 'music', 'celebrity news and gossip' to 'personal and daily life'.
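The frequency comparison described in this abstract can be illustrated with a minimal sketch, assuming a small hand-picked hedge list and a simple regular-expression tokenizer. Neither is taken from the chapter: `HEDGES`, `tokenize` and `hedge_rate_per_million` are hypothetical names introduced only for illustration. Normalising to a per-million-word rate is a common way to compare sub-corpora of different sizes.

```python
import re

# Hypothetical hedge inventory, for illustration only; the chapter's
# actual list of hedging expressions is not given in the abstract.
HEDGES = ("maybe", "perhaps", "possibly", "sort of", "kind of", "i think")


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def hedge_rate_per_million(text, hedges=HEDGES):
    """Return the number of hedge occurrences per million word tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    # Re-join tokens so multi-word hedges can be matched across them.
    joined = " ".join(tokens)
    hits = sum(
        len(re.findall(r"\b" + re.escape(hedge) + r"\b", joined))
        for hedge in hedges
    )
    return hits / len(tokens) * 1_000_000


# Toy texts standing in for per-mode sub-corpora (invented examples).
samples = {
    "sms": "Maybe it works. I think so, sort of.",
    "blog": "The results were published in the journal last year.",
}
rates = {mode: hedge_rate_per_million(text) for mode, text in samples.items()}
```

With real data, each value in `samples` would be the full text of one mode (or one topic), and the resulting rates could be set side by side with rates computed over the BNC samples.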