In this paper we describe a schema and models developed for representing corpora of computer-mediated communication (CMC corpora) using the framework provided by the Text Encoding Initiative (TEI). The schema is the result of activities and discussions within an international community of researchers who have been building, annotating and processing CMC data for integration into corpus infrastructures (CLARIN, ORTOLANG), and who use these corpora for linguistic research on variation and language change in and through the impact of internet-based communication technologies and applications. Discourse in the scope of CMC corpora is characterised as dialogic, sequentially organised interchange between humans, conducted using communication technologies such as chats, messengers and online forums; social media platforms and applications such as Twitter, Facebook, Instagram or WhatsApp; the communication functions of collaborative platforms and projects (e.g. Wikipedia or learning environments); or 3D environments (e.g. Second Life, gaming environments).

Discourse found in CMC exhibits features that cannot be adequately handled by schemas and tools developed for the representation, annotation and processing of discourse that conforms to the written standard and the structural conventions of established text types (e.g. newspaper articles, prose, scientific articles). It also differs significantly from the language and structure of spoken conversation.

CMC-core: a schema for the representation of CMC corpora in TEI. Corpus, 20 | 2020

Listing 4. Written and spoken post in WhatsApp chat interaction including an emoji, adapted to CMC-core. From the corpus MoCoDa2 (2018).

Listing 3. A blog comment, replying to a previous comment. From the Scilogs corpus, adapted to CMC-core (Grumt Suárez et al. 2016).