Because a corpus is a representation of linguistic reality, it is important that its data be homogeneous, quantifiable and valid. This article discusses the issues involved in compiling a corpus of oral data from learners of Spanish. We focus not only on data collection, but also on the difficulties that arise regarding the experimental design, the selection of participants, the elaboration of a transcription model and the analysis of the data. The discussion is based on our own research project, for which oral samples from learners of Spanish at different proficiency levels were collected in order to be analysed cross-sectionally. Furthermore, the article focuses on the oral experiment specifically designed for this project, which is comparable to those used in previous studies on similar subjects. We also discuss the procedure used for transcribing the data and, finally, elaborate a codification system.
In this paper, some preliminary results on the use of pronouns in the oral discourse of learners of Spanish are discussed. The article mainly focuses on the use of different kinds of personal pronouns and on the pro-drop phenomenon, namely the existence of a null subject, which is typical of Spanish. The absence of an explicit subject, made possible by a rich verbal conjugation, sets Spanish apart from other languages such as French, English and Dutch, where an explicit subject pronoun is obligatory. To investigate the use of pronouns by learners of Spanish, we compiled a corpus of oral productions by second language learners of Spanish who are all native speakers of Dutch and who have also learned French and English, which means that the pro-drop phenomenon is new to them. We investigate which kinds of pronouns are used in which syntactic contexts and indicate in which contexts the use of a pronoun is not required. In addition, we observe in our learner corpus an unnecessary repetition of proper names and an overuse of personal pronouns as subjects. This can be related to the concept of "over-explicitation" or "overspecification", whereby learners of a second language tend to use more explicit forms than necessary.
The present investigation studies topic-continuity and topic-shift in a corpus of oral narratives produced by Dutch-speaking learners of Spanish as a foreign language. It investigates to what extent the learners use the null pronoun, the explicit personal pronoun and the proper name to mark both topic-continuity and topic-shift, considering the different referential options in Dutch and Spanish, which are non-pro-drop and pro-drop languages respectively. Embedded in the cognitive linguistics tradition, the study is primarily based on Accessibility Theory (Ariel 1990) and the accessibility scales created by Ariel and Figueras Solanilla (2002). The corpus data were annotated in Excel and in the UAM CorpusTool.