Václav Cvrček scite author profile

This paper is part of a larger research effort on language variability aimed at uncovering the relations between extra- and intratextual characteristics of Czech texts by means of multi-dimensional analysis. The palpable lack of prior art on quantitative register analysis of Czech led to several distinctive methodological decisions, concerning namely corpus design, feature selection and the parameters of factor analysis, especially the number of dimensions to extract. We report on these for their potential relevance to other researchers embarking on a similar journey. In order to demonstrate the viability of the model, we also present a brief interpretation of the resulting dimensions.


A Data-Driven Analysis of Reader Viewpoints: Reconstructing the Historical Reader Using Keyword Analysis

Fidler

Cvrček

2015

jsl


This study uses corpus-linguistic methods to examine the relationship between language usage patterns and divergence in text interpretation. Our target of analysis is a set of texts (Czechoslovak presidential New Year’s addresses from 1975 to 1989), which contemporary readers consider repetitious and devoid of content. These texts were statistically contrasted with corpora from two different periods: one from the totalitarian period and the other from the contemporary (post-totalitarian) period. The comparison was based on the Difference Index, the most recent effect-size estimator, which was used to enhance the interpretation of keyword analysis outcomes. The two analyses yield significantly different results: the data from the analysis using the contemporary corpus were commensurate with contemporary readers’ impressions; those from the analysis using the totalitarian corpus fluctuated in tandem with (and sometimes in anticipation of) political and social changes during the 15-year period and suggested an interpretation of the texts by a reader more familiar with totalitarian texts.
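As a rough illustration of the effect-size comparison described in this abstract, the sketch below computes a Difference Index in the form in which it is usually stated: DIN = 100 × (rel_target − rel_ref) / (rel_target + rel_ref), where rel is a relative frequency. The function name, the per-million normalisation, and the example counts are assumptions made for illustration; they are not the authors' code or data.

```python
def difference_index(freq_target, size_target, freq_ref, size_ref):
    """Return a Difference Index (DIN) between -100 and 100.

    Assumes the commonly cited formulation based on relative frequencies;
    all parameter names here are hypothetical.
    """
    if size_target <= 0 or size_ref <= 0:
        raise ValueError("corpus sizes must be positive")

    # Normalise raw counts to instances per million tokens.
    rel_target = freq_target / size_target * 1_000_000
    rel_ref = freq_ref / size_ref * 1_000_000

    total = rel_target + rel_ref
    if total == 0:
        # The word occurs in neither corpus: treat as no difference.
        return 0.0
    return 100.0 * (rel_target - rel_ref) / total


# Hypothetical counts: a word seen 30 times in a 1,000-token target text
# and 10 times in a 1,000-token reference corpus.
print(difference_index(30, 1000, 10, 1000))
```

Under this formulation, a value of 100 means the word occurs only in the target text, -100 means it occurs only in the reference corpus, and 0 means the relative frequencies are equal.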


No keyword is an island: in search of covert associations

Cvrček

Fidler

2022

Corpora


This paper describes how corpus-assisted discourse analysis based on keyword identification and interpretation can benefit from employing Market Basket Analysis (MBA) after keyword extraction. MBA is a data mining technique used originally in marketing that can reveal consistent associations between items in a shopping cart, but also between keywords in a corpus of many texts. By identifying recurring associations between keywords, we can compensate for the lack of wider context which is a major issue impeding the interpretation of isolated keywords (especially when analysing large data). To showcase the advantages of MBA in ‘re-contextualising’ keywords within the discourse, we conducted a pilot study on the topic of migration, contrasting anti-system and centre-right Czech Internet media. The results show that MBA is useful in identifying the dominant strategy of anti-system news portals: to weave in a confounding ideological undercurrent and connect the concept of migrants to a multitude of other topics (i.e., flooding the discourse).
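To make the idea of associations between keywords more concrete, here is a minimal sketch of the standard Market Basket Analysis measures (support, confidence, lift) computed over keyword pairs, treating each text's keyword set as one "basket". The function name, the threshold, and the toy keyword sets are hypothetical; the paper's actual pipeline and settings are not given in the abstract.

```python
from collections import Counter
from itertools import combinations


def association_rules(texts, min_support=0.5):
    """Return (lhs, rhs, support, confidence, lift) for keyword pairs.

    `texts` is a list of keyword collections, one per text (assumed input).
    """
    n = len(texts)
    if n == 0:
        return []

    item_counts = Counter()
    pair_counts = Counter()
    for keywords in texts:
        items = sorted(set(keywords))  # count each keyword once per text
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of texts containing both keywords
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # vs. independence
            rules.append((lhs, rhs, support, confidence, lift))
    return rules


# Toy keyword sets, invented for illustration only.
texts = [
    {"migrant", "border", "crisis"},
    {"migrant", "crisis"},
    {"migrant", "border"},
    {"economy"},
]
for rule in association_rules(texts):
    print(rule)
```

A lift above 1 indicates that two keywords co-occur across texts more often than their individual frequencies would suggest, which is the kind of recurring association the abstract refers to.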



Václav Cvrček

Simplification in translated Czech: a new approach to type-token ratio

Comparing web-crawled and traditional corpora

From extra- to intratextual characteristics: Charting the space of variation in Czech through MDA

