Ksenia Lagutina scite author profile

The article is devoted to the analysis of the rhythm of texts of different genres: fiction novels, advertisements, scientific articles, reviews, tweets, and political articles. The authors identified lexico-grammatical figures in the texts (anaphora, epiphora, diacope, aposiopesis, etc.) that are markers of text rhythm. On their basis, statistical features were calculated that describe these rhythm features quantitatively and structurally. The resulting text model was visualized for statistical analysis using boxplots and heatmaps that showed differences in the rhythm of texts of different genres. The boxplots showed that almost all genres differ from each other in the overall density of rhythm features. The heatmaps showed different rhythm patterns across genres. Further, the rhythm features were successfully used to classify texts into six genres. The classification was carried out in two ways: a binary classification for each genre, separating that genre from the remaining genres, and a multi-class classification of the text corpus into six genres at once. Two text corpora, one in English and one in Russian, were used for the experiments. Each corpus contains 100 each of fiction novels, scientific articles, advertisements and tweets, and 50 each of reviews and political articles, i.e. a total of 500 texts. The high quality of the classification with neural networks showed that rhythm features are a good marker for most genres, especially fiction. The experiments were carried out using the ProseRhythmDetector software tool for the Russian and English languages. Text corpora contain 300 texts for each language.
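
As a rough illustration of how rhythm figures can be turned into numeric features, the sketch below counts two of the figures named in the abstract and derives a density value. It is not the ProseRhythmDetector algorithm: the function names are hypothetical, the sentence splitter is naive, and the definitions are deliberately simplified (anaphora is taken as two adjacent sentences starting with the same word, epiphora as two adjacent sentences ending with the same word).

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased word tokens with punctuation dropped.
    return re.findall(r"\w+", sentence.lower())


def count_figures(text):
    # Tokenize every sentence and skip sentences without any word tokens.
    sentences = [tokenize(s) for s in split_sentences(text)]
    sentences = [tokens for tokens in sentences if tokens]

    anaphora = 0
    epiphora = 0
    # Compare each sentence with the one that follows it.
    for prev, cur in zip(sentences, sentences[1:]):
        if prev[0] == cur[0]:
            anaphora += 1  # same first word in adjacent sentences
        if prev[-1] == cur[-1]:
            epiphora += 1  # same last word in adjacent sentences

    n = len(sentences)
    # Density: figures per sentence, a simple stand-in for a statistical feature.
    density = (anaphora + epiphora) / n if n else 0.0
    return {"sentences": n, "anaphora": anaphora,
            "epiphora": epiphora, "density": density}


sample = ("We shall fight on the beaches. We shall fight in the fields. "
          "Nobody came home. Everybody stayed home.")
features = count_figures(sample)
```

A feature dictionary of this kind, computed per text, is the sort of input a genre classifier could consume; the actual feature set and classifier used by the authors are not specified here.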

A Survey of Models for Constructing Text Features to Classify Texts in Natural Language

Lagutina

2021

Methodological Aspects of Semantic Relationship Extraction for Automatic Thesaurus Generation

Lagutina

Lagutina

Mamedov

et al. 2016

Model. anal. inf. sist.

Evaluating the Performance of a New Text Rhythm Analysis Tool

Boychuk

Lagutina

Vorontsova

et al. 2020

ESNBU

The paper evaluates the performance of the ProseRhythmDetector (PRD) Text Rhythm Analysis Tool. The research is a case study of 50 English and 50 Russian fiction texts (approximately 88,000 words each) from the 19th to the 21st century. The paper assesses the PRD tool's accuracy in detecting stylistic devices that contain repetition in their structure, such as diacope, epanalepsis, anaphora, epiphora, symploce, epizeuxis, anadiplosis, and polysyndeton. The article ends by discussing common errors, analysing disputable cases, and highlighting the use of the tool for author and idiolect identification.

Ksenia Lagutina

A Survey on Stylometric Text Features

Automatic Extraction of Rhythm Figures and Analysis of Their Dynamics in Prose of 19th-21st Centuries

Authorship Verification of Literary Texts with Rhythm Features

Thesaurus-Based Method of Increasing Text-via-Keyphrase Graph Connectivity During Keyphrase Extraction for e-Tourism Applications

Text Classification by Genre Based on Rhythm Features

A Survey of Models for Constructing Text Features to Classify Texts in Natural Language

Methodological Aspects of Semantic Relationship Extraction for Automatic Thesaurus Generation

Evaluating the Performance of a New Text Rhythm Analysis Tool
