This work addresses methods for comparing and classifying graphs, an area known as "graph matching". It surveys metrics for comparing graphs that are based on the maximum common subgraph, and it proposes a modification of the maximum-common-subgraph distance that takes the ordering of the vertices into account. The modified function is shown to satisfy all the properties of a metric (non-negativity, identity, symmetry, and the triangle inequality).
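For orientation, the classical distance based on the maximum common subgraph can be sketched in a few lines. The sketch below assumes the widely used form d(G1, G2) = 1 - |mcs(G1, G2)| / max(|G1|, |G2|), where |G| is the number of vertices and mcs is a maximum common induced subgraph. It is not the vertex-ordering modification proposed in the abstract, whose definition is not given here. The function names `mcs_size` and `mcs_distance` are hypothetical, and the brute-force search is practical only for very small graphs.

```python
from itertools import combinations, permutations


def mcs_size(n1, edges1, n2, edges2):
    """Vertex count of a maximum common induced subgraph (brute force).

    Graphs are undirected, with vertices 0..n-1 and edges given as pairs.
    This representation is an assumption made for the sketch.
    """
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    # Try the largest candidate size first and stop at the first match.
    for k in range(min(n1, n2), 0, -1):
        for subset in combinations(range(n1), k):
            for image in permutations(range(n2), k):
                mapping = dict(zip(subset, image))
                # Induced subgraph: adjacency must agree in both directions.
                if all(
                    (frozenset((u, v)) in e1)
                    == (frozenset((mapping[u], mapping[v])) in e2)
                    for u, v in combinations(subset, 2)
                ):
                    return k
    return 0


def mcs_distance(n1, edges1, n2, edges2):
    """Assumed classical form: 1 - |mcs| / max(|G1|, |G2|)."""
    if max(n1, n2) == 0:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - mcs_size(n1, edges1, n2, edges2) / max(n1, n2)


# Example: a triangle and a three-vertex path share at most a single edge.
triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
print(mcs_distance(3, triangle, 3, path))
```

Comparing a graph with itself gives a distance of zero, and swapping the arguments leaves the value unchanged, which illustrates two of the metric properties listed above.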
The article focuses on assessing the importance of folk song objects with mathematical methods. Using Zaonezhye get-together songs of the 19th and early 20th centuries as research material, the author develops a graph-theoretical model that represents the semantic structure of a text and calculates quantitative characteristics of this model. These characteristics make it possible to evaluate the importance of folk song objects at the local and textual levels. The research was conducted using the “Folklore” information system.
The paper examines a body of articles by F. M. Dostoevsky and other authors (M. M. Dostoevsky, N. N. Strakhov, A. A. Golovachev, I. N. Shill, A. Grigoriev, A. U. Poretsky, Ya. P. Polonsky) published in the journals Vremya ("Time") and Epokha ("Epoch") between 1861 and 1865. Fragments of 500, 700 and 1000 words were extracted from the texts. To enlarge the sample, the start of each subsequent fragment was offset by a fixed step of 100 words, 200 words, and so on. Based on the part-of-speech distribution of the fragments, decision trees were built whose nodes contain branching conditions based on the frequency of a particular n-gram (a sequence of n encoded parts of speech). Analysis of the strong positions of these texts (i.e., fragments located at the beginning or the end of a text) with decision trees points to possible stylistic editing by F. M. Dostoevsky of the original authors' texts. The study used the SMALT information system ("Statistical Methods for the Analysis of Literary Texts"), which implements automated markup of the works with manual checking by philologists.
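The fragment extraction and n-gram counting described above can be sketched as follows. This is a minimal illustration, assuming each text is already available as a list of part-of-speech codes. The function names `fragments` and `ngram_frequencies` and the tag codes are hypothetical and are not taken from the SMALT system. The resulting frequencies are the kind of features a decision tree could branch on.

```python
from collections import Counter


def fragments(tags, size, step):
    """Yield overlapping fragments of `size` tags, one every `step` tags."""
    for start in range(0, len(tags) - size + 1, step):
        yield tags[start:start + size]


def ngram_frequencies(fragment, n):
    """Relative frequency of each part-of-speech n-gram in one fragment."""
    grams = [tuple(fragment[i:i + n]) for i in range(len(fragment) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}


# Toy example with hypothetical tag codes (N = noun, V = verb, A = adjective).
# The abstract uses fragments of 500 to 1000 words with steps of 100 or 200;
# smaller numbers are used here so that the example stays readable.
tags = ["A", "N", "V", "N", "V", "A", "N", "V"]
for fragment in fragments(tags, size=6, step=2):
    print(fragment, ngram_frequencies(fragment, n=2))
```

Overlapping windows enlarge the sample but make neighbouring fragments strongly correlated, which is worth keeping in mind when a classifier trained on them is evaluated.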
One of the problems of automatic text processing is attribution, understood as establishing the attributes of a text (authorship, time of creation, place of recording, etc.). The article presents a generalized context-dependent graph-theoretic model designed for the analysis of folklore and literary texts. The minimal structural unit of the model (its primitive) is a word. Sets of words are combined into vertices, and the same word can belong to different vertices. Edges and graph substructures reflect the lexical, syntactic and semantic links of the text. The model is characterized by fuzziness, hierarchy and temporality. Three examples are given: a hierarchical graph-theoretic model of components (illustrated with literary works by A. S. Pushkin), a temporal graph-theoretic model of a fairy tale plot (illustrated with Russian fairy tales by A. M. Afanasyev) and a fuzzy graph-theoretic model of «strong» connections between grammatical classes (illustrated with anonymous articles from the pre-revolutionary magazines «Time» and «Epoch» and the weekly «Citizen», edited by F. M. Dostoevsky). The model is built so that it can be explored further with artificial intelligence methods (for example, decision trees or neural networks). For this purpose, a format for storing such data was implemented in the information system «Folklore», together with procedures for entering, editing and analyzing texts and their graph-theoretic models.