Laida Kushnareva scite author profile

Laida Kushnareva

5 Publications

16 Citation Statements Received

40 Citation Statements Given


Publications


Artificial Text Detection via Examining the Topology of Attention Maps

Kushnareva, Cherniavskii, Mikhailov et al. 2021


The impressive capabilities of recent generative models to create texts that are challenging to distinguish from human-written ones can be misused for generating fake news, product reviews, and even abusive content. Despite the prominent performance of existing methods for artificial text detection, they still lack interpretability and robustness towards unseen models. To this end, we propose three novel types of interpretable topological features for this task based on Topological Data Analysis (TDA), which is currently understudied in the field of NLP. We empirically show that the features derived from the BERT model outperform count- and neural-based baselines by up to 10% on three common datasets, and tend to be the most robust towards unseen GPT-style generation models as opposed to existing methods. The probing analysis of the features reveals their sensitivity to the surface and syntactic properties. The results demonstrate that TDA is a promising line of research for NLP tasks, specifically the ones that incorporate surface and structural information.
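To make the idea of topological features of an attention map more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that one attention matrix is turned into an undirected graph by keeping token pairs whose attention weight (in either direction) reaches a chosen threshold, and then counts edges, connected components (the zeroth Betti number) and independent cycles (the first Betti number). The function name `attention_graph_features`, the symmetrisation rule and the toy matrix are all illustrative assumptions.

```python
import numpy as np


def attention_graph_features(attention, threshold):
    """Count edges, connected components (b0) and independent cycles (b1)
    of the undirected graph obtained by thresholding an attention matrix.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    attention = np.asarray(attention, dtype=float)
    n = attention.shape[0]
    # Assumption: an undirected edge exists if either direction passes the threshold.
    weights = np.maximum(attention, attention.T)

    parent = list(range(n))  # union-find forest over the tokens

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = 0
    components = n
    for i in range(n):
        for j in range(i + 1, n):  # self-loops on the diagonal are ignored
            if weights[i, j] >= threshold:
                edges += 1
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j
                    components -= 1

    # For a graph, b1 = |E| - |V| + b0.
    cycles = edges - n + components
    return {"edges": edges, "b0": components, "b1": cycles}


# Toy 4-token attention matrix (rows sum to 1); values are made up.
toy_attention = [
    [0.7, 0.2, 0.1, 0.0],
    [0.6, 0.3, 0.1, 0.0],
    [0.1, 0.5, 0.3, 0.1],
    [0.5, 0.0, 0.4, 0.1],
]

for t in (0.4, 0.55):
    print(t, attention_graph_features(toy_attention, t))
```

Sweeping the threshold and concatenating the resulting counts over heads and layers is one plausible way to obtain a fixed-length feature vector for a downstream classifier.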


Topological Data Analysis for Speech Processing

Tulchinskii, Kuznetsov, Kushnareva et al. 2023


Acceptability Judgements via Examining the Topology of Attention Maps

Cherniavskii, Tulchinskii, Mikhailov et al. 2022


The role of the attention mechanism in encoding linguistic knowledge has received special interest in NLP. However, the attention heads' ability to judge the grammatical acceptability of a sentence has been underexplored. This paper approaches the paradigm of acceptability judgments with topological data analysis (TDA), showing that the topological properties of the attention graph can be efficiently exploited for two standard practices in linguistics: binary judgments and linguistic minimal pairs. Topological features enhance the BERT-based acceptability classifier scores by up to 0.24 Matthews correlation coefficient score on CoLA in three languages (English, Italian, and Swedish). By revealing the topological discrepancy between attention graphs of minimal pairs, we achieve human-level performance on the BLiMP benchmark, outperforming nine statistical and Transformer LM baselines. At the same time, TDA provides the foundation for analyzing the linguistic functions of attention heads and interpreting the correspondence between the graph features and grammatical phenomena. We publicly release the code and other materials used in the experiments at github.com/danchern97/tda4la.
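For readers unfamiliar with the metric quoted above, here is a minimal sketch of the Matthews correlation coefficient for binary acceptability labels, computed from the confusion-matrix counts as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function name `mcc` and the example labels are illustrative assumptions and are not taken from the paper; returning 0.0 when the denominator is zero is a common convention rather than part of the definition.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = acceptable, 0 = not)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: a degenerate confusion matrix scores 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Made-up gold labels and predictions for six sentences.
gold = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 1]
print(mcc(gold, predicted))
```

Unlike plain accuracy, this score stays near zero for a classifier that always predicts the majority class, which is why it is commonly reported on imbalanced acceptability data.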


Acceptability Judgements via Examining the Topology of Attention Maps

Cherniavskii, Tulchinskii, Mikhailov et al. 2022

Preprint


Category-Learning with Context-Augmented Autoencoder

Kuzminykh, Kushnareva, Grigoryev et al. 2020

Preprint


Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning. Biological neural networks are known to solve this problem quite well in an unsupervised manner, yet unsupervised artificial neural networks either struggle to do it or require fine-tuning for each task individually. We associate this with the fact that a biological brain learns in the context of the relationships between observations, while an artificial network does not. We also notice that, though a naive data augmentation technique can be very useful for supervised learning problems, autoencoders typically fail to generalize transformations from data augmentations. Thus, we believe that providing additional knowledge about relationships between data samples will improve the model's capability of finding a useful inner data representation. More formally, we consider a dataset not as a manifold, but as a category, where the examples are objects. Two such objects are connected by a morphism if they actually represent different transformations of the same entity. Following this formalism, we propose a novel method of using data augmentations when training autoencoders. We train a Variational Autoencoder in such a way that it makes the transformation outcome predictable by an auxiliary network in terms of the hidden representation. We believe that the classification accuracy of a linear classifier on the learned representation is a good metric to measure its interpretability. In our experiments, the present approach outperforms β-VAE and is comparable with Gaussian-mixture VAE.

