Proceedings of the Sixth Workshop on Noisy User-Generated Text (W-NUT 2020), 2020
DOI: 10.18653/v1/2020.wnut-1.30

Representation learning of writing style

Abstract: In this paper, we introduce a new method of representation learning that aims to embed documents in a stylometric space. Previous studies in the field of authorship analysis focused on feature engineering techniques in order to represent document styles and to enhance model performance in specific tasks. Instead, we directly embed documents in a stylometric space by relying on a reference set of authors and the intra-author consistency property, which is one of two components in our definition of writing style.…

Cited by 13 publications (18 citation statements)
References 32 publications (30 reference statements)
“…Style embedding yields promising results. The method "deepstyle" (Hay et al, 2020) performs well across STEL components (0.66). It performs the worst on the simple/complex dimension (0.55).…”
Section: Results (mentioning)
confidence: 99%
“…Other Methods. We also experiment with the "deepstyle" model (Hay et al, 2020) by taking the cosine similarity between the style vector representations. Additionally, we consider the following sentence features: NLTK POS Tags (Bird et al, 2009) and share of cased characters (e.g., Sari et al (2018)) via the cosine similarity between the frequency vectors and 1 - the difference between the proportion of cased characters respectively.…”
Section: Style Measuring Methods (mentioning)
confidence: 99%
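
The two sentence-feature measures named in the statement above (cosine similarity between frequency vectors, and 1 minus the difference in the proportion of cased characters) can be illustrated with a minimal sketch. This is not the citing authors' code. The function names are hypothetical, the tag sequences are assumed to come from a tagger such as NLTK rather than being computed here, the difference is assumed to be an absolute difference, and "cased" is read literally as "has an upper or lower case form", which may differ from the definition used in the cited work.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def tag_frequency_similarity(tags_a, tags_b):
    """Cosine similarity between tag-count vectors of two tag sequences.

    Assumption: tags_a and tags_b are lists of POS tag strings produced elsewhere.
    """
    counts_a, counts_b = Counter(tags_a), Counter(tags_b)
    vocab = sorted(set(counts_a) | set(counts_b))
    return cosine_similarity(
        [counts_a[t] for t in vocab],
        [counts_b[t] for t in vocab],
    )


def cased_share(text):
    """Fraction of characters that are cased (assumption: upper or lower case letters)."""
    if not text:
        return 0.0
    cased = sum(1 for c in text if c.isupper() or c.islower())
    return cased / len(text)


def cased_similarity(text_a, text_b):
    """1 minus the absolute difference of the two cased-character shares (assumed form)."""
    return 1.0 - abs(cased_share(text_a) - cased_share(text_b))
```

Both helpers return values in [0, 1] for non-negative count vectors, so they can be compared on the same scale as a cosine similarity between style embeddings.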