Proceedings of the 11th Forum for Information Retrieval Evaluation 2019
DOI: 10.1145/3368567.3368572

Authorship Clustering using TF-IDF weighted Word-Embeddings

Cited by 13 publications
(9 citation statements)
References 4 publications
“…They showed that the former was beneficial for multi-topic texts but it was also more computationally demanding without achieving substantially better performance [20]. Agarwal et al utilized word embedding with tf-idf weights and employed hierarchical clustering algorithms to perform authorship clustering [1]. Kocher and Savoy adopted a simple set of features of the most frequent terms (words and punctuation) to represent the authorship and writing styles [14].…”
Section: Related Work (mentioning)
confidence: 99%
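
The approach summarized in this statement (word embeddings weighted by tf-idf, followed by hierarchical clustering) can be illustrated with a minimal sketch. This is not the implementation of Agarwal et al.: the toy two-dimensional embeddings, the smoothed idf formula, the cosine distance, and the average-linkage merge rule are all assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed idf (an assumption; the paper may use another variant).
        weights.append({t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return weights

def doc_vector(weights, embeddings, dim):
    """TF-IDF weighted average of the word vectors of one document."""
    vec, total = [0.0] * dim, 0.0
    for term, w in weights.items():
        if term in embeddings:  # skip out-of-vocabulary terms
            for i, x in enumerate(embeddings[term]):
                vec[i] += w * x
            total += w
    return [v / total for v in vec] if total else vec

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 if na == 0 or nb == 0 else 1.0 - dot / (na * nb)

def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of doc indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(cosine_distance(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average pairwise distance.
        a, b = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical toy embeddings and documents, for illustration only.
embeddings = {
    "whilst": [1.0, 0.0], "hence": [0.9, 0.1],
    "gonna": [0.0, 1.0], "kinda": [0.1, 0.9],
    "the": [0.5, 0.5],
}
docs = [
    ["whilst", "the", "hence"],
    ["hence", "the", "whilst", "whilst"],
    ["gonna", "the", "kinda"],
    ["kinda", "gonna", "the", "gonna"],
]

weights = tfidf_weights(docs)
vectors = [doc_vector(w, embeddings, dim=2) for w in weights]
clusters = agglomerative(vectors, n_clusters=2)
print(clusters)
```

In practice the number of authors is unknown, so a real system would need a stopping criterion (for example a distance threshold) in place of the fixed `n_clusters` used here.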
“…Many authorial clustering approaches invest on advanced machine learning methods, like recurrent neural networks [4], word embedding [1] and sophisticated document representations [4,10,20] with thousands of dimensions. High-dimensional feature spaces, however, tend to get sparser as the texts get shorter and suffer from consequences like the curse of dimensionality.…”
Section: Introduction (mentioning)
confidence: 99%
“…TF-IDF algorithm is widely used in the following applications (Agarwal et al., 2019; Chang et al., 2020; Dong et al., 2019; Feng et al., 2019; Forman, 2008; Gebre et al., 2013; Huang et al., 2011; Kumar and Subba, 2020; Matsuo and Ishizuka, 2004; Saihanqiqige, 2020; Trstenjak et al., 2014; Yahav et al., 2018; Yunchun, 2019; Yun-Tao et al., 2005; Park et al., 2020): text categorization; keywords extraction; and new word recognition.…”
Section: Related Work (mentioning)
confidence: 99%
“…Authorship verification takes as input a set of authors and a set of documents and assigns each document to an author, while authorial clustering assumes that information on authors of documents is unavailable or unreliable. Authorial clustering seeks to partition the set of documents into clusters such that each cluster corresponds to one author [25].…”
Section: Introduction (mentioning)
confidence: 99%
“…Many authorial clustering approaches invest on advanced machine learning methods, like recurrent neural networks [3], word embeddings [1] and sophisticated document representations [3,10,23] in a space with thousands of dimensions. High-dimensional feature spaces, however, tend to get sparser as the texts get shorter and suffer from consequences like the curse of dimensionality.…”
Section: Introduction (mentioning)
confidence: 99%