2014
DOI: 10.1007/978-3-319-11382-1_13

Authorship Identification Using Dynamic Selection of Features from Probabilistic Feature Set

Cited by 8 publications (5 citation statements)
References 12 publications
“…Simple unigram (i.e., n = 1) and bigram (i.e., n = 2) features can hardly capture the relationship among nouns across the whole sentence, and the relationship between each bigram/trigram is considered independent. Second, the current n-gram approach heavily depends on the feature selection method [Zamani et al. 2014a; Savoy 2013a; Pavlyshenko 2014]. The space of the complete n-gram (n ∈ N) features is indeed sparse and can be greatly compressed for the problem of authorship analysis.…”
Section: Joint Learning Model for Topical Modality and Lexical Modality
confidence: 99%
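The sparsity the citing authors point to is easy to demonstrate: as n grows, an ever larger share of n-gram types occurs only once, so the complete n-gram space carries little reusable signal per dimension. The sketch below is a minimal illustration of this effect, not code from any of the cited papers; the helper names and the sample sentence are invented for the example.

```python
from collections import Counter

def word_ngrams(tokens, n):
    """Word-level n-grams: n=1 gives unigrams, n=2 bigrams, etc."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical toy text; any corpus shows the same trend at larger scale.
tokens = "the cat sat on the mat because the cat was tired".split()

unigrams = word_ngrams(tokens, 1)
bigrams = word_ngrams(tokens, 2)
trigrams = word_ngrams(tokens, 3)

# As n grows, more n-gram types are singletons: the feature space
# becomes sparse and, as the quote argues, compressible.
for name, grams in (("unigrams", unigrams),
                    ("bigrams", bigrams),
                    ("trigrams", trigrams)):
    singletons = sum(1 for c in grams.values() if c == 1)
    print(f"{name}: {len(grams)} types, {singletons} seen once")
```

On this toy text every trigram is already unique, while several unigrams repeat, which is the independence-and-sparsity problem the quote raises against plain n-gram features.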
“…During the feature engineering process, given the available dataset and application scenario, authorship analysts manually select a broad set of features based on hypotheses or educated guesses, and then refine the selection based on experimental feedback. As demonstrated by previous research [Savoy 2012; Zamani et al. 2014a; Savoy 2013b; Ding et al. 2015], the choice of the feature set (i.e., the feature selection method) is a crucial determinant of the prediction result, and it requires explicit knowledge of computational linguistics and tacit experience in analyzing textual data. Manual feature engineering is a time-consuming and labor-intensive task.…”
Section: Introduction
confidence: 99%
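One common manual starting point in authorship analysis is frequency-based selection: keep the k most frequent tokens (often function words) as the feature set and represent each document by its relative frequencies over them. The sketch below shows this one heuristic only; it is not the selection method of any cited paper, and the function names and toy corpus are assumptions of the example.

```python
from collections import Counter

def select_features(corpus_tokens, k):
    """Keep the k most frequent tokens across the corpus as the feature set.
    One simple heuristic; the cited work shows the choice of selection
    method strongly affects prediction quality."""
    counts = Counter()
    for doc in corpus_tokens:
        counts.update(doc)
    return [tok for tok, _ in counts.most_common(k)]

def vectorize(doc, features):
    """Represent a document as relative frequencies over the feature set."""
    counts = Counter(doc)
    total = max(len(doc), 1)
    return [counts[f] / total for f in features]

# Hypothetical two-document corpus.
corpus = [["the", "cat", "sat"], ["the", "dog", "ran", "the"]]
features = select_features(corpus, 2)
vector = vectorize(["the", "the", "cat", "dog"], features)
```

Swapping `select_features` for a different criterion (information gain, chi-square, frequency in a reference corpus) changes the downstream vectors, which is precisely why the quote calls the selection method a crucial determinant of the result.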
“…He found that the effect of this algorithm was significantly affected by the corpus size when it was used alone [21]. Zamani H. et al. proposed maximum likelihood estimates of the distributions of lexical and syntactic features as the feature set, and gave a method for computing the distance between feature sets along with a feature selection method, which enhanced the interpretability of multi-level feature sets [22].…”
Section: Literature Review
confidence: 99%
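The idea summarized in the quote can be sketched in two steps: estimate a feature distribution per author by maximum likelihood (relative frequencies), then compare authors by a distance between distributions. The code below is a minimal illustration under those assumptions; the symmetrized KL divergence used here is one plausible choice of distance, not necessarily the measure of the cited paper, and the vocabulary and samples are invented.

```python
import math
from collections import Counter

def mle_distribution(tokens, vocab):
    """Maximum likelihood estimate of a feature distribution:
    relative frequencies, floored to avoid zero probabilities."""
    counts = Counter(tokens)
    total = len(tokens)
    eps = 1e-9
    return {f: max(counts[f] / total, eps) for f in vocab}

def symmetric_kl(p, q):
    """Symmetrized KL divergence between two feature distributions --
    one possible distance between feature sets, assumed for this sketch."""
    def kl(a, b):
        return sum(a[f] * math.log(a[f] / b[f]) for f in a)
    return kl(p, q) + kl(q, p)

# Hypothetical function-word profiles of two authors.
vocab = ["the", "of", "and"]
author_a = mle_distribution("the of the and the of".split(), vocab)
author_b = mle_distribution("and and of the and and".split(), vocab)
```

A test document would then be attributed to whichever author profile lies at the smaller distance, which is what makes distribution-based feature sets comparatively interpretable.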
“…In Table 2, the performance of our model is compared to the winner and the second-ranked entry of the English literary text section of the shared task (cf. (Modaresi and Gross, 2014) and (Zamani et al., 2014)); our model outperforms the best-performing approach of the shared task, the META-CLASSIFIER (MC), by a large margin. The task baseline is the best-performing language-independent approach of the PAN-2013 shared task.…”
[Table 2: Performance of our model compared to other participants on the "PANLiterary" dataset]
Section: PAN Author Verification
confidence: 99%