18th International Conference on Pattern Recognition (ICPR'06) 2006
DOI: 10.1109/icpr.2006.899

On Authorship Attribution via Markov Chains and Sequence Kernels

Cited by 11 publications (5 citation statements)
References 11 publications
“…The idea was motivated by previous works using n-gram models to discriminate text into categories according to genre [32], [33], [34], [35], authorship [32], [36], sentiment [37], [38], language [39], etc. We believe that such discriminative capability can be also exhibited by the TD and TO model components.…”
Section: B Text Classification (mentioning)
confidence: 99%
“…If we consider simplicity and language independence as primary factors, lexical features are expected to perform better than other features. Especially, the character n-gram representation has been used as one of the most effective measures of authorship attribution [13, 15]. If authors tend to use similar patterns in their writings, this would imply that syntactic and semantic features may lead to superior results.…”
Section: Related Work (mentioning)
confidence: 99%
“…Let S_t denote the event that a section s ∈ {s_1, …, s_n} belongs to the target group (= not plagiarized); likewise, let S_o denote the event that s belongs to the outlier
Character n-gram frequency/ratio*: Kjell et al (1994), Sanderson and Guenter (2006a), Juola (2006) and Koppel (2009)
Average sentence length: Holmes (1998) and Zheng et al (2006)
Average number of syllables per word*: Holmes (1998)
Word frequency: Mosteller and Wallace (1964), Holmes (1998) and Koppel (2009)
Word n-grams frequency/ratio: Sanderson and Guenter (2006a)
Number of hapax legomena: Tweedie and Baayen (1998) and Zheng et al…”
Section: Outlier Identification (mentioning)
confidence: 99%