2015
DOI: 10.1016/j.jcss.2014.12.019

Authorship verification of e-mail and tweet messages applied for continuous authentication

Abstract: Authorship verification using stylometry consists of identifying a user based on his writing style. In this paper, authorship verification is applied for continuous authentication using unstructured online text-based entry. An online document is decomposed into consecutive blocks of short texts over which (continuous) authentication decisions happen, discriminating between legitimate and impostor behaviors. We investigate blocks of texts with 140, 280 and 500 characters. The feature set includes traditional fe…
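
The block decomposition mentioned in the abstract can be illustrated with a minimal sketch. The function name `split_into_blocks` and the choice to drop a trailing partial block are assumptions made for illustration; the abstract does not say how the paper handles leftover characters.

```python
def split_into_blocks(text: str, block_size: int) -> list[str]:
    """Cut text into consecutive, non-overlapping blocks of block_size characters."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Assumption: a trailing remainder shorter than block_size is discarded.
    usable = len(text) - len(text) % block_size
    return [text[i:i + block_size] for i in range(0, usable, block_size)]


if __name__ == "__main__":
    message = "x" * 1000  # placeholder text, not real data
    for size in (140, 280, 500):  # block sizes named in the abstract
        blocks = split_into_blocks(message, size)
        print(size, len(blocks))
```

Each resulting block would then be passed to a verifier that accepts or rejects it as written by the legitimate user.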

Citations: cited by 52 publications (21 citation statements)
References: 35 publications (49 reference statements)

“…The average performance of best systems in all PAN corpora (see Table 3) indicates that the error rate of state-of-the-art methods in authorship verification is around 20%. Although this is too high in comparison to the most effective technologies used to provide forensic evidence (e.g., the error rate of DNA analysis is less than 1% [28]), it is comparable to other technologies that analyse noisy data, like latent fingerprint matching [12] or speaker identification [8]. The relatively higher AUROC scores indicate that the verification models are able to rank answers more effectively and they Fig.…”
Section: Discussion
Citation type: mentioning
confidence: 95%
“…When multiple labelled texts are available, they combine the answers to provide the final decision. Typical examples of this category are described by Seidman [55], Jankowska et al [22], Moreau et al [41], Brocardo et al [7], and Castro-Castro et al [9]. Another variation is to first concatenate all labelled texts and then split the resulting text into samples of equal size [5,17].…”
Section: Verification Models
Citation type: mentioning
confidence: 99%
“…Another distinction is that the labels of the test set are also known, because student-produced content is not anonymous. The analyses were For more accurate examinations of the approach, experimental evaluation was conducted using the 10-fold cross-validation method, and used the same train/test split ratio as many of the other authorship studies (Brocardo, Traore, & Woungang, 2015; Schmid, Iqbal, & Fung, 2015). Data were randomly split into training and testing sets where 90% of the data were allocated for training and the remaining 10% allocated for testing.…”
Section: Methods
Citation type: mentioning
confidence: 99%
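
The 10-fold cross-validation described in the statement above, with its 90%/10% train/test ratio, can be sketched with the standard library alone. The helper name `k_fold_indices` and the fixed seed are illustrative assumptions, not details taken from the citing paper.

```python
import random


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle once so that each fold is a random subset of the data.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(100, k=10):
        print(len(train), len(test))
```

With k = 10, each round holds out one tenth of the samples for testing and trains on the remaining nine tenths, so every sample is tested exactly once.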
“…In our proposed n-gram model, we measure the degree of similarity between a block $b$ of characters and the profile of a user $U$, for details see our previous work in the work of Brocardo et al. We analyze whether or not a specific n-gram is present and compute new features by defining 2 similarity metrics: $r_U(b, m)$ the real-valued similarity and $d_U(b)$ the binary similarity. For determining the n-gram, 2 modes have been considered: the unique n-grams mode denoted by $m = 0$ and the all n-grams mode denoted by $m = 1$, where $m$ is a binary variable.…”
Section: Feature Space
Citation type: mentioning
confidence: 99%
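
The statement above names two n-gram modes and two similarity metrics but does not give their formulas. The sketch below is therefore only an illustration under stated assumptions: "unique" is read as "each distinct n-gram counted once", the real-valued score is a plain overlap ratio, and the binary score is a threshold on it. None of this is the definition used by Brocardo et al., and the names `ngram_overlap` and `binary_decision` and the threshold 0.5 are hypothetical.

```python
def char_ngrams(text: str, n: int) -> list[str]:
    """Return all character n-grams of text, in order, with repeats."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_overlap(block: str, profile_text: str, n: int = 3, m: int = 1) -> float:
    """Hypothetical real-valued similarity: share of block n-grams seen in the profile.

    m = 0: each distinct n-gram of the block is counted once (assumed reading).
    m = 1: every occurrence of an n-gram in the block is counted.
    """
    profile = set(char_ngrams(profile_text, n))
    grams = char_ngrams(block, n)
    if m == 0:
        grams = sorted(set(grams))
    if not grams:
        return 0.0
    return sum(g in profile for g in grams) / len(grams)


def binary_decision(score: float, threshold: float = 0.5) -> int:
    """Hypothetical binary similarity: 1 if the score reaches the threshold."""
    return 1 if score >= threshold else 0


if __name__ == "__main__":
    score = ngram_overlap("abab", "abx", n=2, m=1)
    print(score, binary_decision(score))
```

The two modes differ only in whether repeated n-grams inside a block carry extra weight, which matters most for very short blocks.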