Analysis of Keystroke Dynamics for the Generation of Synthetic Datasets

Migdal, Denis; Rosenberger, Christophe

doi:10.1109/cw.2018.00068

Cited by 7 publications

(10 citation statements)

References 16 publications

“…This invited article supports and improves the results of the original "Analysis of Keystroke Dynamics For the Generation of Synthetic Datasets" [10].…”

Section: Introduction (supporting)

confidence: 63%

Statistical modeling of keystroke dynamics samples for the generation of synthetic datasets

Migdal

Rosenberger

2019

Future Generation Computer Systems

Self Cite

Biometrics is an emerging technology that is increasingly present in our daily lives. However, building biometric systems requires a large amount of data that may be difficult to collect. Collecting such sensitive data is also very time-consuming and constrained by legislation such as the GDPR in Europe. In the case of keystroke dynamics, most existing databases have fewer than 200 users. For these reasons, it is crucial for this biometric modality to be able to generate a significant and realistic synthetic dataset of keystroke dynamics samples. We propose in this paper an original approach for the generation of synthetic keystroke data given samples from known users as a first step towards the generation of synthetic datasets. Experimental results show the capability of the proposed statistical model to generate realistic samples from existing datasets in the literature.
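As a rough illustration of generating synthetic timings from a known user's samples, the sketch below fits a Gumbel distribution and draws new values from it. The Gumbel choice follows a citation statement on this page that reports it as the best overall fit; the method-of-moments estimator, the function names, and the sample hold times are assumptions made for illustration and are not taken from the paper.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(samples):
    """Estimate Gumbel (location, scale) by the method of moments.

    Assumption: a simple moment-matching estimator, not the authors' procedure.
    """
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)
    scale = std * math.sqrt(6) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def sample_gumbel(location, scale, n, rng):
    """Draw n values by inverting the CDF F(x) = exp(-exp(-(x - location) / scale))."""
    values = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # avoid log(0); rng.random() returns values in [0, 1)
            u = rng.random()
        values.append(location - scale * math.log(-math.log(u)))
    return values


# Hypothetical hold times in milliseconds for one user and one key.
observed = [92.0, 101.5, 88.3, 110.2, 95.7, 130.4, 99.1, 104.8, 90.6, 118.9]

location, scale = fit_gumbel_moments(observed)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
synthetic = sample_gumbel(location, scale, 1000, rng)
```

A real generator would fit one model per user and per timing feature, and would need a goodness-of-fit check before trusting the synthetic samples.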

“…To our best knowledge, this is the first systematic attempt to compare several distributions for fitting keystroke dynamics timing profiles when the text is not short and fixed, as in a password or a passphrase. Attempting to overcome the limitations in existing datasets, Migdal and Rosenberger [11,12] have carried out a detailed comparison of almost twenty candidate distributions for the generation of synthetic datasets using statistical models; the Gumbel distribution provided the best overall fit. Our approach differs in the target tasks that were considered and the evaluation criteria; while theirs, using the GREYC dataset [25], represents short fixed texts like usernames and passwords that the user has typed repeatedly, ours is focused on free text composition and transcription tasks.…”

Section: Previous Studies (mentioning)

confidence: 99%

“…Going beyond authentication, [28] and [29] employ the sigma-lognormal model of rapid human movements to detect the age group of users based on their interaction with a touch screen, while [30] leverages different distributions to discriminate a human user from a bot. No other systematic comparison of distributions for the task of fitting keystroke timings histograms was found other than the aforementioned [21], [22], and [11].…”

Section: Previous Studies (mentioning)

confidence: 99%

“…No claim is made about the shape of timing distributions generated by other types of writing tasks; in particular, password typing and short fixed texts were not considered. The reader interested in these cases is referred to [11,12].…”

Section: Limitations Of This Study (mentioning)

confidence: 99%

“…Considering that most distance metrics and classification methods are sensitive to discrepancies between the assumed model and the empirical data, it is puzzling that a systematic study of histogram shapes was not an early step in the discipline. Not long ago a systematic comparison of a large number of candidates has been carried out [11,12] but, unfortunately, it is restricted to fixed text.…”

Section: Introduction (mentioning)

confidence: 99%

On the shape of timings distributions in free-text keystroke dynamics profiles

González

Calot

Ierache

et al. 2021

Heliyon

Keystroke dynamics is a soft biometric trait. Although the shape of the timing distributions in keystroke dynamics profiles is a central element for the accurate modeling of the behavioral patterns of the user, a simplified approach has been to presuppose normality. Careful consideration of the individual shapes for the timing models could lead to improvements in the error rates of current methods or possibly inspire new ones. The main objective of this study is to compare several heavy-tailed and positively skewed candidate distributions in order to rank them according to their merit for fitting timing histograms in keystroke dynamics profiles. Results are summarized in three ways: counting how many times each candidate distribution provides the best fit and ranking them in order of success, measuring average information content, and ranking candidate distributions according to the frequency of hypothesis rejection with an Anderson-Darling goodness-of-fit test. Seven distributions with two parameters and seven with three were evaluated against three publicly available free-text keystroke dynamics datasets. The results confirm the established use in the research community of the log-normal distribution, in its two- and three-parameter variations, as an excellent choice for modeling the shape of timing histograms in keystroke dynamics profiles. However, the log-logistic distribution emerges as a clear winner among all two- and three-parameter candidates, consistently surpassing the log-normal and all the rest under the three evaluation criteria for both hold and flight times.

Deep features fusion for user authentication based on human activity

Piugie

Charrier

Manno

et al. 2023

IET Biometrics

The exponential growth in the use of smartphones means that users must constantly be concerned about the security and privacy of mobile data because the loss of a mobile device could compromise personal information. To address this issue, continuous authentication systems have been proposed, in which users are monitored transparently after initial access to the smartphone. In this study, the authors address the problem of user authentication by considering human activities as behavioural biometric information. The authors convert the behavioural biometric data (considered as time series) into a 2D colour image. This transformation process keeps all the characteristics of the behavioural signal. The time series is not filtered during this transformation, and the method is reversible. This signal-to-image transformation allows the authors to use 2D convolutional networks to build efficient deep feature vectors, which they compare to the reference template vectors to compute the performance metric. The authors evaluate the performance of the authentication system in terms of Equal Error Rate on a benchmark University of California, Irvine Human Activity Recognition dataset, and they show the efficiency of the approach.
