Miguel Núñez-del-Prado scite author profile

Due to the emergence of geolocated applications, more and more mobility traces are generated on a daily basis and collected in the form of geolocated datasets. If an unauthorized entity can access this data, it can use it to infer personal information about the individuals whose movements are contained within these datasets, such as learning their home and place of work or even their social network, thus causing a privacy breach. In order to protect the privacy of individuals, a sanitization process, which adds uncertainty to the data and removes some sensitive information, has to be performed. The global objective of GEPETO (for GEoPrivacy Enhancing TOolkit) is to provide researchers concerned with geo-privacy with means to evaluate various sanitization techniques and inference attacks on geolocated data. In this paper, we report on our preliminary experiments with GEPETO for comparing different clustering algorithms and heuristics that can be used as inference attacks, and evaluate their efficiency for the identification of points of interest, as well as their resilience to sanitization mechanisms such as sampling and perturbation.
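The two sanitization mechanisms named in this abstract, sampling and perturbation, can be illustrated with a minimal sketch. This is not GEPETO's implementation: the trace layout `(timestamp, latitude, longitude)`, the function names `downsample` and `perturb`, and the choice of Gaussian noise are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Tuple

# Assumed trace layout: (timestamp in seconds, latitude, longitude).
Trace = Tuple[float, float, float]


def downsample(traces: List[Trace], window_s: float) -> List[Trace]:
    """Sampling: keep only the first trace of each time window of window_s seconds."""
    kept: List[Trace] = []
    current_window = None
    for t, lat, lon in sorted(traces):
        window = int(t // window_s)
        if window != current_window:
            kept.append((t, lat, lon))
            current_window = window
    return kept


def perturb(traces: List[Trace], sigma_deg: float,
            seed: Optional[int] = None) -> List[Trace]:
    """Perturbation: add Gaussian noise (an assumed noise model) to each coordinate."""
    rng = random.Random(seed)
    return [
        (t, lat + rng.gauss(0.0, sigma_deg), lon + rng.gauss(0.0, sigma_deg))
        for t, lat, lon in traces
    ]


# Hypothetical traces: four points recorded over about two minutes.
traces = [(0, 48.10, -1.60), (30, 48.11, -1.61), (70, 48.12, -1.62), (130, 48.13, -1.63)]
sampled = downsample(traces, window_s=60)          # one trace per minute
noisy = perturb(sampled, sigma_deg=0.001, seed=1)  # same timestamps, blurred positions
```

Both functions trade utility for privacy: a larger `window_s` or `sigma_deg` hides more, but also degrades the data an analyst can use.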


De-anonymization attack on geolocated data

Gambs

Killijian

Núñez-del-Prado

2014

Journal of Computer and System Sciences

134

109


With the advent of GPS-equipped devices, a massive amount of location data is being collected, raising the issue of the privacy risks incurred by the individuals whose movements are recorded. In this work, we focus on a specific inference attack called the de-anonymization attack, by which an adversary tries to infer the identity of a particular individual behind a set of mobility traces. More specifically, we propose an implementation of this attack based on a mobility model called Mobility Markov Chain (MMC). An MMC is built from the mobility traces observed during the training phase and is used to perform the attack during the testing phase. We design two distance metrics quantifying the closeness between two MMCs and combine these distances to build de-anonymizers that can re-identify users in an anonymized geolocated dataset. Experiments conducted on real datasets demonstrate that the attack is both accurate and resilient to sanitization mechanisms such as downsampling.
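The attack outlined in this abstract can be sketched roughly as follows, under strong simplifying assumptions. The states of the chain are taken to be labels of visited places, `transition_distance` is a hypothetical stand-in (the sum of absolute differences between transition probabilities) and not one of the two metrics designed in the paper, and all function names and example data are invented for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Sequence

# A chain maps each state to a probability distribution over next states.
MMC = Dict[Hashable, Dict[Hashable, float]]


def build_mmc(visits: Sequence[Hashable]) -> MMC:
    """Estimate transition probabilities from an ordered sequence of visited places."""
    counts: Dict[Hashable, Counter] = defaultdict(Counter)
    for src, dst in zip(visits, visits[1:]):
        counts[src][dst] += 1
    return {
        src: {dst: n / sum(row.values()) for dst, n in row.items()}
        for src, row in counts.items()
    }


def transition_distance(a: MMC, b: MMC) -> float:
    """Hypothetical metric: sum of absolute differences over all transitions."""
    states = set(a) | set(b)
    for chain in (a, b):
        for row in chain.values():
            states |= set(row)
    return sum(
        abs(a.get(s, {}).get(d, 0.0) - b.get(s, {}).get(d, 0.0))
        for s in states
        for d in states
    )


def deanonymize(unknown: MMC, known: Dict[str, MMC]) -> str:
    """Return the known user whose training chain is closest to the unknown chain."""
    return min(known, key=lambda user: transition_distance(unknown, known[user]))


# Training phase: chains built from traces with known identities (made-up data).
known = {
    "alice": build_mmc(["home", "work", "home", "work", "gym", "home"]),
    "bob": build_mmc(["home", "bar", "home", "bar", "home", "shop"]),
}
# Testing phase: an anonymized trace is matched against the known chains.
guess = deanonymize(build_mmc(["home", "work", "home", "work", "home"]), known)
```

In this toy example the anonymized trace is closest to the chain learned for `alice`, since both alternate between `home` and `work`.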


De-anonymization Attack on Geolocated Data

Gambs

Killijian

Núñez-del-Prado

2013


GEPETO: A GEoPrivacy-Enhancing TOolkit

Gambs

Killijian

Núñez-del-Prado

2010


A geolocalised system generally belongs to an individual, and as such knowing its location reveals the location of its owner, which is a direct threat to the owner's privacy. To protect the privacy of users, a sanitization process, which adds uncertainty to the data and removes some sensitive information, can be performed, but at the cost of a decrease in utility due to the degraded quality of the data. In this paper, we introduce GEPETO (for GEoPrivacy-Enhancing TOolkit), a flexible open source software which can be used to visualize, sanitize, perform inference attacks on and measure the utility of a particular geolocalised dataset. The main objective of GEPETO is to enable a user to design, tune, experiment with and evaluate various sanitization algorithms and inference attacks, as well as to visualize the results and evaluate the resulting trade-off between privacy and utility.


Impact of natural disasters on consumer behavior: Case of the 2017 El Niño phenomenon in Peru

et al. 2021


El Niño is an extreme weather event featuring unusual warming of surface waters in the eastern equatorial Pacific Ocean. This phenomenon is characterized by heavy rains and floods that negatively affect the economic activities of the impacted areas. Understanding how this phenomenon influences consumption behavior at different granularity levels is essential for recommending strategies to normalize the situation. With this aim, we performed a multi-scale analysis of data associated with bank transactions involving credit and debit cards. Our findings can be summarized into two main results: Coarse-grained analysis reveals the presence of the El Niño phenomenon and the recovery time in a given territory, while fine-grained analysis demonstrates a change in individuals’ purchasing patterns and in merchant relevance as a consequence of the climatic event. The results also indicate that society successfully withstood the natural disaster owing to the economic structure built over time. In this study, we present a new method that may be useful for better characterizing future extreme events.


Towards the adaptation of SDC methods to stream mining

Rodríguez

Nin

Núñez-del-Prado

2017

Computers & Security


Survey of Text Mining Techniques Applied to Judicial Decisions Prediction

2022


This paper reviews the most recent literature on experiments with different Machine Learning, Deep Learning and Natural Language Processing techniques applied to predict judicial and administrative decisions. Among the most notable findings, the most used data mining techniques are Support Vector Machine (SVM), K Nearest Neighbours (K-NN) and Random Forest (RF), while the most used deep learning techniques are Long Short-Term Memory (LSTM) and transformers such as BERT. An important finding in the papers reviewed was that the use of machine learning techniques has prevailed over that of deep learning. Regarding the place of origin of the research, we found that 64% of the works belong to studies carried out in English-speaking countries, 8% in Portuguese-speaking countries and 28% in countries using other languages (such as German, Chinese, Turkish, Spanish, etc.). Very few works of this type have been carried out in Spanish-speaking countries. The works have been classified, on the one hand, by the classifiers used to predict situations (or events with legal interference) or judicial decisions and, on the other hand, by the application of classifiers to the phenomena regulated by the different branches of law: criminal, constitutional, human rights, administrative, intellectual property, family law, tax law and others. The corpus size analyzed in the reviewed works reached 100,000 documents in 2020. Finally, another important finding concerns the accuracy of these predictive techniques, which exceeds 60% in different branches of law.

