Fernando Reis scite author profile

Assessing the Quality of Home Detection from Mobile Phone Data for Official Statistics

Mobile phone data are an interesting new data source for official statistics. However, multiple problems and uncertainties need to be solved before these data can inform, support or even become an integral part of statistical production processes. In this paper, we focus on arguably the most important problem hindering the application of mobile phone data in official statistics: detecting home locations. We argue that current efforts to detect home locations suffer from a blind deployment of criteria to define a place of residence and from limited validation possibilities. We support our argument by analysing the performance of five home detection algorithms (HDAs) that have been applied to a large, French, Call Detailed Record (CDR) dataset (~18 million users, 5 months). Our results show that criteria choice in HDAs influences the detection of home locations for up to about 40% of users, that HDAs perform poorly when compared with a validation dataset (the 35°-gap), and that their performance is sensitive to the time period and the duration of observation. Based on our findings and experiences, we offer several recommendations for official statistics. If adopted, our recommendations would help in ensuring a more reliable use of mobile phone data vis-à-vis official statistics.

Measuring output quality for multisource statistics in official statistics: Some directions

Agafiţei, Gras, Kloek, et al. 2015, SJI

Abstract. Many statistical offices have been moving towards an increased use of administrative data sources for statistical purposes, both as a substitute and as a complement to survey data. Moreover, the emergence of big data constitutes a further increase in available sources. As a result, statistical output in official statistics is increasingly based on complex combinations of sources. The quality of such statistics depends on the quality of the primary sources and on the ways they are combined. This paper analyses the appropriateness of the current set of output quality measures for multiple source statistics, explains the need for improvement and outlines directions for further work. The usual approach for measuring the quality of the statistical output is to assess quality through the measurement of the input and process quality. The paper argues that in a multisource production environment this approach is not sufficient. It advocates measuring quality on the basis of the output itself, without analysing the details of the inputs and the production process, and proposes directions for further development.

On a Modular Approach to the Design of Integrated Social Surveys

Ioannidis, Merkouris, Zhang, et al. 2016

This article considers a modular approach to the design of integrated social surveys. The approach consists of grouping variables into 'modules', each of which is then allocated to one or more 'instruments'. Each instrument is then administered to a random sample of population units, and each sample unit responds to all modules of the instrument. This approach offers a way of designing a system of integrated social surveys that balances the need to limit the cost and the need to obtain sufficient information. The allocation of the modules to instruments draws on the methodology of split questionnaire designs. The composition of the instruments, that is, how the modules are allocated to instruments, and the corresponding sample sizes are obtained as a solution to an optimisation problem. This optimisation involves minimisation of respondent burden and data collection cost, while respecting certain design constraints usually encountered in practice. These constraints may include, for example, the level of precision required and dependencies between the variables. We propose using a random search algorithm to find approximate optimal solutions to this problem. The algorithm is proved to fulfil conditions that ensure convergence to the global optimum and can also produce an efficient design for a split questionnaire.

A toolbox for a modular design and pooled analysis of sample survey programmes

Karlberg, Reis, Calizzani, et al. 2015, SJI

Improving Time Use Measurement with Personal Big Data Collection – The Experience of the European Big Data Hackathon 2019

Zeni, Bison, Reis, et al. 2021

This article assesses the experience with i-Log at the European Big Data Hackathon 2019, a satellite event of the New Techniques and Technologies for Statistics (NTTS) conference, organised by Eurostat. i-Log is a system that enables capturing personal big data from smartphones’ internal sensors to be used for time use measurement. It allows the collection of heterogeneous types of data, enabling new possibilities for sociological urban field studies. Sensor data such as those related to the location or the movements of the user can be used to investigate and gain insights into the time diaries’ answers and assess their overall quality. The key idea is that the users’ answers are used to train machine-learning algorithms, allowing the system to learn from the user’s habits and to generate new time diaries’ answers. In turn, these new labels can be used to assess the quality of existing ones, or to fill the gaps when the user does not provide an answer. The aim of this paper is to introduce the pilot study, the i-Log system and the methodological evidence that emerged during the survey.

Assessing the Quality of Home Detection from Mobile Phone Data for Official Statistics
