As national statistical offices (NSOs) modernize, interest in integrating machine learning (ML) into the official statistician’s toolbox is growing. Two challenges to such an integration are the potential loss of transparency from using “black boxes” and the need to develop a quality framework. In 2019, the High-Level Group for the Modernisation of Official Statistics (HLG-MOS) launched a project on machine learning, one of whose objectives was to address these two challenges. One of the outputs of the HLG-MOS project is a Quality Framework for Statistical Algorithms (QF4SA). While many quality frameworks exist, they were conceived with traditional methods in mind, and they tend to target statistical outputs. Currently, machine learning methods are being considered for use in processes that produce intermediate outputs, which in turn lead to a final statistical output. Therefore, the QF4SA does not replace existing quality frameworks; it complements them. As the QF4SA targets intermediate outputs and not necessarily the final statistical output, it should be used in conjunction with existing quality frameworks to ensure that high-quality outputs are produced. This paper presents the QF4SA, as well as some recommendations for NSOs considering the use of machine learning in the production of official statistics.
Most National Statistical Institutes are progressively moving from traditional production models to new strategies based on the combined use of different sources of information, which can be both primary and secondary. In this article, we propose a framework for assessing the quality of multisource processes, such as statistical registers. The final aim is to develop a tool that supports decisions about process design and monitoring, and that provides quality measures for the whole production process. The starting point is the adaptation of the life-cycle paradigm, which results in a three-phase framework described in recent literature. An evolution of this model is proposed, focusing on the first two phases of the life cycle, to better represent the source integration/combination phase, which can vary according to the features of different types of processes. The proposed enhancement would improve the existing quality framework to support the evaluation of different multisource processes. An application of the proposed framework to two Istat (Italian national statistical institute) registers in the economic area, taken as case studies, is presented. These experiences show the potential of such a tool in supporting National Statistical Institutes in assessing multisource statistical production processes.
The Frame SBS is a statistical register that has been developed at the Italian National Statistical Institute to support the annual estimation of structural business statistics (SBS). Currently, a number of core SBS are estimated by combining microdata directly supplied by different administrative sources. In this context, more accurate estimates for those SBS that are not covered by administrative sources can be obtained through small area estimation (SAE). In this article, we illustrate an application of SAE methods within the Frame SBS register in order to assess the potential gains in quality and reliability of the target variables. Different types of auxiliary information and approaches are compared in order to identify the optimal estimation strategy in terms of the precision of the estimates.
In recent years, the Italian national institute of statistics (Istat), together with most National Statistical Institutes, has been progressively moving from traditional production models based on primary sources of information (direct surveys) to new production strategies based on the combined use of different primary and secondary sources of information. As a result, new multisource statistical processes have been built that guarantee a major improvement in both the amount and the quality of information about several phenomena of public interest. In this context, the Total Process Error (TPE) framework has recently been proposed in the literature for assessing the quality of multisource processes. The TPE framework represents an evolution of Zhang’s two-phase life-cycle approach, and it additionally includes an operational tool to connect the steps of the multisource production process to the phases of the quality evaluation framework. The TPE framework can be used both to support the design of a multisource process and to monitor an entire production process, in order to provide key elements for assessing the quality of both the processes and their statistical outputs. In the present work, we describe, as a case study in the new context of Istat’s production of official statistics, the use of the TPE framework to support the process design of the Register for Public Administrations.