Summary: A three-level M-quantile model for small area estimation is proposed. The methodology is an efficient alternative to prediction with a three-level linear mixed model in the presence of outliers, and it is based on an extension of M-quantile regression. A modification of the traditional (two-level) M-quantile approach to poverty estimation is also proposed. In addition, a bootstrap-based estimator of the mean-squared prediction error is described. The proposed methodology, as well as the three-level empirical best predictor, is applied to Polish European Union Survey on Income and Living Conditions and census data to estimate poverty at the local administrative unit 1 level in Poland, i.e. a level for which the Central Statistical Office of Poland has not published any official estimates to date.
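As background for the abstract above, the following is a minimal sketch of ordinary (single-level) M-quantile regression fitted by iteratively reweighted least squares with a Huber influence function. It is not the three-level model the paper proposes. The function name `mquantile_fit`, the tuning constant `c=1.345`, and the MAD-based scale estimate are assumptions chosen for illustration.

```python
import numpy as np


def mquantile_fit(X, y, q=0.5, c=1.345, n_iter=100, tol=1e-8):
    """Fit a linear M-quantile regression of order q by IRLS (illustrative sketch).

    Uses the Huber influence function with tuning constant c and a
    MAD-based residual scale; both choices are assumptions for this example.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Start from the ordinary least squares fit.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust scale: median absolute deviation, rescaled for normal errors.
        s = np.median(np.abs(r - np.median(r))) / 0.6745
        s = max(s, 1e-12)  # guard against a zero scale
        abs_u = np.abs(r / s)
        # Huber weights psi(u)/u: 1 inside [-c, c], c/|u| outside.
        w = np.where(abs_u <= c, 1.0, c / np.maximum(abs_u, 1e-12))
        # Asymmetric tilt: 2q for positive residuals, 2(1-q) otherwise.
        w = w * 2.0 * np.where(r > 0, q, 1.0 - q)
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta
```

With `q=0.5` this reduces to a Huber-type robust regression; values of `q` above or below 0.5 shift the fitted line towards the upper or lower part of the conditional distribution.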
The European Survey on Income and Living Conditions (EU-SILC) is the basic source of information published by the Central Statistical Office of Poland (CSO) about the relative poverty indicator, both for the country as a whole and at the regional level (NUTS 1). Estimates at lower levels of the territorial division than regions (NUTS 1) or provinces (NUTS 2, also called 'voivodships') have not been published so far. Such estimates can be calculated by means of indirect estimation methods, which rely on information from outside the subpopulation of interest and thereby usually increase estimation precision. The main aim of this paper is to present estimates of the poverty indicator at a lower level of spatial aggregation than the one used so far, namely at the level of subregions in Poland (NUTS 3), using small area estimation (SAE) methodology, i.e. a model-based technique: the EBLUP estimator based on the Fay-Herriot model. By optimally choosing covariates derived from sources unaffected by random errors, results with adequate precision can be obtained. A territorial analysis of the scope of poverty in Poland at the NUTS 3 level is also presented in detail. The article extends the approach presented by Wawrowski (2014).
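To illustrate how a Fay-Herriot predictor combines a direct estimate with a regression-synthetic one, here is a minimal sketch. It assumes the model variance `sigma2_u` is already known, whereas a true EBLUP would estimate it from the data (for example by REML). The function name and the argument names are hypothetical and do not come from the paper.

```python
import numpy as np


def fay_herriot_blup(y, X, psi, sigma2_u):
    """Area-level BLUP under the Fay-Herriot model (illustrative sketch).

    y        : direct estimates, one per area
    X        : area-level covariates (include a column of ones for an intercept)
    psi      : known sampling variances of the direct estimates
    sigma2_u : model (random effect) variance, treated here as known
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    psi = np.asarray(psi, dtype=float)
    # Weighted least squares for the regression coefficients.
    w = 1.0 / (sigma2_u + psi)
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    # Shrinkage factor: close to 1 when the direct estimate is precise.
    gamma = sigma2_u / (sigma2_u + psi)
    estimate = gamma * y + (1.0 - gamma) * (X @ beta)
    return estimate, beta, gamma
```

Areas with a large sampling variance `psi` receive a small `gamma`, so their estimates lean more heavily on the regression part, which is how information from outside the area enters the estimate.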
Abstract (translated from Polish): The article describes the current state of the use of so-called big data in official statistics. It presents the experience of selected national statistical offices in the practical application of data from mobile network operators, traffic sensors, social media and transactional data for the purposes of official statistics. It also identifies the opportunities, challenges and threats that statistical offices face in using this type of information in official statistics. Keywords: big data, official statistics, internet data sources.
Summary: The main purpose of the article is to describe the state of the art in using big data in official statistics. The article presents selected examples of how data from mobile operators, sensors, social media or scanners are used by national statistical offices. The authors also identify opportunities, challenges and risks related to the use of big data in the field of official statistics.
Market basket analysis, a method of discovering co-occurrence relationships, is widely used in marketing research and e-commerce, mainly by supermarkets and online stores. Moving beyond the traditional notion of a market basket understood as a fixed list of products, the technique can be applied for data mining in other fields of research which do not involve traditional transactions and purchases made by customers. This article describes theoretical aspects of market basket analysis with an illustrative application, with respect to marital status, based on data from the National Census of Population and Housing 2011. This is the first application of market basket analysis to census data to be conducted in Poland, in which attributes of the market basket have been replaced with respondents' demographic characteristics. This approach makes it possible to identify relationships between legal (de jure) marital status and actual (de facto) marital status, taking into account other basic socio-demographic variables available in large datasets. Choropleth maps classified by province, generated in R as a way of visualizing association rules, made it possible to conduct a spatial analysis of the phenomenon of interest.
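The idea of treating respondents' characteristics as basket items can be sketched as follows. Each record is a set of `attribute=value` strings, and a rule is scored by the usual support, confidence and lift measures. The function name, the attribute labels and the toy records are hypothetical and are not taken from the census data.

```python
def rule_metrics(records, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    records    : iterable of collections of 'attribute=value' items
    antecedent : items on the left-hand side of the rule
    consequent : items on the right-hand side of the rule
    """
    records = [set(r) for r in records]
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(records)
    n_a = sum(1 for r in records if antecedent <= r)
    n_c = sum(1 for r in records if consequent <= r)
    n_ac = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_ac / n
    confidence = n_ac / n_a if n_a else float("nan")
    lift = confidence / (n_c / n) if n_c else float("nan")
    return support, confidence, lift


# Hypothetical toy records, for illustration only.
toy_records = [
    {"legal=married", "actual=married"},
    {"legal=married", "actual=married"},
    {"legal=single", "actual=cohabiting"},
    {"legal=married", "actual=separated"},
]
toy_result = rule_metrics(toy_records, {"legal=married"}, {"actual=married"})
```

A lift above 1 indicates that the consequent occurs more often among records matching the antecedent than in the data as a whole.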
The aim of the research is to present the application of the calibration approach in the tracer study of Poznań University of Economics and Business graduates, conducted within the "Staff for Economy" project in partnership with the Statistical Office in Poznań. The obligation to carry out tracer studies was imposed on universities by the legal acts concerning the monitoring of graduates' professional careers. The most important problem in the research was nonresponse among the monitored graduates, which affected the results by introducing non-random error. Calibration, which is applied in representative surveys to correct the weights resulting from the sampling scheme, showed how, with appropriately chosen auxiliary variables, the negative impact of nonresponse can be reduced in a full tracer study of graduates. The article presents in detail the scope of the research and the theoretical basis of calibration. It also describes the use of auxiliary variables in the creation of calibrated weights, which could then be included in the tabulation process and the graphical presentation of results. Additionally, the article raises the issue of assessing the precision of estimation results.
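As a rough illustration of how calibrated weights are built from auxiliary variables, the sketch below uses linear calibration (the chi-square distance case), which has a closed-form solution. The abstract does not say which distance function or auxiliary variables the study used, so the function name, the inputs and the choice of linear calibration are assumptions.

```python
import numpy as np


def linear_calibration(d, X, totals):
    """Linear (chi-square distance) calibration of weights (illustrative sketch).

    d      : initial weights of the respondents
    X      : auxiliary variables for the respondents, one row per respondent
    totals : known population totals of the auxiliary variables
    Returns weights w with X.T @ w equal to totals.
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    totals = np.asarray(totals, dtype=float)
    # Solve for the multipliers that close the gap to the known totals.
    M = X.T @ (d[:, None] * X)
    lam = np.linalg.solve(M, totals - X.T @ d)
    return d * (1.0 + X @ lam)
```

Linear calibration can produce negative or very large weights when respondents differ strongly from the population, which is one reason bounded distance functions are often preferred in practice.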