Outlier detection is a challenging task, especially when outliers are defined by rare combinations of multiple variables. In this paper, we develop and evaluate a new method for detecting outliers in multivariate data that relies on Principal Component Analysis (PCA) and three-sigma limits. The proposed approach uses PCA to perform dimension reduction by regenerating variables, i.e., fitted points computed from the original observations. Observations lying outside the three-sigma limits are identified as outliers. The method has been applied to two real-life datasets and several artificially generated datasets. Its performance is compared with that of some existing methods using several evaluation criteria: the percentage of correct classification, precision, recall, and F-measure. The proposed method outperforms the existing methods on these criteria and datasets. For the first real-life dataset, the F-measure is highest for the proposed method at 0.6667, against 0.3333 and 0.4000 for the two existing approaches. Similarly, for the second real dataset, the F-measure is 0.8000 for the proposed approach and 0.5263 and 0.6315 for the two existing approaches. The simulation experiments also show that the performance of the proposed approach improves with increasing sample size.
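The general idea of combining PCA with three-sigma limits can be sketched as follows. This is a minimal two-dimensional illustration, not the authors' procedure: the abstract does not say which quantity the limits are applied to, so the sketch assumes they are applied to the signed residual between each observation and its fitted point on the first principal component. The function name `pca_three_sigma_outliers` and the sample data are hypothetical.

```python
import math
import statistics


def pca_three_sigma_outliers(points):
    """Return indices of 2-D points whose residual from the first
    principal component falls outside mean +/- 3 standard deviations."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 covariance matrix.
    sxx = sum(cx * cx for cx, _ in centered) / n
    syy = sum(cy * cy for _, cy in centered) / n
    sxy = sum(cx * cy for cx, cy in centered) / n

    # Closed-form direction of the first principal component in 2-D.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)

    # Signed orthogonal distance of each point from its fitted point
    # (its projection onto the first principal component).
    residuals = [cy * ux - cx * uy for cx, cy in centered]

    # Assumed rule: flag residuals outside the three-sigma limits.
    center = statistics.fmean(residuals)
    sigma = statistics.pstdev(residuals)
    lower, upper = center - 3.0 * sigma, center + 3.0 * sigma
    return [i for i, r in enumerate(residuals) if r < lower or r > upper]


# Hypothetical data: 30 points near the line y = 2x plus one planted outlier.
points = [(float(i), 2.0 * i + 0.1 * ((i % 3) - 1)) for i in range(30)]
points.append((15.0, 60.0))
flagged = pca_three_sigma_outliers(points)
print(flagged)
```

With more than two variables, the same idea would need a general eigendecomposition and a choice of how many components to retain, neither of which the abstract specifies.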
The rank-based method is a popular robust estimation technique for linear models and an alternative to ordinary least squares (OLS) and restricted maximum likelihood (REML) in the presence of extreme observations. The method is applied in machine reliability analysis and quantum engineering, especially in artificial intelligence and optimization problems where outliers are commonly observed. The technique has also been extended to the multilevel model, where the shape of the error distribution plays a significant role in achieving more efficient estimation. In this study, we propose the Weibull score function for Weibull-distributed error terms in the multilevel model. The efficiency of the proposed score function is compared with that of the existing Wilcoxon score function and the traditional REML method via Monte Carlo simulations after adding simulated extreme observations. For small values of the shape parameter of the Weibull error distribution, which indicate the presence of outliers, the Weibull score function was more efficient than the Wilcoxon and REML methods. For large values of the shape parameter, however, the Wilcoxon score appeared at least as efficient as the Weibull score function. REML was the least precise in all situations. These findings are verified through a real application to test score data with a small value of the shape parameter, in which the Weibull score function turned out to be the most efficient.
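To make the contrast between rank-based and least-squares estimation concrete, the sketch below fits a simple (single-level) linear model with Wilcoxon scores by minimizing Jaeckel's dispersion function, and compares the slope with OLS when one extreme observation is present. It is an illustration under stated assumptions only: it does not implement the multilevel model or the proposed Weibull score function, whose form the abstract does not give. The data and function names are hypothetical.

```python
import math
import statistics
from itertools import combinations


def wilcoxon_score(u):
    """Wilcoxon score function: phi(u) = sqrt(12) * (u - 1/2)."""
    return math.sqrt(12.0) * (u - 0.5)


def jaeckel_dispersion(slope, xs, ys):
    """Jaeckel's dispersion: sum of score(rank of residual) * residual."""
    residuals = sorted(y - slope * x for x, y in zip(xs, ys))
    n = len(residuals)
    return sum(wilcoxon_score(rank / (n + 1)) * e
               for rank, e in enumerate(residuals, start=1))


def rank_based_fit(xs, ys):
    """Rank-based slope and intercept for a simple linear model."""
    # The dispersion is convex and piecewise linear in the slope, with
    # breakpoints at the pairwise slopes, so a minimum lies at one of them.
    candidates = [(ys[j] - ys[i]) / (xs[j] - xs[i])
                  for i, j in combinations(range(len(xs)), 2)
                  if xs[i] != xs[j]]
    slope = min(candidates, key=lambda b: jaeckel_dispersion(b, xs, ys))
    # The dispersion does not identify the intercept; use the residual median.
    intercept = statistics.median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


def ols_fit(xs, ys):
    """Ordinary least squares slope and intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: y = 1 + 2x plus small noise, last value made extreme.
xs = [float(i) for i in range(10)]
noise = [0.05, -0.08, 0.02, 0.07, -0.03, -0.06, 0.04, 0.01, -0.02, 0.0]
ys = [1.0 + 2.0 * x + e for x, e in zip(xs, noise)]
ys[-1] = 100.0

rank_slope, rank_intercept = rank_based_fit(xs, ys)
ols_slope, ols_intercept = ols_fit(xs, ys)
print("rank-based slope:", rank_slope)
print("OLS slope:", ols_slope)
```

In this toy setting the rank-based slope stays close to the underlying value of 2, while the OLS slope is pulled strongly toward the extreme observation.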
Aim: To determine the factors that influence stunting among children under five years of age in Pakistan. Methods: This study was conducted during 2020-2021 using data from the Pakistan Demographic and Health Survey (PDHS) 2017-2018. The response variable comprised two categories: stunted and not stunted. The demographic and socioeconomic factors considered were region, birthplace, preceding birth interval, women's education level, husband's/partner's education level, women's age, breastfeeding, size of child at birth, total children ever born, type of place of residence, frequency of listening to the radio, sources of drinking water, and antenatal visits. A binary logistic regression model was applied to assess the relationship between stunting and these potential demographic and socioeconomic factors. Results: The binary logistic regression model identified the following significant factors for child stunting: region (Punjab: OR = 0.311, CI: 0.104, 0.934; KPK: OR = 0.278, CI: 0.091, 0.853), mother's education (secondary: OR = 2.671, CI: 1.025, 6.959), father's education (secondary: OR = 0.370, CI: 0.146, 0.938), breastfeeding (1 year: OR = 0.197, CI: 0.056, 0.689), and child size at birth (larger than average: OR = 0.113, CI: 0.020, 0.646; average: OR = 0.212, CI: 0.047, 0.962). Practical implication: Identifying the determinants of stunting can lead to improved health outcomes for children, including reduced mortality rates, better cognitive development, and improved physical growth. Conclusion: This study found that stunting in Pakistan can be reduced by improving the education level of parents, proper breastfeeding, and a proper diet during pregnancy. Keywords: Stunting, Binary logistic model, Children, Parents' education
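The reported odds ratios and intervals relate to logistic regression coefficients through OR = exp(β), with interval bounds exp(β ± z·SE). The sketch below illustrates that relationship using the Punjab figures quoted above. It assumes a 95% Wald interval (z = 1.96), which the abstract does not state, and the helper names are hypothetical.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic coefficient and its standard error to
    an odds ratio with a Wald confidence interval."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


def se_from_ci(lower, upper, z=1.96):
    """Back out the standard error of the coefficient from the
    interval bounds of an odds ratio (assumes a Wald interval)."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Figures quoted for Punjab: OR = 0.311, CI (0.104, 0.934).
beta = math.log(0.311)          # coefficient on the log-odds scale
se = se_from_ci(0.104, 0.934)   # implied standard error, about 0.56
odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(beta, se)
print(round(odds_ratio, 3), round(ci_lower, 3), round(ci_upper, 3))
```

The recomputed bounds agree with the quoted interval to within rounding, which is consistent with the Wald assumption. An odds ratio below 1 with an interval that excludes 1 indicates lower odds of stunting relative to the reference category.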
The purpose of this article is to examine household saving behavior in urban and rural areas of Pakistan. The study uses microdata from the Household Integrated Economic Survey (HIES) 2018-19 and the Pakistan Social and Living Standards Measurement Survey (PSLM) 2018-19, both conducted by the Pakistan Bureau of Statistics (PBS). A nationally representative sample of 5499 households, 3155 from rural areas and 2344 from urban areas, is selected using systematic sampling. The impact of socioeconomic and demographic characteristics on household saving behavior is investigated by fitting a multiple linear regression model with the Ordinary Least Squares (OLS) estimation method. A strong relationship between household saving behavior and socioeconomic and demographic characteristics is observed. Income has a positive impact on household savings: as household income rises, savings rise as well. Age, dependency ratio, and family size have a negative effect. Rural households tend to save more than urban households, while saving rates decline for households with large families. The government should introduce new saving schemes in banks and redirect non-development expenditures toward productive plans. Such measures would motivate domestic saving and increase employment opportunities.
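A multiple linear regression of this kind can be sketched by solving the OLS normal equations directly. The example below is a minimal illustration only: the predictors (income and family size), their units, and all numbers are hypothetical and are not taken from the HIES or PSLM data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ols_coefficients(rows, y):
    """OLS estimates [intercept, b1, b2, ...] from (X'X) b = X'y."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical households: (income, family size), arbitrary units.
households = [(30.0, 3.0), (45.0, 5.0), (60.0, 4.0),
              (25.0, 6.0), (80.0, 2.0), (55.0, 7.0)]
# Hypothetical savings generated as 5 + 0.2 * income - 1.5 * family size.
savings = [5.0 + 0.2 * inc - 1.5 * size for inc, size in households]

coefficients = ols_coefficients(households, savings)
print([round(c, 4) for c in coefficients])
```

Because the hypothetical savings are an exact linear function of the predictors, the fit recovers a positive income coefficient and a negative family-size coefficient, matching the signs the abstract reports. A real analysis would also report standard errors and use the survey's sampling weights.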