Objectives To test the application of statistical methods to detect data fabrication in a clinical trial. Setting Data from two clinical trials: a trial of a dietary intervention for cardiovascular disease and a trial of a drug intervention for the same problem. Outcome measures Baseline comparisons of means and variances of cardiovascular risk factors; digit preference overall and its pattern by group. Results In the dietary intervention trial, variances for 16 of the 22 variables available at baseline were significantly different, and 10 significant differences were seen in means for these variables. Some of these P values were extraordinarily small. Distributions of the final recorded digit were significantly different between the intervention and the control group at baseline for 14 of the 22 variables in the dietary trial. In the drug trial, only five variables were available, and no significant differences between the groups for baseline values in means or variances or digit preference were seen. Conclusions Several statistical features of the data from the dietary trial are so strongly suggestive of data fabrication that no other explanation is likely.
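The comparison of final recorded digits between groups can be illustrated with a short sketch. It is not the authors' procedure: the abstract does not name the test used, so the Pearson chi-square on a 2 x k table of final digits, and the function names `final_digit` and `digit_chi_square`, are assumptions made for illustration. Values are passed as strings so that a recorded trailing zero (for example "5.0") is not lost.

```python
from collections import Counter


def final_digit(value_str):
    """Return the last digit of a value exactly as it was written down.

    Assumes the string contains at least one digit.
    """
    digits = [ch for ch in value_str if ch.isdigit()]
    return int(digits[-1])


def digit_chi_square(group_a, group_b):
    """Pearson chi-square statistic and degrees of freedom for final digits.

    Builds a 2 x k table (two groups, k observed final digits) and compares
    observed counts with those expected if both groups shared one digit
    distribution. Digits that never occur in either group are dropped.
    """
    counts_a = Counter(final_digit(v) for v in group_a)
    counts_b = Counter(final_digit(v) for v in group_b)
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    total = n_a + n_b

    stat = 0.0
    digits_used = 0
    for d in range(10):
        column_total = counts_a[d] + counts_b[d]
        if column_total == 0:
            continue
        digits_used += 1
        for observed, group_size in ((counts_a[d], n_a), (counts_b[d], n_b)):
            expected = group_size * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat, digits_used - 1


# Hypothetical toy data: one group always ends in 0, the other in 5.
stat, df = digit_chi_square(["120", "130"], ["125", "135"])
print(stat, df)  # 4.0 1
```

A P value would then come from the chi-square distribution with `df` degrees of freedom, for instance through a statistics library; with real trial data the expected counts should also be checked for being large enough for the approximation to hold.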
Recent studies have pointed out the potential of the odd Fréchet family (or class) of continuous distributions for fitting data of all kinds. In this article, we propose an extension of this family through the so-called "Topp-Leone strategy", aiming to improve its overall flexibility by adding a shape parameter. The main objective is to offer original distributions with modifiable properties, from which adaptive and pliant statistical models can be derived. For the new family, these aspects are illustrated by means of comprehensive mathematical and numerical results. In particular, we emphasize a special three-parameter distribution based on the exponential distribution. The related model is shown to be well suited to fitting various lifetime data, more or less heterogeneous. Among all the possible applications, we consider two data sets of current interest, linked to the COVID-19 pandemic. They concern daily confirmed and recovered cases in Pakistan from March 24 to April 28, 2020. Our analyses show that the proposed model gives the best fit in comparison with strong competitors, including the former odd Fréchet model.
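To make the construction concrete, here is a minimal sketch of how such a three-parameter distribution could be assembled. The abstract gives no formulas, so everything below is an assumption: the odd Fréchet-G CDF is taken in its commonly cited form exp(-[(1 - G)/G]^theta), the Topp-Leone step as [1 - (1 - H)^2]^alpha, and the baseline G as an exponential CDF with rate `rate`. The function names are hypothetical.

```python
import math


def exp_cdf(x, rate):
    """Baseline exponential CDF G(x) = 1 - exp(-rate * x) for x > 0."""
    return -math.expm1(-rate * x) if x > 0 else 0.0


def odd_frechet_cdf(g, theta):
    """Assumed odd Frechet-G transform: exp(-((1 - g) / g) ** theta)."""
    if g <= 0.0:
        return 0.0
    if g >= 1.0:
        return 1.0
    # Work on the log scale so a tiny g does not overflow the power.
    log_term = theta * (math.log1p(-g) - math.log(g))
    if log_term > 700.0:
        return 0.0
    return math.exp(-math.exp(log_term))


def topp_leone_cdf(h, alpha):
    """Assumed Topp-Leone transform: (1 - (1 - h) ** 2) ** alpha."""
    return (1.0 - (1.0 - h) ** 2) ** alpha


def tl_odd_frechet_exp_cdf(x, alpha, theta, rate):
    """Compose the three steps: exponential, odd Frechet, Topp-Leone."""
    return topp_leone_cdf(odd_frechet_cdf(exp_cdf(x, rate), theta), alpha)


# Hypothetical parameter values, chosen only to show the call.
for x in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(x, tl_odd_frechet_exp_cdf(x, alpha=1.5, theta=0.8, rate=1.0))
```

Under these assumptions, `alpha` is the shape parameter contributed by the Topp-Leone step, `theta` comes from the odd Fréchet family, and `rate` from the exponential baseline, which accounts for the three parameters.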
In many areas of applied science, the last step of a study often consists of analyzing the collected data in depth. Among all kinds of data, lifetime data are well known to convey a great deal of information, which must be captured to identify one or more key phenomena. In this regard, numerous mathematical models have been proposed, including those based on lifetime distributions. In this paper, we introduce a new four-parameter lifetime distribution based on the type II Topp-Leone-G family and the power Lomax distribution. In comparison to existing distributions, the new one is characterized by very flexible probability functions: increasing, decreasing, J, and reverse J shapes are observed for the probability density and hazard rate functions, a first indication of the adaptability of the related model. With this idea in mind, the new distribution is studied in detail, from both the theoretical and applied sides. After presenting its main mathematical properties, we investigate the related model, estimating the parameters by the maximum likelihood method. We apply it to two practical datasets, including the well-known aircraft windshield data. We show that the new model performs better than several modern competing models, motivating its use in an applied setting.
To reduce the dimensionality of the parameter space and enhance out-of-sample forecasting performance, this research compares regularization techniques with Autometrics in time-series modeling. We mainly focus on comparing weighted lag adaptive LASSO (WLAdaLASSO) with Autometrics, but as a benchmark, we also estimate other popular regularization methods: LASSO, AdaLASSO, SCAD, and MCP. For the comparison, we run a Monte Carlo simulation and assess the performance of these techniques in terms of out-of-sample Root Mean Square Error (RMSE), Gauge, and Potency, under varying autocorrelation coefficients and sample sizes. The simulation experiment indicates that WLAdaLASSO outperforms Autometrics and the other regularization approaches in covariate selection and forecasting, especially when there is a greater linear dependency between predictors. In contrast, the computational efficiency of Autometrics decreases with a strong linear dependency between predictors. However, with a large sample and weak linear dependency between predictors, the Autometrics potency ⟶ 1 and gauge ⟶ α. LASSO, AdaLASSO, SCAD, and MCP, on the other hand, select more covariates and have higher RMSE than Autometrics and WLAdaLASSO. To compare the considered techniques on real data, we build the Generalized Unidentified Model for covariate selection and out-of-sample forecasting of the trade balance of Pakistan. We train the model on 1985–2015 observations and use 2016–2020 observations as test data for the out-of-sample forecast.
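The three evaluation criteria named above can be sketched in a few lines. The abstract does not define them, so the definitions used here are the usual ones from the model-selection literature and should be read as assumptions: gauge is the share of irrelevant candidate covariates that a procedure retains, potency is the share of relevant covariates it retains, and RMSE is computed on held-out observations. The function names and variable labels are hypothetical.

```python
import math


def gauge_and_potency(selected, relevant, candidates):
    """Return (gauge, potency) for one selection outcome.

    selected   -- covariates kept by the selection procedure
    relevant   -- covariates that truly enter the data-generating process
    candidates -- every covariate offered to the procedure
    Assumes there is at least one relevant and one irrelevant candidate.
    """
    selected, relevant, candidates = set(selected), set(relevant), set(candidates)
    irrelevant = candidates - relevant
    gauge = len(selected & irrelevant) / len(irrelevant)
    potency = len(selected & relevant) / len(relevant)
    return gauge, potency


def rmse(actual, forecast):
    """Root mean square error between equally long sequences."""
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical simulation draw: 10 candidates, 2 truly relevant,
# and a procedure that keeps both plus one irrelevant covariate.
candidates = [f"x{i}" for i in range(1, 11)]
gauge, potency = gauge_and_potency(["x1", "x2", "x5"], ["x1", "x2"], candidates)
print(gauge, potency)  # 0.125 1.0
```

In a Monte Carlo study these quantities would be averaged over replications; a well-behaved procedure would then show potency near 1 and gauge near its nominal significance level, which is the limiting behavior the abstract reports for Autometrics.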