2019
DOI: 10.2139/ssrn.3444996

Dealing with the Log of Zero in Regression Models

Abstract: Log-linear and log-log regressions are among the most widely used statistical models. However, how to handle zeros in the dependent and independent variables has remained unclear, even though the situation arises often. In this paper, we discuss how to deal with this issue. We show that using Pseudo-Poisson Maximum Likelihood (PPML) is good practice compared with other approximate solutions. We then introduce a new complementary solution for dealing with zeros, which consists of adding a positive value specific to each observation …
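To make the PPML idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a single regressor with an intercept, so that E[y | x] = exp(a + b·x), and it maximizes the Poisson pseudo-likelihood with Newton steps and step halving. The names `ppml_fit` and `_loglik` and the toy data are hypothetical. The point of the sketch is that zeros in `y` enter the fit directly, with no log transform and no added constant.

```python
import math


def _loglik(a, b, x, y):
    """Poisson pseudo-log-likelihood, dropping the term that does not depend on (a, b)."""
    try:
        return sum(yi * (a + b * xi) - math.exp(a + b * xi) for xi, yi in zip(x, y))
    except OverflowError:
        # Treat an overflowing trial point as infinitely bad so the step is halved.
        return float("-inf")


def ppml_fit(x, y, max_iter=100, tol=1e-10):
    """Fit E[y | x] = exp(a + b * x) by Newton's method with step halving.

    Assumes x takes at least two distinct values and y is non-negative
    with at least one positive entry.
    """
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score (gradient of the pseudo-log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Negative Hessian, a 2x2 symmetric matrix.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton direction: solve the 2x2 system in closed form.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        # Halve the step until the objective does not decrease.
        current = _loglik(a, b, x, y)
        t = 1.0
        while _loglik(a + t * da, b + t * db, x, y) < current and t > 1e-8:
            t /= 2.0
        a += t * da
        b += t * db
        if max(abs(t * da), abs(t * db)) < tol:
            break
    return a, b


# Toy data containing zeros in the dependent variable (illustrative only).
x = [0, 1, 2, 3, 4, 5]
y = [0, 1, 0, 3, 5, 9]
a_hat, b_hat = ppml_fit(x, y)
fitted = [math.exp(a_hat + b_hat * xi) for xi in x]
```

Because the model includes an intercept, the fitted values sum to the observed total at the optimum, which is a quick sanity check on convergence. In practice one would use an established GLM routine with robust standard errors rather than this hand-rolled loop.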

Cited by 27 publications (22 citation statements)
References 54 publications (14 reference statements)
“…The same was true when we used unemployment as the dependent variable with the sole exception of using the most conservative approach of clustering by the 34 NUTS2 areas. 34 Fourth, we considered regression discontinuity (RD) designs (see online Appendix Sections F.2 through F.4). In principle, since we know the variables underlying the rules, conditioning on polynomials of the rules should remove the correlation of NGE with unobservable influences on our outcomes.…”
Section: Other Area-level Robustness Tests (mentioning)
confidence: 99%
“…For example, the coefficient (standard error) on NGE in the IV regression is 1.295 (0.325). 34 Another issue is that since the instruments are generated regressors (from Table 3), formally we should allow for this in the calculation of the variance-covariance matrix. Doing so, however, made very little difference to the results as shown, for example, in online Appendix Table A8.…”
Section: Other Area-level Robustness Tests (mentioning)
confidence: 99%
“…The logarithm of the size of grid cells was used as an offset in the linear predictor in order to make counts comparable between grid cells of different sizes (at the edges of the study area, some parts of the grid cells fell outside of the fenced area). We log-transformed grazer densities and therefore replaced values of zero (i.e., zero observations in a grid cell) with half of the minimum non-zero value (Bellégo & Pape, 2019). The log-transformed grazer density was used as covariate, and bird species was included as a random factor.…”
Section: Results (mentioning)
confidence: 99%
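The replacement rule described in this citation statement (zeros replaced by half of the minimum non-zero value before taking logs) can be sketched as follows. This is an illustration of the rule as the citing authors describe it, not code from either paper; the function name `log_with_half_min` is hypothetical.

```python
import math


def log_with_half_min(values):
    """Log-transform non-negative values, replacing zeros by half the smallest positive value."""
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    positives = [v for v in values if v > 0]
    if not positives:
        raise ValueError("need at least one positive value")
    floor = min(positives) / 2.0  # stand-in used for every zero
    return [math.log(v if v > 0 else floor) for v in values]


# Hypothetical grazer densities per grid cell; two cells have no observations.
densities = [0.0, 2.0, 4.0, 0.0, 8.0]
logged = log_with_half_min(densities)
```

The stand-in value depends on the sample (it changes if the smallest positive observation changes), so results from this rule are tied to the data at hand.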
“…There are three simple options: removing zeros, adding 0.5 to all observed data, or adding 1 to all data. For a general (not just for the lognormal) best solution we refer to a recent working paper by Bellégo and Pape (2019).…”
Section: The Lognormal Distribution Describes Two Totally Different Citation Curves (mentioning)
confidence: 99%
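The three simple options named in this statement (removing zeros, adding 0.5, adding 1) can be compared side by side with a short sketch. The helper names and the sample counts are hypothetical and serve only to show how each option changes the log-transformed data.

```python
import math


def log_drop_zeros(values):
    """Option 1: discard zeros, then take logs (the sample shrinks)."""
    return [math.log(v) for v in values if v > 0]


def log_plus_constant(values, c):
    """Options 2 and 3: add a constant c > 0 to every value, then take logs."""
    if c <= 0:
        raise ValueError("c must be positive")
    return [math.log(v + c) for v in values]


# Hypothetical citation counts, including uncited items.
counts = [0, 0, 1, 3, 10]
dropped = log_drop_zeros(counts)             # 3 values remain
plus_half = log_plus_constant(counts, 0.5)   # zeros map to log(0.5) < 0
plus_one = log_plus_constant(counts, 1.0)    # zeros map to log(1) = 0
```

The options differ in what happens to the zeros: the first removes those observations entirely, while the other two keep them but shift every value, so the choice of constant affects all transformed data points and not only the zeros.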