2013
DOI: 10.1111/j.1542-4774.2012.01110.x

Proxying for Unobservable Variables With Internet Document-Frequency

Abstract: The internet contains billions of documents. We show that document frequencies in large decentralized textual databases can capture the cross-sectional variation in the occurrence frequencies of social phenomena. We characterize the econometric conditions under which such proxying is likely. We also propose using recently-introduced internet search volume indexes as proxies for fundamental locational traits, and discuss their advantages and limitations. We then successfully proxy for a number of economic and d…

Cited by 57 publications (31 citation statements)
References 43 publications
“…(Our interpretation would still account for this shift) In Columns 1 and 2 of Table 9 we replicate the baseline time-series and diff-in-diff regressions in a weighted OLS regression, with the addition of firm fixed effects. 26 The results are nearly identical to the ones in the benchmark specifications, indicating that the results are not due to a compositional shift.…”
Section: Estimates
supporting
confidence: 74%
“…A more important role is played by weighting by total advertising spending, since the results are larger for sectors with higher spending on television advertising. We also document that the results are similar when using an alternative measure of regulation based upon the occurrence of internet content of industry names together with words indicating regulation, as developed in Saiz and Simonsohn (2013). 4 We also test for a dynamic version of the quid-pro-quo.…”
Section: Introduction
mentioning
confidence: 96%
“…First, we follow Saiz and Simonsohn (2013) in building a measure from an online search, using the Exalead tool, for the term "corruption" close to the name of each state (performed in 2009).…”
mentioning
confidence: 99%