This chapter describes the basic mechanics of building a forecasting model that takes sentiment indicators derived from textual data as input. In addition, because our prediction targets are financial time series, we present a set of stylized empirical facts describing the statistical properties of lexicon-based sentiment indicators extracted from news on financial markets. Examples of these modeling methods and statistical hypothesis tests are provided on real data. The general goal is to give financial practitioners guidelines for properly constructing and interpreting their own time-dependent numerical information representing public perception of companies, stock prices, and financial markets in general.
Background
The main goal of this work is to estimate the actual number of Covid-19 cases in Spain, by Autonomous Community, in the period from 01-31-2020 to 06-01-2020. These estimates allow us to accurately re-estimate the lethality of the disease in Spain by taking unreported cases into account.
Methods
A hierarchical Bayesian model recently proposed in the literature has been adapted to model the actual number of Covid-19 cases in Spain.
Results
The results of this work show that the real burden of Covid-19 in Spain in the period considered is well above the figures registered by the public health system. Specifically, the model estimates that, cumulatively until June 1st, 2020, there were 2 425 930 cases of Covid-19 in Spain with characteristics similar to those reported (95% credibility interval: 2 148 261 to 2 813 864), of which only 518 664 were actually registered.
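To make the size of the gap concrete, the following minimal sketch computes the share of estimated cases that were actually registered, using only the figures quoted above. The variable and function names are hypothetical and are not taken from the paper. Mapping the credibility interval endpoints through the ratio is an illustrative assumption that treats the registered count as fixed.

```python
# Figures quoted in the Results paragraph (cumulative until June 1st, 2020).
ESTIMATED_CASES = 2_425_930               # model point estimate
CI_LOW, CI_HIGH = 2_148_261, 2_813_864    # 95% credibility interval
REGISTERED_CASES = 518_664                # cases registered by the health system


def registered_share(registered: int, estimated: int) -> float:
    """Fraction of estimated cases that appear in the registered data."""
    return registered / estimated


# Point value of the registered share.
share = registered_share(REGISTERED_CASES, ESTIMATED_CASES)

# Assumption: the registered count is treated as fixed, so a larger estimate
# gives a smaller share and the interval endpoints swap order.
share_range = (
    registered_share(REGISTERED_CASES, CI_HIGH),
    registered_share(REGISTERED_CASES, CI_LOW),
)

print(f"Registered share: {share:.1%}")
print(f"Range implied by the interval: {share_range[0]:.1%} to {share_range[1]:.1%}")
```

Under these figures, roughly one in five of the estimated cases was registered.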
Conclusions
The second wave of the Spanish seroprevalence study estimates 2 350 324 cases of Covid-19 in Spain in the period considered. Measured against this figure, the estimates provided by the model are quite good. This work clearly shows the key importance of having good quality data to optimize decision-making in the critical context of dealing with a pandemic.
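The comparison with the seroprevalence study can be checked with simple arithmetic on the figures quoted in this abstract. The sketch below is illustrative only: the variable names are hypothetical, and the relative gap is one possible measure of agreement, not necessarily the one used by the authors.

```python
# Figures quoted in the abstract.
MODEL_ESTIMATE = 2_425_930                # model point estimate of cases
CI_LOW, CI_HIGH = 2_148_261, 2_813_864    # 95% credibility interval
SEROPREVALENCE_ESTIMATE = 2_350_324       # second wave of the seroprevalence study

# Relative gap between the model estimate and the seroprevalence figure.
relative_gap = (MODEL_ESTIMATE - SEROPREVALENCE_ESTIMATE) / SEROPREVALENCE_ESTIMATE

# Does the seroprevalence figure fall inside the model's credibility interval?
inside_interval = CI_LOW <= SEROPREVALENCE_ESTIMATE <= CI_HIGH

print(f"Relative gap: {relative_gap:.1%}")
print(f"Seroprevalence figure inside the 95% interval: {inside_interval}")
```

With these numbers the model estimate is a few percent above the seroprevalence figure, which lies inside the reported interval.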