Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Com 2009
DOI: 10.3115/1620754.1620794

Predicting risk from financial reports with regression

Abstract: We address a text regression problem: given a piece of text, predict a real-world continuous quantity associated with the text's meaning. In this work, the text is an SEC-mandated financial report published annually by a publicly traded company, and the quantity to be predicted is volatility of stock returns, an empirical measure of financial risk. We apply well-known regression techniques to a large corpus of freely available financial reports, constructing regression models of volatility for the period followi…
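
To make the prediction target concrete, here is a minimal sketch of how volatility of stock returns could be computed from a price series. It assumes volatility means the sample standard deviation of daily log returns over a window; the excerpt above does not give the paper's exact definition, and simple returns would be an equally common choice. The function names and the price list are hypothetical, made up for illustration.

```python
import math
import statistics


def log_returns(prices):
    """Daily log returns r_t = ln(p_t / p_{t-1}) from a price series."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def volatility(prices):
    """Sample standard deviation of the log returns over the window.

    Assumption: this is one common definition of volatility, not
    necessarily the exact one used in the paper.
    """
    returns = log_returns(prices)
    if len(returns) < 2:
        raise ValueError("need at least three prices to estimate volatility")
    return statistics.stdev(returns)


# Hypothetical closing prices, invented for illustration only.
prices = [100.0, 101.5, 99.8, 102.3, 101.0, 103.2]
vol = volatility(prices)
print(f"volatility: {vol:.4f}")
```

In a text regression setup like the one the abstract describes, a value of this kind, measured over the period after a report is published, would serve as the response variable, with features derived from the report text as the inputs.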

Cited by 171 publications (161 citation statements)
References 31 publications
“…Forecasting from text requires identifying textual correlates of a response variable revealed in the future, most of which will be weak and many of which will be spurious (Kogan et al, 2009). We consider two such problems.…”
Section: Sentiment Analysis (mentioning)
confidence: 99%
“…The financial dataset consists of 19,395 10-K reports (data available at http://www.ark.cs.cmu.edu/10K/), the annual revenue reports of publicly-traded corporations required by Securities Exchange Commission, published over the period of 1996-2006 from 10,492 companies [7]. Feature extraction is performed on the raw text files using TFIDF, and produces 150,360 features (sparse) for each instance.…”
Section: Application 1 - Financial Risk Ranking (mentioning)
confidence: 99%
“…Due to the strong relation between the textual information and numerical measures, there has been a growing body of studies in the fields of finance and data science that adopt the techniques of natural language processing (NLP) and machine learning to examine the interaction between these two types of information (e.g., Kogan et al, 2009; Tsai and Wang, 2017; Rekabsaz et al, 2017). For example, Loughran and McDonald (2011) and Jegadeesh and Wu (2013) investigate how the disclosures of finance sentiment or risk keywords in SEC-mandated financial reports affect investor expectations about a company's future stock prices.…”
Section: Introduction (mentioning)
confidence: 99%
“…For example, Loughran and McDonald (2011) and Jegadeesh and Wu (2013) investigate how the disclosures of finance sentiment or risk keywords in SEC-mandated financial reports affect investor expectations about a company's future stock prices. Moreover, Kogan et al (2009) and Tsai and Wang (2017) exploit sentiment analysis of 10-K reports for financial risk analysis. Furthermore, in Liu et al (2016), a web-based information system, FIN10K, is proposed for financial report analysis and visualization.…”
Section: Introduction (mentioning)
confidence: 99%