Extracting sentiment from text is a hard semantic problem. We develop a methodology for extracting small investor sentiment from stock message boards. The algorithm comprises different classifier algorithms coupled together by a voting scheme. Accuracy levels are similar to widely used Bayes classifiers, but false positives are lower and sentiment accuracy higher. Time series and cross-sectional aggregation of message information improves the quality of the resultant sentiment index, particularly in the presence of slang and ambiguity. Empirical applications show evidence of a relationship with stock values: tech-sector postings are related to stock index levels, and to volumes and volatility. The algorithms may be used to assess the impact on investor opinion of management announcements, press releases, third-party news, and regulatory changes.
Keywords: text classification, index formation, computers-computer science, artificial intelligence, finance, investment
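The idea of coupling several classifiers through a voting scheme can be illustrated with a minimal sketch. Everything below is hypothetical and is not the classifiers or vote rule from the paper: the three toy classifiers, the cue-word lists, the label set ("buy", "sell", "hold"), and the `min_votes` threshold are all assumptions chosen for illustration. The sketch labels a message as "hold" whenever the classifiers fail to agree, which is one plausible way a vote could trade coverage for fewer false positives.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical cue-word lists; a real system would use a learned lexicon.
BULLISH = {"buy", "long", "rally", "upgrade"}
BEARISH = {"sell", "short", "dump", "downgrade"}
NEGATIONS = {"not", "never", "dont"}

Classifier = Callable[[str], str]  # maps a message to "buy", "sell", or "hold"


def word_count_classifier(message: str) -> str:
    """Label by the net count of bullish minus bearish cue words."""
    tokens = message.lower().split()
    score = sum(t in BULLISH for t in tokens) - sum(t in BEARISH for t in tokens)
    if score > 0:
        return "buy"
    if score < 0:
        return "sell"
    return "hold"


def first_cue_classifier(message: str) -> str:
    """Label by the first cue word that appears in the message."""
    for token in message.lower().split():
        if token in BULLISH:
            return "buy"
        if token in BEARISH:
            return "sell"
    return "hold"


def negation_classifier(message: str) -> str:
    """Like first_cue_classifier, but flips the label after a negation word."""
    tokens = message.lower().split()
    for i, token in enumerate(tokens):
        if token in BULLISH or token in BEARISH:
            label = "buy" if token in BULLISH else "sell"
            if i > 0 and tokens[i - 1] in NEGATIONS:
                label = "sell" if label == "buy" else "buy"
            return label
    return "hold"


def vote(message: str, classifiers: List[Classifier], min_votes: int = 2) -> str:
    """Return the most common label if at least min_votes classifiers agree.

    Assumption: a message without enough agreement is treated as "hold".
    """
    counts = Counter(clf(message) for clf in classifiers)
    label, n = counts.most_common(1)[0]
    return label if n >= min_votes else "hold"


CLASSIFIERS = [word_count_classifier, first_cue_classifier, negation_classifier]

if __name__ == "__main__":
    for msg in ["time to buy before the rally", "do not buy this one, sell"]:
        print(msg, "->", vote(msg, CLASSIFIERS))
```

In this sketch the second message splits the three classifiers, so the vote abstains rather than committing to a possibly wrong signal.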
Common Failings: How Corporate Defaults are Correlated
We develop, and apply to data on U.S. corporations from 1979-2004, tests of the standard doubly-stochastic assumption under which firms' default times are correlated only as implied by the correlation of factors determining their default intensities. This assumption is violated in the presence of contagion or "frailty" (unobservable explanatory variables that are correlated across firms). Our tests do not depend on the time-series properties of default intensities. The data do not support the joint hypothesis of well-specified default intensities and the doubly-stochastic assumption. There is also some evidence of default clustering in excess of that implied by the doubly-stochastic model with the given intensities.