Investors are constantly aware of the behaviour of stock markets. This affects their emotions and motivates them to buy or sell shares. Financial sentiment analysis allows us to understand the effect of social media reactions and emotions on the stock market and vice versa. In this research, we analyse Twitter data and important worldwide financial indices to answer the following question: How does the polarity generated by Twitter posts influence the behaviour of financial indices during pandemics? This study is based on the financial sentiment analysis of influential Twitter accounts and its relationship with the behaviour of important financial indices. To carry out this analysis, we used fundamental and technical financial analysis combined with a lexicon-based approach on financial Twitter accounts. We calculated the correlations between the polarities of financial market indicators and posts on Twitter by applying a date shift to the tweets, which let us identify correlations days before and after the posts on financial Twitter accounts. Our findings show that the markets reacted 0 to 10 days after information was shared on Twitter during the COVID-19 pandemic and 0 to 15 days after during the H1N1 pandemic. We also identified the inverse relationship: Twitter accounts reacted to financial market behaviour within 0 to 11 days during the H1N1 pandemic and 0 to 6 days during the COVID-19 pandemic. We also found that our method detects highly shifted correlations better with SenticNet than with other lexicons. With SenticNet, it is possible to detect correlations even on the same day as the Twitter posts. The most influential Twitter accounts during the pandemic period were The New York Times, Bloomberg, CNN News and Investing.com, which showed a very high correlation between sentiments on Twitter and stock market behaviour.
The lexicon-based approach is enhanced by a shifted correlation analysis, as latent or hidden correlations can be found in the data.
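The date-shift idea can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the function names (`polarity`, `pearson`, `shifted_correlation`) and the sample series are all assumptions made for illustration, and the study itself relies on SenticNet and other published lexicons rather than a hand-written dictionary. The sketch scores posts with a lexicon, then correlates daily sentiment with an index series at several lags to find the shift with the strongest correlation.

```python
from math import sqrt

# Hypothetical toy lexicon for illustration only (not SenticNet).
TOY_LEXICON = {"gain": 1.0, "rally": 0.8, "recovery": 0.6,
               "loss": -1.0, "crash": -0.9, "fear": -0.7}


def polarity(text):
    """Mean lexicon score of the known words in a post (0.0 if none match)."""
    scores = [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def shifted_correlation(sentiment, index_series, lag):
    """Correlate sentiment on day t with the index on day t + lag.

    lag > 0: the market follows Twitter; lag < 0: Twitter follows the market.
    Both series are assumed to be daily and of equal length.
    """
    n = len(sentiment)
    if lag >= 0:
        a, b = sentiment[:n - lag], index_series[lag:]
    else:
        a, b = sentiment[-lag:], index_series[:n + lag]
    return pearson(a, b)


# Synthetic data: the index simply repeats the sentiment two days later.
sentiment = [0.1, 0.5, -0.3, 0.8, -0.6, 0.2, 0.4, -0.1, 0.7, -0.5]
index_series = [0.0, 0.0] + sentiment[:-2]

# Scan a window of lags and keep the one with the largest absolute correlation.
best_lag = max(range(-3, 4),
               key=lambda k: abs(shifted_correlation(sentiment, index_series, k)))
```

On this synthetic series the scan recovers the two-day shift that was built into the data. With real data, each lag would need enough overlapping days for the correlation to be meaningful, and weekends or market holidays would have to be aligned first.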
Please cite this article in press as: J. Cervantes, et al., Data selection based on decision tree for SVM classification on large data sets, Appl. Soft Comput. J. (2015), http://dx.
Abstract: Support Vector Machine (SVM) has important properties such as a strong mathematical background and better generalization capability than other classification methods. On the other hand, the major drawback of SVM occurs in its training phase, which is computationally expensive and highly dependent on the size of the input data set. In this study, a new algorithm to speed up the training of SVM is presented; this method selects a small, representative subset of the data to reduce SVM training time. The novel method uses an induction tree to reduce the training data set for SVM, producing a very fast and high-accuracy algorithm. According to the results, the proposed algorithm achieves accuracy similar to that of current SVM implementations while training faster.
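The general idea of using a tree to shrink an SVM training set can be sketched as follows. This is a simplified stand-in, not the algorithm from the cited paper: the selection rule (keep only the samples that fall in impure leaves of a shallow decision tree, on the assumption that such leaves lie near the class boundary), the function name `select_near_boundary`, its parameters and the synthetic data are all assumptions chosen for illustration. It uses scikit-learn and NumPy.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def select_near_boundary(X, y, max_depth=4, purity_threshold=0.95):
    """Return a boolean mask of samples that fall in impure tree leaves.

    Assumed heuristic: leaves holding a mix of classes are treated as
    regions close to the decision boundary. Labels must be non-negative ints.
    """
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0).fit(X, y)
    leaf_ids = tree.apply(X)  # index of the leaf each sample ends up in
    keep = np.zeros(len(y), dtype=bool)
    for leaf in np.unique(leaf_ids):
        in_leaf = leaf_ids == leaf
        counts = np.bincount(y[in_leaf])
        purity = counts.max() / counts.sum()
        if purity < purity_threshold:
            keep |= in_leaf
    # Fall back to the full set if the selection cannot train a classifier.
    if len(np.unique(y[keep])) < 2:
        keep[:] = True
    return keep


# Synthetic two-class data, for illustration only.
X, y = make_classification(n_samples=2000, n_features=10,
                           n_informative=5, random_state=0)

keep = select_near_boundary(X, y)

# Train one SVM on everything and one on the reduced subset for comparison.
full_svm = SVC(kernel="rbf").fit(X, y)
reduced_svm = SVC(kernel="rbf").fit(X[keep], y[keep])

print(f"kept {keep.sum()} of {len(y)} samples")
print(f"full-set accuracy:    {full_svm.score(X, y):.3f}")
print(f"reduced-set accuracy: {reduced_svm.score(X, y):.3f}")
```

How much the subset shrinks, and how much accuracy is retained, depends on the tree depth and the purity threshold; a held-out test set would be needed for a fair comparison, since the scores printed here are measured on the training data.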