Abstract: The Bag of Visual Words (BoVW) is an established representation in computer vision. Taking inspiration from text mining, this representation has proved to be very effective in many domains. However, in most cases, standard term-weighting schemes are adopted (e.g., term frequency or tf-idf). Whether alternative weighting schemes could boost the performance of BoVW-based methods remains an open question. More importantly, it is unknown whether it is possible to automatically learn and determine effecti…
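For readers unfamiliar with the standard schemes the abstract mentions, the following is a minimal sketch of tf-idf weighting applied to visual-word histograms. It is not the method of the paper: the function name `tf_idf`, the toy counts, and the specific tf and idf variants (length-normalised counts and an unsmoothed logarithmic idf) are assumptions chosen for illustration.

```python
import math
from typing import List


def tf_idf(histograms: List[List[int]]) -> List[List[float]]:
    """Weight visual-word count histograms with tf-idf.

    histograms[i][j] is the number of times visual word j occurs in image i.
    Hypothetical helper for illustration only.
    """
    n_images = len(histograms)
    n_words = len(histograms[0])

    # Document frequency: number of images that contain each visual word.
    df = [sum(1 for h in histograms if h[j] > 0) for j in range(n_words)]

    # Inverse document frequency; a word that never occurs gets weight 0
    # rather than causing a division by zero.
    idf = [math.log(n_images / d) if d > 0 else 0.0 for d in df]

    weighted = []
    for h in histograms:
        total = sum(h)
        # Term frequency: counts normalised by the number of descriptors
        # assigned to visual words in this image.
        tf = [c / total if total > 0 else 0.0 for c in h]
        weighted.append([t * w for t, w in zip(tf, idf)])
    return weighted


# Toy counts for 3 images over a 4-word visual vocabulary (made-up data).
counts = [
    [2, 0, 1, 1],
    [0, 3, 1, 0],
    [1, 1, 2, 0],
]
weights = tf_idf(counts)
```

In this toy example the third visual word appears in every image, so its idf is zero and it contributes nothing to any image's representation, while the fourth word, which appears in only one image, receives the largest idf.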
“…visual words to better represent the images and extracting only what is significant for understanding the semantics of the image. An image could have some visual words that are not significant to understand an image [9,10]. So we have created a mechanism to filter the insignificant visual words based on the textual annotation.…”
“…Finally, all unigrams, bigrams and trigrams were identified in the training data and ranked according to their weights. Therefore, one central issue to be addressed is the choice of an appropriate term-weighting scheme to evaluate how important a word is within a document in a corpus [15,54].…”
Automated textual analysis of firm-related documents has become an important decision support tool for stock market investors. Previous studies tended to adopt either a dictionary-based or a machine learning approach. Nevertheless, little is known about their concurrent use. Here we combine financial indicators, readability, sentiment categories and bag-of-words (BoW) features to increase prediction accuracy. This paper aims to extract both sentiment and BoW information from the annual reports of U.S. firms. The sentiment analysis is based on two commonly used dictionaries, namely the general dictionary Diction 7.0 and a finance-specific dictionary proposed by Loughran and McDonald [1]. The BoW terms are selected according to their tf-idf weights. We combine these features with financial indicators to predict abnormal stock returns using a multi-layer perceptron neural network with dropout regularization and rectified linear units. We show that this method performs similarly to Naïve Bayes and outperforms other machine learning algorithms (Support Vector Machine, C4.5 decision tree, and k-nearest neighbour classifier) in predicting positive/negative abnormal stock returns in terms of ROC. We also show that the quality of the prediction increases significantly when correlation-based feature selection is applied to the BoW features. This prediction performance is robust to industry categorization and event window.
“…Therefore, following the taxonomy illustrated by Talbi in [24], our proposed work can be described as a low-level teamwork hybridisation. Concerning the works reported in the literature between machine learning and metaheuristics [8,25], it is well known that this relationship is not a one-way street: there are not only approaches where machine learning techniques assist and enhance metaheuristics, but also the other way around, and machine learning models improved by metaheuristics form a well-consolidated group in the hybridisation field [26][27][28][29][30]. This paper is concerned with the first group, where novel approaches have been proposed, such as [31], where a diversification-based learning (DBL) framework is proposed.…”
Hybrid approaches have become a powerful strategy for tackling several complex optimisation problems. In this regard, the present work contributes a novel optimisation framework, named learning-based linear balancer (LB2). A regression model is designed with the objective of predicting better movements for the approach and improving its performance. The main idea is to balance the intensification and diversification performed by the hybrid model in an online fashion. In this paper, we employ the movement operators of the spotted hyena optimiser, a modern algorithm which has been reported to yield good results in the literature. To test the performance of our hybrid approach, we solve 15 benchmark functions, comprising unimodal, multimodal, and fixed-dimension multimodal functions. Additionally, to assess competitiveness, we carry out a comparison against state-of-the-art algorithms and against the sequential parameter optimisation procedure, which is part of multiple successful tuning methods proposed in the literature. Finally, we compare against the traditional implementation of the spotted hyena optimiser and a neural network approach, and carry out the respective statistical analysis. The experimental results show promising performance and robustness, which allows us to conclude that our hybrid approach is a competitive alternative in the optimisation field.