We propose a probabilistic framework that directly inserts prior knowledge into reinforcement learning (RL) algorithms by defining the behaviour policy as a Bayesian posterior distribution. This posterior combines task-specific information with prior knowledge, thus enabling transfer learning across tasks. The resulting method is flexible and can easily be incorporated into any standard off-policy or on-policy algorithm, such as those based on temporal differences and policy gradients. We develop a specific instance of this Bayesian transfer RL framework by expressing prior knowledge as general deterministic rules that are useful across a large variety of tasks, such as navigation tasks. We also build on recent probabilistic and entropy-regularised RL by developing a novel temporal learning algorithm and show how to combine it with Bayesian transfer RL. Finally, we demonstrate our method on maze-solving tasks and show that significant speed-ups can be obtained.
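As a rough illustration of a behaviour policy defined as a posterior over actions, the sketch below multiplies a prior over actions by a task-specific term and renormalises. The abstract does not specify the form of either factor, so everything here is an assumption: the task-specific term is taken to be a Boltzmann weighting of current Q-value estimates, and the deterministic rule ("avoid actions that lead into a wall") and the names `posterior_policy` and `wall_rule_prior` are hypothetical.

```python
import numpy as np


def wall_rule_prior(blocked, eps=1e-3):
    """Hypothetical deterministic rule turned into a prior over actions.

    Actions flagged as blocked (e.g. they walk into a wall) receive a tiny
    weight eps instead of zero, so the posterior stays well defined.
    """
    blocked = np.asarray(blocked, dtype=bool)
    weights = np.where(blocked, eps, 1.0)
    return weights / weights.sum()


def posterior_policy(q_values, prior, temperature=1.0):
    """Posterior over actions: prior times a task-specific term, renormalised.

    Assumption: the task-specific term is exp(Q / temperature), a common
    choice in entropy-regularised RL, not necessarily the one in the paper.
    """
    q = np.asarray(q_values, dtype=float)
    p = np.asarray(prior, dtype=float)
    # Subtract the max before exponentiating for numerical stability.
    likelihood = np.exp((q - q.max()) / temperature)
    unnormalised = p * likelihood
    return unnormalised / unnormalised.sum()


# Four actions (up, right, down, left); up and left are blocked by walls.
prior = wall_rule_prior([True, False, False, True])

# Before any learning, all Q estimates are equal, so the prior decides.
policy_untrained = posterior_policy([0.0, 0.0, 0.0, 0.0], prior)

# Once the task signals that "down" is better, the posterior shifts there.
policy_trained = posterior_policy([0.0, 0.0, 2.0, 0.0], prior)
```

With uninformative Q-values the posterior reduces to the prior, which is one way to read the claimed speed-up: early exploration already avoids actions the rules exclude, and the task-specific term takes over as estimates improve.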
Purpose: Using textual analysis, the authors study the relationship between social media sentiments and stock markets during the COVID-19 pandemic. Design/methodology/approach: The analysis is based on a sample of 1,616,007 tweets over the period January to June 2021 for seven countries. The authors process the tweets via the VADER analyzer, thereby producing both positive and negative sentiment measures. Findings: The authors prove that higher positivism is associated with a short-term increase in stock prices. On the other hand, negativism relates inversely to stock prices with long-term impact in the case of English-speaking countries. Notably, the results remain robust to the inclusion of various control variables, including virtual fear and Google vaccine indexes. Finally, the authors prove that positivism is associated with higher returns and lower volatility in the short run, while negativism is linked with lower returns in the short run. Practical implications: The analysis has significant implications for researchers, investors and policymakers. First, researchers can employ the study's measures to quantify market sentiments and expand their research arsenal to incorporate social media trends, thus providing better explanatory power. Second, during times of severe uncertainty such as a pandemic, investors could benefit from taking the study's textual measures and empirical results into account when using asset pricing models or constructing their portfolios. Third, the finding that the stock market is heavily governed by sentimental behaviors, especially during crisis periods, implies that policymakers, including central banks, governments and capital market commissions, must consider these sentiments before implementing their policies. In this regard, governments can develop policy tools and approaches to manage recovery from the pandemic, which translates to greater long-term economic resilience. Moreover, central banks should adjust their monetary policy measures accordingly in order to stabilize financial markets and, by extension, to stop the pandemic from turning into a renewed financial crisis. For example, asset purchase programs are considered the main instrument of this kind of intervention. Originality/value: The authors confirm that this work is original and has not been published elsewhere, nor is it currently under consideration for publication elsewhere. The paper should be of interest to readers in the areas of finance.
In Federated Learning (FL), datasets across clients tend to be heterogeneous or personalized, and this poses challenges to the convergence of standard FL schemes that do not account for personalization. To address this, we present a new approach for personalized FL that achieves exact stochastic gradient descent (SGD) minimization. We start from the FedPer (Arivazhagan et al., 2019) neural network (NN) architecture for personalization, whereby the NN has two types of layers: the first ones are common across clients, while the few final ones are client-specific and are needed for personalization. We propose a novel SGD-type scheme where, at each optimization round, randomly selected clients perform gradient-descent updates over their client-specific weights to optimize the loss function on their own datasets, without updating the common weights. At the final update, each client computes the joint gradient over both client-specific and common weights and returns the gradient of the common parameters to the server. This yields an exact and unbiased SGD step over the full set of parameters in a distributed manner, i.e. the updates of the personalized parameters are performed by the clients and those of the common ones by the server. Our method is superior to the FedAvg and FedPer baselines in multi-class classification benchmarks such as Omniglot, CIFAR-10, MNIST, Fashion-MNIST, and EMNIST, and has much lower computational complexity per round.
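The round structure described above can be sketched on a toy model. This is a minimal illustration under stated assumptions, not the authors' implementation: the "network" is a two-layer linear model with a common matrix `theta` and a client-specific head `w`, the loss is mean squared error, the client applies its own head update at the final step, and the server averages the returned common gradients. All function names, the data, and the hyperparameters are hypothetical.

```python
import numpy as np


def loss(theta, w, X, y):
    """Half mean squared error of the prediction X @ theta @ w."""
    return 0.5 * np.mean((X @ theta @ w - y) ** 2)


def grads(theta, w, X, y):
    """Joint gradient of the loss w.r.t. the common and client weights."""
    h = X @ theta                      # common-layer features, shape (n, k)
    r = h @ w - y                      # residuals, shape (n,)
    n = len(y)
    g_w = h.T @ r / n                  # gradient for the client head
    g_theta = X.T @ (r[:, None] * w[None, :]) / n  # gradient for common layer
    return g_theta, g_w


def client_round(theta, w, X, y, local_steps=3, lr=0.05):
    """Client side: update only the head, then return the common gradient."""
    w = w.copy()
    for _ in range(local_steps - 1):
        _, g_w = grads(theta, w, X, y)
        w -= lr * g_w                  # common weights stay fixed here
    # Final update: joint gradient; the head step is applied locally
    # (an assumption) and the common gradient goes back to the server.
    g_theta, g_w = grads(theta, w, X, y)
    w -= lr * g_w
    return w, g_theta


def server_round(theta, heads, data, rng, n_selected=2, lr=0.05):
    """Server side: sample clients, collect common gradients, step theta."""
    chosen = rng.choice(len(data), size=n_selected, replace=False)
    common_grads = []
    for i in chosen:
        X, y = data[i]
        heads[i], g_theta = client_round(theta, heads[i], X, y, lr=lr)
        common_grads.append(g_theta)
    return theta - lr * np.mean(common_grads, axis=0)


# Synthetic data: clients share a common map but have different heads.
rng = np.random.default_rng(0)
d, k, n_clients = 4, 2, 3
theta_true = 0.5 * rng.normal(size=(d, k))
data = []
for _ in range(n_clients):
    X = rng.normal(size=(20, d))
    data.append((X, X @ theta_true @ rng.normal(size=k)))

theta = 0.5 * rng.normal(size=(d, k))
heads = [0.5 * rng.normal(size=k) for _ in range(n_clients)]


def total_loss(theta, heads):
    return sum(loss(theta, w, X, y) for w, (X, y) in zip(heads, data))


before = total_loss(theta, heads)
for _ in range(300):
    theta = server_round(theta, heads, data, rng)
after = total_loss(theta, heads)
```

Note that only the common gradient crosses the network in this sketch; the client-specific heads never leave the clients, which matches the division of labour the abstract describes.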