In this paper, we propose a bicycle controller based on the Deep Deterministic Policy Gradient (DDPG) algorithm, a state-of-the-art deep reinforcement learning algorithm. The controller is built from a reward function and a deep neural network. With the proposed controller, a bicycle can not only be stably balanced but also travel to any specified location. We confirm that the DDPG-based controller outperforms baselines such as Normalized Advantage Function (NAF) and Proximal Policy Optimization (PPO). For the performance evaluation, we ran the proposed algorithm in various settings, such as fixed and random speed, start location, and destination location.
Objective image quality assessment (IQA) is imperative in the current multimedia-intensive world, in order to assess the visual quality of an image at close to a human level of ability. Many parameters, such as color intensity, structure, sharpness, contrast, and the presence of an object, draw human attention to an image. Psychological vision research suggests that human vision is biased toward the center area of an image and display screen. As a result, if the center part contains any visually salient information, it draws human attention even more, and any distortion in that part will be perceived more readily than in other parts. To the best of our knowledge, previous IQA methods have not considered this fact. In this paper, we propose a full-reference image quality assessment (FR-IQA) approach using visual saliency and contrast; in addition, we give extra attention to the center by increasing the sensitivity of the similarity maps in that region. We evaluated our method on three large-scale benchmark databases used by most current IQA researchers (TID2008, CSIQ and LIVE), which together contain 3345 distorted images with 28 different kinds of distortions. Our method is compared with 13 state-of-the-art approaches. This comparison reveals the stronger correlation of our method with human-evaluated values. The predicted quality score is consistent in distortion-specific as well as distortion-independent cases. Moreover, faster processing makes it applicable to real-time applications. The MATLAB code is publicly available online for testing the algorithm.
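The idea of weighting the center of a similarity map more heavily can be illustrated with a minimal sketch. This is not the authors' method or their MATLAB code: the Gaussian weighting, the function names `center_weight_map` and `center_weighted_score`, and the parameter `sigma_frac` are all assumptions made here for illustration, and the similarity map is taken as an already computed 2-D array with values in [0, 1].

```python
import numpy as np


def center_weight_map(height, width, sigma_frac=0.25):
    """Gaussian weights that peak at the image center.

    Hypothetical choice: the abstract does not say how the center is emphasized.
    sigma_frac sets the spread relative to the shorter image side.
    """
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    sigma = sigma_frac * min(height, width)
    return np.exp(-(yy ** 2 + xx ** 2) / (2.0 * sigma ** 2))


def center_weighted_score(similarity_map, sigma_frac=0.25):
    """Pool a 2-D similarity map into one score, emphasizing the center."""
    weights = center_weight_map(*similarity_map.shape, sigma_frac=sigma_frac)
    return float((similarity_map * weights).sum() / weights.sum())


# Same-sized distortion placed at the center and at a corner.
sim_center = np.ones((64, 64))
sim_center[28:36, 28:36] = 0.5
sim_corner = np.ones((64, 64))
sim_corner[0:8, 0:8] = 0.5

print(center_weighted_score(sim_center))
print(center_weighted_score(sim_corner))
```

Under this weighting, the distortion at the center lowers the pooled score more than an identical distortion at the corner, which is the qualitative behavior the abstract describes.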
Financial datasets often contain missing, unlabeled, and unbalanced data, which weakens the performance of decision-making systems. In addition, the aim of financial institutions is not only to find decision-making models that achieve high scores on the standard metrics (e.g., AUC, accuracy, F-score) but also to reduce the risk from misclassified cases. This paper addresses these problems by proposing a novel framework inspired by reinforcement learning, specifically actor-critic, for decision-making, and by using generative adversarial networks to impute missing data and to make use of the unlabeled dataset. Moreover, by taking advantage of reinforcement learning, the trained model is calibrated using a customizable reward function, which can be designed for the different purposes of financial institutions. We evaluate the framework on real-world financial datasets that have only a small amount of labeled data and exhibit missing data. Our experiments show promising results in which the financial risk is dramatically reduced without too much sacrifice on standard metrics.
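A customizable reward function of the kind mentioned above might look like the following minimal sketch. It is not the paper's reward: the function names, the label convention (1 = risky case, 0 = safe case), and the default cost values are hypothetical and chosen only to show how an institution could penalize one type of misclassification more than another.

```python
def classification_reward(y_true, y_pred, reward_correct=1.0,
                          cost_false_positive=1.0, cost_false_negative=5.0):
    """Scalar reward for a single decision.

    Hypothetical convention: label 1 is the risky case, label 0 the safe case.
    Missing a risky case (false negative) is penalized more by default.
    """
    if y_true == y_pred:
        return reward_correct
    if y_true == 1 and y_pred == 0:
        return -cost_false_negative
    return -cost_false_positive


def total_reward(labels, predictions, **costs):
    """Sum of per-decision rewards over a batch of decisions."""
    return sum(classification_reward(t, p, **costs)
               for t, p in zip(labels, predictions))


# Two correct decisions, one false negative, one false positive.
print(total_reward([1, 0, 1, 0], [1, 0, 0, 1]))
```

Changing the cost arguments changes which errors the learned policy is pushed to avoid, without changing the standard metrics used for reporting.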