2016
DOI: 10.1177/0142331215581638

Reinforcement learning analysis for a minimum time balance problem

Abstract: Reinforcement learning was developed to solve complex learning control problems, where only a minimal amount of a priori knowledge exists about the system dynamics. It has also been used as a model of cognitive learning in humans and applied to systems such as pole balancing and humanoid robots, to study embodied cognition. However, closed-form analysis of the value function learning based on a higher-order unstable test problem dynamics has rarely been considered. In this paper, firstly, a second-order, unst…
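The abstract's theme — comparing learned value functions against a closed-form solution for an unstable plant — can be illustrated with a minimal temporal-difference sketch. This is not the paper's actual test problem: the scalar dynamics, gains, and costs below are illustrative assumptions chosen so the discounted value has a simple closed form to check against.

```python
# Minimal TD(0) value-learning sketch for an unstable scalar system.
# All numbers are illustrative assumptions, not taken from the paper:
# dynamics x' = a*x + b*u with a > 1 (open-loop unstable), a fixed
# stabilizing policy u = -K*x, stage cost q*x^2 + r*u^2, discount gamma.
import random

a, b, K = 1.2, 1.0, 0.5          # open loop unstable; closed loop a_c = a - b*K = 0.7
q, r, gamma = 1.0, 0.1, 0.95
a_c = a - b * K
c = q + r * K ** 2               # stage cost coefficient: cost = c * x^2

# Closed-form discounted value V(x) = w_star * x^2, from the geometric series
# sum_k gamma^k * c * (a_c^2)^k * x^2, valid because gamma * a_c^2 < 1.
w_star = c / (1.0 - gamma * a_c ** 2)

def td0_learn(alpha=0.01, steps=20000, seed=0):
    """Learn w in V(x) = w * x^2 by TD(0) on sampled states."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x = rng.uniform(-2.0, 2.0)           # sampled start state
        x_next = a_c * x                     # deterministic closed-loop step
        cost = c * x ** 2
        td_error = cost + gamma * w * x_next ** 2 - w * x ** 2
        w += alpha * td_error * x ** 2       # gradient of V w.r.t. w is x^2
    return w

w = td0_learn()
print(w, w_star)   # learned weight approaches the closed-form value
```

Checking the learned weight against the analytic value mirrors the evaluation style the citing works highlight: a test problem with a known closed-form solution makes the learning algorithm's accuracy directly measurable.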


Cited by 24 publications (18 citation statements)
References 43 publications
“…However, the proposed method performs higher accuracy and precision than the method with fixed resolution 640 × 480. Furthermore, the time consumption per image of the proposed method is less than that of conventional methods (Figure 7(b)), which is 44.1% of the method with fixed resolution 640 × 480 and 28.1% of the method with fixed resolution 1280 × 960. GBVS model and dense SIFT algorithm cost the majority of the whole time, and the higher the resolution, the more time they cost.…”
Section: Experiments and Results
confidence: 92%
“…Therefore, such situations result in low speed of image processing. Tutsoy and Brown [7] prove that a learning/classification algorithm with optimized control policy can precisely produce the same result as the analytical solution with the amount of data reduced. Besides, processing images with high resolution for object detection and recognition is not economical, because interesting objects do not always show up in FOV.…”
Section: Introduction
confidence: 92%
“…A concrete test problem with a closed-form solution is a good way to evaluate the performance of algorithms in detail (Tutsoy and Brown, 2016a, b). Here, we used the existing and proposed algorithms to solve a simple problem.…”
Section: Methods
confidence: 99%
“…A few works [46,47] have also explored ideas from Probably Approximately Correct (PAC) learning framework to reuse previously discarded experiences intelligently to accelerate the value function learning. Recently, in [40], the authors have analyzed the rate of parameter convergence for RL algorithms in presence of unstable system dynamics and random exploration noise, thus showing the significant potential of accelerating the learning process.…”
Section: Introduction
confidence: 99%