2017
DOI: 10.4236/ojop.2017.62006

A Novel Approach Based on Reinforcement Learning for Finding Global Optimum

Abstract: A novel approach to optimizing any given mathematical function, called the MOdified REinforcement Learning Algorithm (MORELA), is proposed. Although Reinforcement Learning (RL) was primarily developed for solving Markov decision problems, it can be used, with some improvements, to optimize mathematical functions. At the core of MORELA, a sub-environment is generated around the best solution found in the feasible solution space and compared with the original environment. Thus, MORELA makes it possible to discover …
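The abstract's core mechanism can be illustrated with a short sketch: candidates are sampled both from the whole feasible space (the environment) and from a smaller region around the best solution found so far (the sub-environment), and the better result is kept. This is a minimal Python sketch assuming uniform sampling, a fixed sub-environment radius, and greedy acceptance; the function name, parameters, and update scheme are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def morela_sketch(f, lower, upper, iters=200, samples=20, shrink=0.1, seed=0):
    """Toy global search with an environment + sub-environment, MORELA-style."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    best_x = rng.uniform(lower, upper)
    best_f = f(best_x)
    for _ in range(iters):
        # Sample the original environment (the whole feasible space).
        env = rng.uniform(lower, upper, size=(samples, lower.size))
        # Sample a sub-environment centered on the best solution so far.
        radius = shrink * (upper - lower)
        sub_lo = np.maximum(lower, best_x - radius)
        sub_hi = np.minimum(upper, best_x + radius)
        sub = rng.uniform(sub_lo, sub_hi, size=(samples, lower.size))
        # Compare both searches and keep the overall best solution.
        for x in np.vstack((env, sub)):
            fx = f(x)
            if fx < best_f:
                best_x, best_f = x.copy(), fx
    return best_x, best_f
```

Searching the sub-environment intensifies the search near the incumbent while the global samples keep the algorithm from committing to a single basin, which is the local-optimum-avoidance property the citing papers highlight.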

Cited by 5 publications (6 citation statements); references 22 publications (28 reference statements).
“…The RL approach exhibits a higher probability of finding a global optimum than existing heuristic optimization algorithms owing to its search and reward characteristics. MORELA [28] is a global-optimum-finding algorithm based on the model-free, Q-learning-based RL approach. One advantage of MORELA is its use of a sub-environment generated around the best solution determined in the previous learning step, which plays an important role in preventing the search from falling into local optima.…”
Section: Optimization Algorithm Based on Reinforcement Learning (mentioning)
confidence: 99%
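The statement above describes MORELA as built on model-free Q-learning. For context, this is a minimal sketch of the standard tabular Q-learning update rule, i.e. the general technique, not MORELA's specific formulation:

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One standard tabular Q-learning step; Q is a (n_states, n_actions) array."""
    td_target = r + gamma * Q[s_next].max()    # bootstrap from the next state
    Q[s, a] += alpha * (td_target - Q[s, a])   # move Q(s, a) toward the target
    return Q
```

Being model-free means the update needs only observed transitions (s, a, r, s'), with no transition model of the environment.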
“…To the best of our knowledge, this is the first wideband NUSLA optimization approach using RL. A global-minimum-finding algorithm based on RL, known as the modified reinforcement learning algorithm (MORELA) [28], presents a significant advantage over existing heuristic algorithms: it is less sensitive to hyper-parameter settings, demonstrates a higher probability of finding the global optimum, and is more efficient for high-dimensional cost functions.…”
Section: Introduction (mentioning)
confidence: 99%
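Claims like these are typically checked on multimodal benchmarks. Reusing the `morela_sketch` defined earlier, a hypothetical run on the Rastrigin function (a standard benchmark whose global minimum is 0 at the origin) might look like the following; it is not a reproduction of the cited papers' experiments.

```python
import numpy as np

def rastrigin(x):
    """Standard multimodal benchmark; global minimum f(0) = 0."""
    return 10.0 * x.size + float(np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x)))

d = 5  # dimensionality of the search space
x_best, f_best = morela_sketch(rastrigin, lower=[-5.12] * d, upper=[5.12] * d)
print(x_best, f_best)  # expected: near the origin, objective close to 0
```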
“…Here, [·] is the penalty coefficient. To find the antenna placement and weights of the wideband NUSLA that minimize the cost function proposed in this paper, MORELA [17], a reinforcement-learning-based heuristic optimization algorithm, was used.…”
Section: In this paper, Figure… (unclassified)
“…As a reinforcement-learning-based heuristic optimization algorithm, it has the advantage of a higher probability of finding the global optimum than existing heuristic optimization algorithms, owing to the search and reward characteristics of reinforcement learning. MORELA is a model-free, Q-learning-based reinforcement learning algorithm that searches a sub-environment of limited range centered on the best solution from the previous step; this lowers the probability of falling into a local optimum and gives it performance exceeding that of existing heuristic optimization algorithms [17]. To design a wideband NUSLA that forms nulls at all frequencies within a specified range, the optimal antenna placement and weights minimizing the cost function proposed in this paper were found using MORELA.…”
(unclassified)
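The penalty coefficient mentioned in these statements follows the standard penalty method: the objective becomes the base cost plus a coefficient times the constraint violation, so infeasible solutions are penalized rather than excluded. This is a generic sketch with a placeholder cost and constraint, not the paper's actual NUSLA formulation:

```python
import numpy as np

def penalized_cost(x, base_cost, violation, penalty_coeff=100.0):
    """Penalty method: base cost plus penalty_coeff times constraint violation."""
    return base_cost(x) + penalty_coeff * violation(x)

# Example: minimize a quadratic subject to sum(x) >= 1 (violation when sum < 1).
cost = lambda x: float(np.sum(x**2))
viol = lambda x: max(0.0, 1.0 - float(np.sum(x)))
x = np.array([0.2, 0.3])
print(penalized_cost(x, cost, viol))  # 0.13 + 100 * 0.5 = 50.13
```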
“…Moreover, increasing the data flow rate, avoiding accidents in domestic areas, and making nodes move at modest velocity for minimal fuel consumption were addressed in [7]. A modified reinforcement learning algorithm with mathematical functions was introduced to improve the performance of reinforcement learning in parameters such as the average objective function value and the average number of learning episodes across various dimensions [8]. For long-distance travel, time prediction is an effective method for managing traffic; in [9], a Gradient Boosting (GB) method is introduced for time prediction.…”
Section: Introduction (mentioning)
confidence: 99%