2023
DOI: 10.3390/electronics12132814

LLM-Informed Multi-Armed Bandit Strategies for Non-Stationary Environments

Abstract: In this paper, we introduce an innovative approach to handling the multi-armed bandit (MAB) problem in non-stationary environments, harnessing the predictive power of large language models (LLMs). With the realization that traditional bandit strategies, including epsilon-greedy and upper confidence bound (UCB), may struggle in the face of dynamic changes, we propose a strategy informed by LLMs that offers dynamic guidance on exploration versus exploitation, contingent on the current state of the bandits. We br…
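
For readers unfamiliar with the two baselines named in the abstract, the following is a minimal sketch of epsilon-greedy and UCB1 on a bandit whose arm means change once partway through a run. It is not the paper's method and contains no LLM component; the function name `run_bandit`, the single abrupt change point, the Bernoulli rewards, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def run_bandit(policy, means_before, means_after, steps=2000,
               change_at=1000, epsilon=0.1, seed=0):
    """Simulate a Bernoulli bandit whose arm means switch once at `change_at`.

    Hypothetical helper for illustration only. Returns the total reward
    collected by the chosen baseline policy.
    """
    rng = random.Random(seed)
    k = len(means_before)
    counts = [0] * k      # number of pulls per arm
    values = [0.0] * k    # sample-mean reward estimate per arm
    total = 0.0

    for t in range(1, steps + 1):
        # Non-stationarity (assumed form): the means change abruptly once.
        means = means_before if t <= change_at else means_after

        if policy == "epsilon_greedy":
            if rng.random() < epsilon:
                arm = rng.randrange(k)                        # explore
            else:
                arm = max(range(k), key=lambda a: values[a])  # exploit
        elif policy == "ucb1":
            untried = [a for a in range(k) if counts[a] == 0]
            if untried:
                arm = untried[0]                              # pull each arm once
            else:
                arm = max(
                    range(k),
                    key=lambda a: values[a]
                    + math.sqrt(2.0 * math.log(t) / counts[a]),
                )
        else:
            raise ValueError(f"unknown policy: {policy}")

        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        # Incremental sample mean: old observations are never discounted,
        # so estimates adapt slowly after the change point.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward

    return total


if __name__ == "__main__":
    for name in ("epsilon_greedy", "ucb1"):
        print(name, run_bandit(name, [0.9, 0.1], [0.1, 0.9]))
```

Because both baselines here average over the whole history, their estimates lag behind after the means switch, which is the kind of difficulty the abstract attributes to traditional strategies in dynamic settings.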

Cited by 7 publications
(4 citation statements)
References 9 publications
“…We also note that some related studies have applied MAS (multi-agent systems) [33][34][35], large language models (LLMs) [36][37][38], and visual language models (VLMs) [39] to robot navigation and guidance. They utilize sensors like cameras, laser scanners, or radar to gather detailed environmental data.…”
Section: Discussion
confidence: 99%
“…The usage of AI, and more specifically LLMs [11,12], in the scientific field has seen a surge in recent years. OpenAI's GPT-3, the predecessor to GPT-3.5 Turbo, has been utilized in various scientific domains [16][17][18][19]. These studies highlight the capability of LLMs to generate informative, contextually relevant content, and suggest the potential for their application in more specialized scientific tasks [20][21][22].…”
Section: Related Work
confidence: 99%
“…Ref. [28] propose an LLM-based strategy that enables adaptive balancing of exploration and exploitation. Ref.…”
Section: Related Work
confidence: 99%