2019
DOI: 10.48550/arxiv.1903.01021
Preprint

A Strongly Asymptotically Optimal Agent in General Environments

Cited by 2 publications (3 citation statements)
References 0 publications
“…The API allows anyone to design their demos based on existing agents and environments, and for new agents and environments to be added and interfaced into the system. There has been some related work in adapting GRL results to a practical setting [Cohen et al., 2019; Lamont et al., 2017] that successfully implemented an AIXI model using a Monte Carlo Tree Search planning algorithm. As far as we are aware, theoretical predictions in the context of wireheading have not been verified experimentally before, with the single exception of an AIXIjs demo [Aslanides, 2017].…”
Section: Methods (mentioning)
confidence: 99%
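The approach mentioned in that statement pairs a generative environment model with Monte Carlo Tree Search over future action sequences. The following is a minimal, hypothetical sketch of such a planning loop, not the implementation from the cited works: ToyEnvModel, the two-action space, and all constants are illustrative assumptions, and in practice the model would be learned (for example a Bayesian mixture over environments) rather than hard-coded as here.

```python
# A minimal, assumption-laden sketch of Monte Carlo Tree Search planning over a
# generative environment model (not the cited implementations).
import math
import random


class ToyEnvModel:
    """Hypothetical generative model: action 1 yields slightly higher reward."""

    def sample_step(self, history, action):
        reward = random.random() + (0.1 if action == 1 else 0.0)
        observation = 0
        return observation, reward


class Node:
    def __init__(self):
        self.visits = 0
        self.value = 0.0          # running mean of sampled returns from this node
        self.children = {}        # action -> Node


def rollout(model, history, depth):
    """Return of a uniformly random policy simulated for `depth` steps."""
    total = 0.0
    for _ in range(depth):
        action = random.choice([0, 1])
        obs, reward = model.sample_step(history, action)
        history = history + [(action, obs)]
        total += reward
    return total


def mcts_plan(model, history, horizon=5, simulations=500, c=1.4):
    """Pick the root action with the best Monte Carlo value estimate."""
    root = Node()
    for _ in range(simulations):
        node, h = root, list(history)
        path, rewards, tail = [], [], 0.0
        for _ in range(horizon):
            def ucb(a):
                child = node.children.get(a)
                if child is None or child.visits == 0:
                    return float("inf")      # try unvisited actions first
                return child.value + c * math.sqrt(math.log(node.visits + 1) / child.visits)

            action = max([0, 1], key=ucb)    # UCB1 action selection
            child = node.children.setdefault(action, Node())
            obs, reward = model.sample_step(h, action)
            h.append((action, obs))
            path.append(child)
            rewards.append(reward)
            if child.visits == 0:            # expansion: finish with a random rollout
                tail = rollout(model, h, horizon)
                break
            node = child
        # Backup: each node on the path receives the return from its own step onward.
        root.visits += 1
        ret = tail
        for child, reward in zip(reversed(path), reversed(rewards)):
            ret += reward
            child.visits += 1
            child.value += (ret - child.value) / child.visits
    return max(root.children, key=lambda a: root.children[a].value)


if __name__ == "__main__":
    print("planned action:", mcts_plan(ToyEnvModel(), history=[]))
```

The selection, expansion, rollout, and backup steps above follow the generic UCT scheme; the cited implementations additionally normalize rewards and reuse the search tree across time steps.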
“…Kolmogorov complexity has also been considered in the context of reinforcement learning as a tool for complexity-constrained inference [9], [2], [15] based on Solomonoff's theory of inductive inference [19]. We differ by focusing instead on constraining the computational complexity of the obtained policy itself, assuming the underlying system to be known.…”
Section: B. Contribution (mentioning)
confidence: 99%
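For reference, the two standard quantities that statement alludes to are usually written as below, where ℓ(p) is the length of program p and U(p) = x* means the output of the universal machine U on p begins with x; the cited works may use prefix, monotone, or resource-bounded variants of these definitions.

```latex
% Kolmogorov complexity of a finite string x relative to a universal machine U
K_U(x) = \min\{\, \ell(p) \;:\; U(p) = x \,\}

% Solomonoff prior: total weight 2^{-length} of programs whose output extends x
M(x) = \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}
```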
“…To see this, let M be the Turing machine that takes a binary string p, inverts all zeros and ones, and outputs the result p̄. In particular, M(x*_{1:T}) = x̄_{1:T}. Let in turn M′ be the Turing machine that, given input p, simulates the universal Turing machine U corresponding to φ, obtains the output U(p), and then feeds it as input to M. Then U(p) = x*_{1:T} implies M′(p) = x̄_{1:T}.…”
Section: Appendix (mentioning)
confidence: 99%
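The composition argument in that excerpt can be illustrated concretely. Below is a small, hypothetical Python sketch, not taken from the cited paper: toy_universal stands in for the universal machine U (here it merely decodes a run-length description, purely for illustration), invert plays the role of M, and compose plays the role of M′, so whenever toy_universal(p) = x, compose(p) is the bitwise complement of x.

```python
# A minimal illustration (an assumed toy setup, not the cited paper's construction)
# of the composition step: M' simulates U on p, then feeds U's output to M.

def invert(bits: str) -> str:
    """The machine M: flip every bit of a binary string."""
    return "".join("1" if b == "0" else "0" for b in bits)


def toy_universal(program: str) -> str:
    """Hypothetical stand-in for the universal machine U.
    A 'program' is a comma-separated run-length code, e.g. '3x1,2x0' -> '11100'.
    """
    out = []
    for block in program.split(","):
        count, bit = block.split("x")
        out.append(bit * int(count))
    return "".join(out)


def compose(program: str) -> str:
    """The machine M': run U on the program, then pass U's output through M."""
    return invert(toy_universal(program))


if __name__ == "__main__":
    p = "3x1,2x0"                      # toy_universal(p) == "11100"
    assert toy_universal(p) == "11100"
    assert compose(p) == invert("11100") == "00011"
    print(p, "->", toy_universal(p), "->", compose(p))
```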