2018
DOI: 10.48550/arxiv.1810.12894
Preprint

Exploration by Random Network Distillation

Abstract: We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed. The bonus is the error of a neural network predicting features of the observations given by a fixed randomly initialized neural network. We also introduce a method to flexibly combine intrinsic and extrinsic rewards. We find that the random network distillation (RND) bonus combined with this increased flexibility enables significant progress on several hard…
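As a rough illustration of the mechanism described in the abstract, the sketch below computes an RND-style exploration bonus in PyTorch. It is a minimal sketch, not the authors' implementation: the observation and feature dimensions, layer sizes, and learning rate are placeholder assumptions, and the paper's reward normalization and scheme for combining intrinsic with extrinsic rewards are not shown.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; not the architecture used in the paper.
obs_dim, feat_dim = 84 * 84, 128

# Fixed, randomly initialized target network: its outputs are the
# "features of the observations" that the predictor must match.
target = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim))
for p in target.parameters():
    p.requires_grad_(False)  # the target is never trained

# Predictor network, trained to imitate the target on visited observations.
predictor = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim))
optimizer = torch.optim.Adam(predictor.parameters(), lr=1e-4)

def intrinsic_reward(obs: torch.Tensor) -> torch.Tensor:
    """Return the per-observation prediction error against the fixed target.

    A large error means the predictor has not seen similar observations
    before, so the state is treated as novel and earns a larger bonus.
    """
    with torch.no_grad():
        target_feat = target(obs)
    pred_feat = predictor(obs)
    error = ((pred_feat - target_feat) ** 2).mean(dim=-1)

    # Train the predictor on the same batch so the bonus decays
    # as states are revisited.
    loss = error.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return error.detach()
```

In practice this bonus would be normalized and added to the environment reward; the paper's flexible combination of intrinsic and extrinsic rewards (e.g., via separate value estimates) is outside the scope of this sketch.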

Cited by 207 publications (369 citation statements) · References 25 publications

“…However, for learning tasks on graph-level data, no such general-purpose pretrained teacher networks are available; further, graph databases from different domains differ significantly from each other, which also prevents the application of this type of approach to the GAD task. Random knowledge distillation was originally introduced in [5] to address sparse-reward problems in deep reinforcement learning (DRL). It uses the random distillation errors to measure the novelty of states as additional reward signals that encourage DRL agents' exploration in sparse-reward contexts.…”
Section: Knowledge Distillation (mentioning; confidence: 99%)
“…The aim is to calculate the posterior after iteratively updating on the data. According to [5], our task can then be formulated as the optimization problem below:…”
Section: Theoretical Analysis of GlocalKD (mentioning; confidence: 99%)
“…Existing methods use curiosity or uncertainty as a signal for exploration [Pathak et al., 2017; Burda et al., 2018] so that the learned agent is able to cover a large state space. However, the exploration-exploitation dilemma, together with sample-efficiency considerations, drives us to develop self-imitation learning (SIL) [Oh et al., 2018] methods that focus on exploiting past good experiences for better exploration.…”
Section: Related Work (mentioning; confidence: 99%)