Rohin Shah scite author profile

Rohin Shah

5Publications

26Citation Statements Received

89Citation Statements Given

How they've been cited

How they cite others

108

Affiliations

DeepMind (United Kingdom), University of California, Berkeley, University of Maryland, College Park

Publications

Order By: Most citations

Chlorophyll

Phothilimthana

Jelvis

Shah

et al. 2014

View full text Add to dashboard Cite

We developed Chlorophyll, a synthesis-aided programming model and compiler for the GreenArrays GA144, an extremely minimalist low-power spatial architecture that requires partitioning the program into fragments of no more than 256 instructions and 64 words of data. This processor is 100-times more energy efficient than its competitors, but currently can only be programmed using a lowlevel stack-based language.The Chlorophyll programming model allows programmers to provide human insight by specifying partial partitioning of data and computation. The Chlorophyll compiler relies on synthesis, sidestepping the need to develop classical optimizations, which may be challenging given the unusual architecture. To scale synthesis to real problems, we decompose the compilation into smaller synthesis subproblems-partitioning, layout, and code generation. We show that the synthesized programs are no more than 65% slower than highly optimized expert-written programs and are faster than programs produced by a heuristic, non-synthesizing version of our compiler.

show abstract

The MineRL BASALT Competition on Learning from Human Feedback

Shah¹,

Wild²,

Wang³

et al. 2021

Preprint

View full text Add to dashboard Cite

The last decade has seen a significant increase of interest in deep learning research, with many public successes that have demonstrated its potential. As such, these systems are now being incorporated into commercial products. With this comes an additional challenge: how can we build AI systems that solve tasks where there is not a crisp, well-defined specification? While multiple solutions have been proposed, in this competition we focus on one in particular: learning from human feedback. Rather than training AI systems using a predefined reward function or using a labeled dataset with a predefined set of categories, we instead train the AI system using a learning signal derived from some form of human feedback, which can evolve over time as the understanding of the task changes, or as the capabilities of the AI system improve.The MineRL BASALT competition aims to spur forward research on this important class of techniques. We design a suite of four tasks in Minecraft for which we expect it will be hard to write down hardcoded reward functions. These tasks are defined by a paragraph of natural language: for example, "create a waterfall and take a scenic picture of it", with additional clarifying details. Participants must train a separate agent for each task, using any method they want. Agents are then evaluated by humans who have read the task description. To help participants get started, we provide a dataset of human demonstrations on each of the four tasks, as well as an imitation learning baseline that leverages these demonstrations.

show abstract

Preferences Implicit in the State of the World

Shah¹,

Dmitrii²,

Alexander³

et al. 2019

Preprint

View full text Add to dashboard Cite

Reinforcement learning (RL) agents optimize only the features specified in a reward function and are indifferent to anything left out inadvertently. This means that we must not only specify what to do, but also the much larger space of what not to do. It is easy to forget these preferences, since these preferences are already satisfied in our environment. This motivates our key insight: when a robot is deployed in an environment that humans act in, the state of the environment is already optimized for what humans want. We can therefore use this implicit preference information from the state to fill in the blanks. We develop an algorithm based on Maximum Causal Entropy IRL and use it to evaluate the idea in a suite of proof-of-concept environments designed to show its properties. We find that information from the initial state can be used to infer both side effects that should be avoided as well as preferences for how the environment should be organized. Our code can be found at https://github.com/HumanCompatibleAI/rlsp.

show abstract

The MAGICAL Benchmark for Robust Imitation

Toyer¹,

Shah²,

Critch³

et al. 2020

Preprint

View full text Add to dashboard Cite

Chlorophyll

et al. 2014

View full text Add to dashboard Cite

We developed Chlorophyll, a synthesis-aided programming model and compiler for the GreenArrays GA144, an extremely minimalist low-power spatial architecture that requires partitioning the program into fragments of no more than 256 instructions and 64 words of data. This processor is 100-times more energy efficient than its competitors, but currently can only be programmed using a low-level stack-based language. The Chlorophyll programming model allows programmers to provide human insight by specifying partial partitioning of data and computation. The Chlorophyll compiler relies on synthesis, sidestepping the need to develop classical optimizations, which may be challenging given the unusual architecture. To scale synthesis to real problems, we decompose the compilation into smaller synthesis subproblems---partitioning, layout, and code generation. We show that the synthesized programs are no more than 65% slower than highly optimized expert-written programs and are faster than programs produced by a heuristic, non-synthesizing version of our compiler.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Rohin Shah

Chlorophyll

The MineRL BASALT Competition on Learning from Human Feedback

Preferences Implicit in the State of the World

The MAGICAL Benchmark for Robust Imitation

Chlorophyll

Contact Info

Product

Resources

About