Derek G. Murray scite author profile

Improving Xen security through disaggregation

Virtual machine monitors (VMMs) have been hailed as the basis for an increasing number of reliable or trusted computing systems. The Xen VMM is a relatively small piece of software (a hypervisor) that runs at a lower level than a conventional operating system in order to provide isolation between virtual machines: its size is offered as an argument for its trustworthiness. However, the management of a Xen-based system requires a privileged, full-blown operating system to be included in the trusted computing base (TCB). In this paper, we introduce our work to disaggregate the management virtual machine in a Xen-based system. We begin by analysing the Xen architecture and explaining why the status quo results in a large TCB. We then describe our implementation, which moves the domain builder, the most important privileged component, into a minimal trusted compartment. We illustrate how this approach may be used to implement "trusted virtualisation" and improve the security of virtual TPM implementations. Finally, we evaluate our approach in terms of the reduction in TCB size, and by performing a security analysis of the disaggregated system.

Dynamic control flow in large-scale machine learning

Yuan, Abadi, Barham, et al. 2018

Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditional execution, and other features that call for dynamic control flow. These applications benefit from the ability to make rapid control-flow decisions across a set of computing devices in a distributed system. For performance, scalability, and expressiveness, a machine learning system must support dynamic control flow in distributed and heterogeneous environments. This paper presents a programming model for distributed machine learning that supports dynamic control flow. We describe the design of the programming model, and its implementation in TensorFlow, a distributed machine learning system. Our approach extends the use of dataflow graphs to represent machine learning models, offering several distinctive features. First, the branches of conditionals and bodies of loops can be partitioned across many machines to run on a set of heterogeneous devices, including CPUs, GPUs, and custom ASICs. Second, programs written in our model support automatic differentiation and distributed gradient computations, which are necessary for training machine learning models that use control flow. Third, our choice of non-strict semantics enables multiple loop iterations to execute in parallel across machines, and to overlap compute and I/O operations. We have done our work in the context of TensorFlow, and it has been used extensively in research and production. We evaluate it using several real-world applications, and demonstrate its performance and scalability.

A computational model for TensorFlow: an introduction

Abadi, Isard, Murray 2017

TensorFlow is a powerful, programmable system for machine learning. This paper aims to provide the basics of a conceptual framework for understanding the behavior of TensorFlow models during training and inference: it describes an operational semantics, of the kind common in the literature on programming languages. More broadly, the paper suggests that a programming-language perspective is fruitful in designing and in explaining systems such as TensorFlow.

CCS Concepts: Theory of computation → Operational semantics; Computing methodologies → Neural networks; Software and its engineering → Data flow architectures

The case for crowd computing

Murray, Yoneki, Crowcroft, et al. 2010

Naiad

et al. 2013

Naiad is a distributed system for executing data parallel, cyclic dataflow programs. It offers the high throughput of batch processors, the low latency of stream processors, and the ability to perform iterative and incremental computations. Although existing systems offer some of these features, applications that require all three have relied on multiple platforms, at the expense of efficiency, maintainability, and simplicity. Naiad resolves the complexities of combining these features in one framework. A new computational model, timely dataflow, underlies Naiad and captures opportunities for parallelism across a wide class of algorithms. This model enriches dataflow computation with timestamps that represent logical points in the computation and provide the basis for an efficient, lightweight coordination mechanism. We show that many powerful high-level programming models can be built on Naiad's low-level primitives, enabling such diverse tasks as streaming data analysis, iterative machine learning, and interactive graph mining. Naiad outperforms specialized systems in their target application domains, and its unique features enable the development of new high-performance applications.

Incremental, iterative data processing with timely dataflow

Murray, McSherry, Isard, et al. 2016

Commun. ACM

We describe the timely dataflow model for distributed computation and its implementation in the Naiad system. The model supports stateful iterative and incremental computations. It enables both low-latency stream processing and high-throughput batch processing, using a new approach to coordination that combines asynchronous and fine-grained synchronous execution. We describe two of the programming frameworks built on Naiad: GraphLINQ for parallel graph processing, and differential dataflow for nested iterative and incremental computations. We show that a general-purpose system can achieve performance that matches, and sometimes exceeds, that of specialized systems.

Formal Analysis of a Distributed Algorithm for Tracking Progress

Abadi, McSherry, Murray, et al. 2013

Tracking the progress of computations can be both important and delicate in distributed systems. In a recent distributed algorithm for this purpose, each processor maintains a delayed view of the pending work, which is represented in terms of points in virtual time. This paper presents a formal specification of that algorithm in the temporal logic TLA, and describes a mechanically verified correctness proof of its main properties.

Privilege separation made easy

Murray, Hand 2008

At the heart of a secure software system is a small, trustworthy component, called the Trusted Computing Base (TCB). However, developers persist in building monolithic systems that force their users to trust the entire system. We posit that this is due to the lack of a straightforward mechanism for partitioning (or disaggregating) systems into trusted and untrusted components. We propose to use the dynamic library as the unit of disaggregation, because it is a familiar abstraction, which is commonly used in mainstream software development. In this paper, we present our early ideas on the disaggregated library approach, which can be applied to existing applications that run on commodity operating systems. We first make the case for a new approach to disaggregation, and then describe how we are implementing it. We also draw comparisons with the wide range of related work in this area.
