The current processes for building machine learning systems require practitioners with deep knowledge of machine learning. This significantly limits the number of machine learning systems that can be created and has led to a mismatch between the demand for machine learning systems and the ability of organizations to build them. We believe that in order to meet this growing demand for machine learning systems we must significantly increase the number of individuals that can teach machines. We postulate that we can achieve this goal by making the process of teaching machines easy, fast, and above all, universally accessible. While machine learning focuses on creating new algorithms and improving the accuracy of "learners", the machine teaching discipline focuses on the efficacy of the "teachers". Machine teaching as a discipline is a paradigm shift that follows and extends principles of software engineering and programming languages. We put a strong emphasis on the teacher and the teacher's interaction with data, as well as crucial components such as techniques and design principles of interaction and visualization. In this paper, we present our position regarding the discipline of machine teaching and articulate fundamental machine teaching principles. We also describe how, by decoupling knowledge about machine learning algorithms from the process of teaching, we can accelerate innovation and empower millions of new uses for machine learning models.
Recent architectural trends have focused on increased parallelism via multicore processors and increased heterogeneity via accelerator devices (e.g., graphics-processing units, field-programmable gate arrays). Although these architectures have significant performance and energy potential, application designers face many device-specific challenges when choosing an appropriate accelerator or when customizing an algorithm for an accelerator. To help address this problem, in this article we thoroughly evaluate convolution, one of the most common operations in digital-signal processing, on multicores, graphics-processing units, and field-programmable gate arrays. Whereas many previous application studies evaluate a specific usage of an application, this article assists designers with design space exploration for numerous use cases by analyzing effects of different input sizes, different algorithms, and different devices, while also determining Pareto-optimal trade-offs between performance and energy.
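As a rough illustration of what determining Pareto-optimal trade-offs between performance and energy can involve, the sketch below filters a set of candidate configurations down to those that no other candidate beats on both execution time and energy. The `Config` type, the field names, and every number are hypothetical placeholders chosen for illustration; they are not measurements or methods from the article.

```python
from typing import NamedTuple


class Config(NamedTuple):
    """Hypothetical (device, algorithm, input size) configuration."""
    name: str        # placeholder label, not a real device
    time_s: float    # execution time in seconds (lower is better)
    energy_j: float  # energy consumed in joules (lower is better)


def dominates(a: Config, b: Config) -> bool:
    """Return True if a is no worse than b on both metrics and strictly better on one."""
    no_worse = a.time_s <= b.time_s and a.energy_j <= b.energy_j
    strictly_better = a.time_s < b.time_s or a.energy_j < b.energy_j
    return no_worse and strictly_better


def pareto_front(configs: list[Config]) -> list[Config]:
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Made-up values for illustration only.
candidates = [
    Config("config_a", 1.0, 50.0),
    Config("config_b", 2.0, 20.0),
    Config("config_c", 2.5, 30.0),  # slower and hungrier than config_b
    Config("config_d", 4.0, 10.0),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

The pairwise check is quadratic in the number of candidates, which is usually acceptable for a design space of a few dozen or a few hundred configurations.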
This paper introduces Trill, a new query processor for analytics. Trill fulfills a combination of three requirements for a query processor to serve the diverse big data analytics space: (1) Query Model: Trill is based on a tempo-relational model that enables it to handle streaming and relational queries with early results, across the latency spectrum from real-time to offline; (2) Fabric and Language Integration: Trill is architected as a high-level language library that supports rich data types and user libraries, and integrates well with existing distribution fabrics and applications; and (3) Performance: Trill's throughput is high across the latency spectrum. For streaming data, Trill's throughput is 2-4 orders of magnitude higher than comparable streaming engines. For offline relational queries, Trill's throughput is comparable to a major modern commercial columnar DBMS. Trill uses a streaming batched-columnar data representation with a new dynamic compilation-based system architecture that addresses all these requirements. In this paper, we describe Trill's new design and architecture, and report experimental results that demonstrate Trill's high performance across diverse analytics scenarios. We also describe how Trill's ability to support diverse analytics has resulted in its adoption across many usage scenarios at Microsoft.