Reaching an ExaScale computer by the end of the decade, and enabling the continued performance scaling of smaller systems requires significant research breakthroughs in three key areas: power efficiency, programmability, and execution granularity. To build an ExaScale machine in a power budget of 20MW requires a 200-fold improvement in energy per instruction: from 2nJ to 10pJ. Only 4x is expected from improved technology. The remaining 50x must come from improvements in architecture and circuits. To program a machine of this scale requires more productive parallel programming environments -that make parallel programming as easy as sequential programming is today. Finally, problem size and memory size constraints prevent the continued use of weak scaling, requiring these machines to extract parallelism at very fine granularity -down to the level of a few instructions. This talk will discuss these challenges and current approaches to address them.
Reaching an ExaScale computer by the end of the decade, and enabling the continued performance scaling of smaller systems requires significant research breakthroughs in three key areas: power efficiency, programmability, and execution granularity. To build an ExaScale machine in a power budget of 20 MW requires a 200-fold improvement in energy per instruction: from 2 nJ to 10 pJ. Only 4x is expected from improved technology. The remaining 50x must come from improvements in architecture and circuits. To program a machine of this scale requires more productive parallel programming environments-that make parallel programming as easy as sequential programming is today. Finally, problem size and memory size constraints prevent the continued use of weak scaling, requiring these machines to extract parallelism at very fine granularity-down to the level of a few instructions. This talk discusses these challenges and current approaches to address them. About the SpeakerBill is the Willard R. and Inez Kerr Bell Professor of Engineering at Stanford University and Chief Scientist at NVIDIA Corporation. Bill and his group have developed system architecture, network architecture, signaling, routing, and synchronization technology that can be found in most large parallel computers today. Bill has worked with Cray Research and Intel to incorporate his innovations in commercial parallel computers, with Avici Systems to incorporate his technology into Internet routers, co-founded Velio Communications to commercialize high-speed signaling technology, and co-founded Stream Processors, Inc. to commercialize stream processor technology. He is a Member of the National Academy of Engineering, a Fellow of the IEEE, a Fellow of the ACM, and a Fellow of the American Academy of Arts and Sciences. He has received numerous honors including the ACM Eckert-Mauchly Award, the IEEE Seymour Cray Award, and the ACM Maurice Wilkes award. He currently leads projects on computer architecture, network architecture, and programming systems. He has published over 200 papers in these areas, holds over 75 issued patents, and is an author of the textbooks, Digital Systems Engineering and Principles and Practices of Interconnection Networks.
Average precision (AP) is a widely used metric to evaluate detection accuracy of image and video object detectors.In this paper, we analyze object detection from videos and point out that AP alone is not sufficient to capture the temporal nature of video object detection. To tackle this problem, we propose a comprehensive metric, average delay (AD), to measure and compare detection delay. To facilitate delay evaluation, we carefully select a subset of ImageNet VID, which we name as ImageNet VIDT with an emphasis on complex trajectories. By extensively evaluating a wide range of detectors on VIDT, we show that most methods drastically increase the detection delay but still preserve AP well. In other words, AP is not sensitive enough to reflect the temporal characteristics of a video object detector. Our results suggest that video object detection methods should be additionally evaluated with a delay metric, particularly for latency-critical applications such as autonomous vehicle perception.
No abstract
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.