Efficient Processing of Deep Neural Networks. Vivienne Sze



Binary Modification: Tools, Techniques, and Applications

Kim Hazelwood

2011

Quantum Computing for Computer Architects, Second Edition

Tzvetan S. Metodi, Arvin I. Faruque, and Frederic T. Chong

2011

High Performance Datacenter Networks: Architectures, Algorithms, and Opportunities

Dennis Abts and John Kim

2011

Processor Microarchitecture: An Implementation Perspective

Antonio González, Fernando Latorre, and Grigorios Magklis

2010

Transactional Memory, Second Edition

Tim Harris, James Larus, and Ravi Rajwar

2010

Computer Architecture Performance Evaluation Methods

Lieven Eeckhout

2010

Introduction to Reconfigurable Supercomputing

Marco Lanzagorta, Stephen Bique, and Robert Rosenberg

2009

On-Chip Networks

Natalie Enright Jerger and Li-Shiuan Peh

2009

The Memory System: You Can’t Avoid It, You Can’t Ignore It, You Can’t Fake It

Bruce Jacob

2009

Fault Tolerant Computer Architecture

Daniel J. Sorin

2009

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines

Luiz André Barroso and Urs Hölzle

2009

Computer Architecture Techniques for Power-Efficiency

Stefanos Kaxiras and Margaret Martonosi

2008

Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency

Kunle Olukotun, Lance Hammond, and James Laudon

2007

Transactional Memory

James R. Larus and Ravi Rajwar

2006

Quantum Computing for Computer Architects

Tzvetan S. Metodi and Frederic T. Chong

2006

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means—electronic, mechanical, photocopy, recording, or any other except for brief quotations in printed reviews, without the prior permission of the publisher.

Efficient Processing of Deep Neural Networks

Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, and Joel S. Emer

www.morganclaypool.com

ISBN: 9781681738314 paperback

ISBN: 9781681738321 ebook

ISBN: 9781681738338 hardcover

DOI 10.2200/S01004ED1V01Y202004CAC050

A Publication in the Morgan & Claypool Publishers series

SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE

Lecture #50

Series Editors: Natalie Enright Jerger, University of Toronto

Margaret Martonosi, Princeton University

Founding Editor Emeritus: Mark D. Hill, University of Wisconsin, Madison

Series ISSN

Print 1935-3235 Electronic 1935-3243

For book updates, sign up for mailing list at

http://mailman.mit.edu/mailman/listinfo/eems-news

Efficient Processing of Deep Neural Networks

Vivienne Sze, Yu-Hsin Chen, and Tien-Ju Yang

Massachusetts Institute of Technology

Joel S. Emer

Massachusetts Institute of Technology and Nvidia Research

SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE #50

ABSTRACT

This book provides a structured treatment of the key principles and techniques for enabling efficient processing of deep neural networks (DNNs). DNNs are currently widely used for many artificial intelligence (AI) applications, including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, that accuracy comes at the cost of high computational complexity. Therefore, techniques that enable efficient processing of DNNs to improve key metrics (such as energy efficiency, throughput, and latency) without sacrificing accuracy or increasing hardware costs are critical to enabling the wide deployment of DNNs in AI systems.

The book includes background on DNN processing; a description and taxonomy of hardware architectural approaches for designing DNN accelerators; key metrics for evaluating and comparing different designs; features of DNN processing that are amenable to hardware/algorithm co-design to improve energy efficiency and throughput; and opportunities for applying new technologies. Readers will find a structured introduction to the field as well as formalization and organization of key concepts from contemporary work that provide insights that may spark new ideas.

KEYWORDS

deep learning, neural network, deep neural networks (DNN), convolutional neural networks (CNN), artificial intelligence (AI), efficient processing, accelerator architecture, hardware/software co-design, hardware/algorithm co-design, domain-specific accelerators

Contents

Preface

Acknowledgments

PART I Understanding Deep Neural Networks

1 Introduction

1.1 Background on Deep Neural Networks

1.1.1 Artificial Intelligence and Deep Neural Networks

1.1.2 Neural Networks and Deep Neural Networks

1.2 Training versus Inference

1.3
