About the Book
Written by Patterson and Hennessy, recipients of the 2017 ACM A.M. Turing Award, this classic computer architecture textbook emphasizes hardware/software co-design and its impact on performance. This edition uses the MIPS architecture. Building on the fundamentals of parallelism, pipelining, the memory hierarchy, and abstraction, it adds a discussion of domain-specific architectures (DSAs); addresses new trends and issues such as security attacks, open instruction sets, open-source hardware and software, and the resurgence of virtualization; and updates all examples and exercises, most notably adding a Google TPU example.
Table of Contents
Preface

CHAPTERS

1 Computer Abstractions and Technology
1.1 Introduction
1.2 Seven Great Ideas in Computer Architecture
1.3 Below Your Program
1.4 Under the Covers
1.5 Technologies for Building Processors and Memory
1.6 Performance
1.7 The Power Wall
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors
1.9 Real Stuff: Benchmarking the Intel Core i7
1.10 Going Faster: Matrix Multiply in Python
1.11 Fallacies and Pitfalls
1.12 Concluding Remarks
1.13 Historical Perspective and Further Reading
1.14 Self-Study
1.15 Exercises

2 Instructions: Language of the Computer
2.1 Introduction
2.2 Operations of the Computer Hardware
2.3 Operands of the Computer Hardware
2.4 Signed and Unsigned Numbers
2.5 Representing Instructions in the Computer
2.6 Logical Operations
2.7 Instructions for Making Decisions
2.8 Supporting Procedures in Computer Hardware
2.9 Communicating with People
2.10 MIPS Addressing for 32-Bit Immediates and Addresses
2.11 Parallelism and Instructions: Synchronization
2.12 Translating and Starting a Program
2.13 A C Sort Example to Put It All Together
2.14 Arrays versus Pointers
2.15 Advanced Material: Compiling C and Interpreting Java
2.16 Real Stuff: ARMv7 (32-bit) Instructions
2.17 Real Stuff: ARMv8 (64-bit) Instructions
2.18 Real Stuff: RISC-V Instructions
2.19 Real Stuff: x86 Instructions
2.20 Going Faster: Matrix Multiply in C
2.21 Fallacies and Pitfalls
2.22 Concluding Remarks
2.23 Historical Perspective and Further Reading
2.24 Self-Study
2.25 Exercises

3 Arithmetic for Computers
3.1 Introduction
3.2 Addition and Subtraction
3.3 Multiplication
3.4 Division
3.5 Floating Point
3.6 Parallelism and Computer Arithmetic: Subword Parallelism
3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions in x86
3.8 Going Faster: Subword Parallelism and Matrix Multiply
3.9 Fallacies and Pitfalls
3.10 Concluding Remarks
3.11 Historical Perspective and Further Reading
3.12 Self-Study
3.13 Exercises

4 The Processor
4.1 Introduction
4.2 Logic Design Conventions
4.3 Building a Datapath
4.4 A Simple Implementation Scheme
4.5 A Multicycle Implementation
4.6 An Overview of Pipelining
4.7 Pipelined Datapath and Control
4.8 Data Hazards: Forwarding versus Stalling
4.9 Control Hazards
4.10 Exceptions
4.11 Parallelism via Instructions
4.12 Putting It All Together: The Intel Core i7 6700 and ARM Cortex
4.13 Going Faster: Instruction-Level Parallelism and Matrix Multiply
4.14 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations
4.15 Fallacies and Pitfalls
4.16 Concluding Remarks
4.17 Historical Perspective and Further Reading
4.18 Self-Study
4.19 Exercises

5 Large and Fast: Exploiting Memory Hierarchy
5.1 Introduction
5.2 Memory Technologies
5.3 The Basics of Caches
5.4 Measuring and Improving Cache Performance
5.5 Dependable Memory Hierarchy
5.6 Virtual Machines
5.7 Virtual Memory
5.8 A Common Framework for Memory Hierarchy
5.9 Using a Finite-State Machine to Control a Simple Cache
5.10 Parallelism and Memory Hierarchies: Cache Coherence
5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks
5.12 Advanced Material: Implementing Cache Controllers
5.13 Real Stuff: The ARM Cortex-A8 and Intel Core i7 Memory Hierarchies
5.14 Going Faster: Cache Blocking and Matrix Multiply
5.15 Fallacies and Pitfalls
5.16 Concluding Remarks
5.17 Historical Perspective and Further Reading
5.18 Self-Study
5.19 Exercises

6 Parallel Processors from Client to Cloud
6.1 Introduction
6.2 The Difficulty of Creating Parallel Processing Programs
6.3 SISD, MIMD, SIMD, SPMD, and Vector
6.4 Hardware Multithreading
6.5 Multicore and Other Shared Memory Multiprocessors
6.6 Introduction to Graphics Processing Units
6.7 Domain-Specific Architectures
6.8 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors
6.9 Introduction to Multiprocessor Network Topologies
6.10 Communicating to the Outside World: Cluster Networking
6.11 Multiprocessor Benchmarks and Performance Models
6.12 Real Stuff: Benchmarking the Google TPUv3 Supercomputer and an NVIDIA Volta GPU Cluster
6.13 Going Faster: Multiple Processors and Matrix Multiply
6.14 Fallacies and Pitfalls
6.15 Concluding Remarks
6.16 Historical Perspective and Further Reading
6.17 Self-Study
6.18 Exercises

APPENDICES

A Assemblers, Linkers, and the SPIM Simulator
A.1 Introduction
A.2 Assemblers
A.3 Linkers
A.4 Loading
A.5 Memory Usage
A.6 Procedure Call Convention
A.7 Exceptions and Interrupts
A.8 Input and Output
A.9 SPIM
A.10 MIPS R2000 Assembly Language
A.11 Concluding Remarks
A.12 Exercises

B The Basics of Logic Design
B.1 Introduction
B.2 Gates, Truth Tables, and Logic Equations
B.3 Combinational Logic
B.4 Using a Hardware Description Language
B.5 Constructing a Basic Arithmetic Logic Unit
B.6 Faster Addition: Carry Lookahead
B.7 Clocks
B.8 Memory Elements: Flip-Flops, Latches, and Registers
B.9 Memory Elements: SRAMs and DRAMs
B.10 Finite-State Machines
B.11 Timing Methodologies
B.12 Field Programmable Devices
B.13 Concluding Remarks
B.14 Exercises

Index

ONLINE CONTENT

C Graphics and Computing GPUs
C.1 Introduction
C.2 GPU System Architectures
C.3 Programming GPUs
C.4 Multithreaded Multiprocessor Architecture
C.5 Parallel Memory System
C.6 Floating Point Arithmetic
C.7 Real Stuff: The NVIDIA GeForce 8800
C.8 Real Stuff: Mapping Applications to GPUs
C.9 Fallacies and Pitfalls
C.10 Concluding Remarks
C.11 Historical Perspective and Further Reading

D Mapping Control to Hardware
D.1 Introduction
D.2 Implementing Combinational Control Units
D.3 Implementing Finite-State Machine Control
D.4 Implementing the Next-State Function with a Sequencer
D.5 Translating a Microprogram to Hardware
D.6 Concluding Remarks
D.7 Exercises

E Survey of Instruction Set Architectures
E.1 Introduction
E.2 A Survey of RISC Architectures for Desktop, Server, and Embedded Computers
E.3 The Intel 80x86
E.4 The VAX Architecture
E.5 The IBM 360/370 Architecture for Mainframe Computers
E.6 Historical Perspective and References

Glossary
Further Reading