EE230 Computer Architecture
Table of contents
- EE230A Digital Design and Computer Architecture RISC-V Edition
- Chapter 1 From Zero to One
- Chapter 2 Combinational Logic Design
- Chapter 3 Sequential Logic Design
- Chapter 4 Hardware Description Languages
- Chapter 5 Digital Building Blocks
- Chapter 6 Architecture
- Chapter 7 Microarchitecture
- Chapter 8 Memory Systems
- Chapter 9 Embedded I/O Systems
- Appendix A Digital System Implementation
- Appendix B RISC-V Instruction Set Summary
- Appendix C C Programming
- EE230B Computer Organization and Design RISC-V Edition: The Hardware/Software Interface
- Chapter 1 Computer Abstractions and Technology
- Chapter 2 Instructions: Language of the Computer
- Chapter 3 Arithmetic for Computers
- Chapter 4 The Processor
- Chapter 5 Large and Fast: Exploiting Memory Hierarchy
- Chapter 6 Parallel Processors from Client to Cloud
- A The Basics of Logic Design
- B Graphics and Computing GPUs
- C Mapping Control to Hardware
- D Survey of Instruction Set Architectures
- EE230C Computer Architecture: A Quantitative Approach
- Chapter 1 Fundamentals of Quantitative Design and Analysis
- Chapter 2 Memory Hierarchy Design
- Chapter 3 Instruction-Level Parallelism and Its Exploitation
- Chapter 4 Data-Level Parallelism in Vector, SIMD, and GPU Architectures
- Chapter 5 Thread-Level Parallelism
- Chapter 6 Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism
- Chapter 7 Domain-Specific Architectures
- Appendix A Instruction Set Principles
- Appendix B Review of Memory Hierarchy
- Appendix C Pipelining: Basic and Intermediate Concepts
EE230A Digital Design and Computer Architecture RISC-V Edition
By Harris & Harris, 2022 Edition Index
Chapter 1 From Zero to One
1.2 The Art of Managing Complexity
- 1.4.1 Decimal Numbers
- 1.4.2 Binary Numbers
- 1.4.3 Hexadecimal Numbers
- 1.4.4 Bytes, Nibbles, and All That Jazz
- 1.4.5 Binary Addition
- 1.4.6 Signed Binary Numbers
- 1.5.1 NOT Gate
- 1.5.2 Buffer
- 1.5.3 AND Gate
- 1.5.4 OR Gate
- 1.5.5 Other Two-Input Gates
- 1.5.6 Multiple-Input Gates
1.6 Beneath the Digital Abstraction
- 1.6.1 Supply Voltage
- 1.6.2 Logic Levels
- 1.6.3 Noise Margins
- 1.6.4 DC Transfer Characteristics
- 1.6.5 The Static Discipline
- 1.7.1 Semiconductors
- 1.7.2 Diodes
- 1.7.3 Capacitors
- 1.7.4 nMOS and pMOS Transistors
- 1.7.5 CMOS NOT Gate
- 1.7.6 Other CMOS Logic Gates
- 1.7.7 Transmission Gates
- 1.7.8 Pseudo-nMOS Logic
Chapter 2 Combinational Logic Design
- 2.3.1 Axioms
- 2.3.2 Theorems of One Variable
- 2.3.3 Theorems of Several Variables
- 2.3.4 The Truth Behind It All
- 2.3.5 Simplifying Equations
2.5 Multilevel Combinational Logic
2.8 Combinational Building Blocks
Chapter 3 Sequential Logic Design
- 3.3.1 Some Problematic Circuits
- 3.3.2 Synchronous Sequential Circuits
- 3.3.3 Synchronous and Asynchronous Circuits
- 3.4.1 FSM Design Example
- 3.4.2 State Encodings
- 3.4.3 Moore and Mealy Machines
- 3.4.4 Factoring State Machines
- 3.4.5 Deriving an FSM from a Schematic
- 3.4.6 FSM Review
3.5 Timing of Sequential Logic
- 3.5.1 The Dynamic Discipline
- 3.5.2 System Timing
- 3.5.3 Clock Skew
- 3.5.4 Metastability
- 3.5.5 Synchronizers
- 3.5.6 Derivation of Resolution Time
Chapter 4 Hardware Description Languages
- 4.2.3 Reduction Operators
- 4.2.4 Conditional Assignment
- 4.2.5 Internal Variables
- 4.2.6 Precedence
- 4.2.8 Z’s and X’s
- 4.2.9 Bit Swizzling
- 4.2.10 Delays
Chapter 5 Digital Building Blocks
5.4 Sequential Building Blocks
- 5.5.2 Dynamic Random Access Memory (DRAM)
- 5.5.5 Register Files
- 5.5.6 Read Only Memory (ROM)
- 5.5.7 Logic Using Memory Arrays
- 5.5.8 Memory HDL
- 5.6.1 Programmable Logic Array (PLA)
- 5.6.2 Field Programmable Gate Array (FPGA)
- 5.6.3 Array Implementations
Chapter 6 Architecture
- 6.3.1 Program Flow
- 6.3.2 Logical, Shift, and Multiply Instructions
- 6.3.4 Conditional Statements
- 6.3.5 Getting Loopy
- 6.3.6 Arrays
- 6.3.7 Function Calls
- 6.3.8 Pseudoinstructions
- 6.4.1 R-Type Instructions
- 6.4.2 I-Type Instructions
- 6.4.3 S/B-Type Instructions
- 6.4.4 U/J-Type Instructions
- 6.4.5 Immediate Encodings
- 6.4.6 Addressing Modes
- 6.4.7 Interpreting Machine Language Code
- 6.4.8 The Power of the Stored Program
6.5 Lights, Camera, Action: Compiling, Assembling, and Loading
Chapter 7 Microarchitecture
- 7.3.1 Sample Program
- 7.3.2 Single-Cycle Datapath
- 7.3.3 Single-Cycle Control
- 7.3.4 More Instructions
- 7.3.5 Performance Analysis
7.7 Advanced Microarchitecture
- 7.7.1 Deep Pipelines
- 7.7.2 Micro-Operations
- 7.7.3 Branch Prediction
- 7.7.4 Superscalar Processors
- 7.7.5 Out-of-Order Processor
- 7.7.6 Register Renaming
- 7.7.7 Multithreading
- 7.7.8 Multiprocessors
7.8 Real-World Perspective: Evolution of RISC-V Microarchitecture
Chapter 8 Memory Systems
8.2 Memory System Performance Analysis
- 8.3.1 What Data is Held in the Cache?
- 8.3.2 How is Data Found?
- 8.3.3 What Data is Replaced?
- 8.3.4 Advanced Cache Design
- 8.4.1 Address Translation
- 8.4.2 The Page Table
- 8.4.3 The Translation Lookaside Buffer
- 8.4.4 Memory Protection
- 8.4.5 Replacement Policies
- 8.4.6 Multilevel Page Tables
Chapter 9 Embedded I/O Systems
- 9.3.1 RED-V Board
- 9.3.2 FE310-G002 System-on-Chip
- 9.3.3 General-Purpose Digital I/O
- 9.3.4 Device Drivers
- 9.3.5 Serial I/O
- 9.3.6 Timers
- 9.3.7 Analog I/O
- 9.3.8 Interrupts
9.4 Other Microcontroller Peripherals
Appendix A Digital System Implementation
A.4 Application-Specific Integrated Circuits
A.7 Switches and Light-Emitting Diodes
- A.9.1 Matched Termination
- A.9.2 Open Termination
- A.9.3 Short Termination
- A.9.4 Mismatched Termination
- A.9.5 When to Use Transmission Line Models
- A.9.6 Proper Transmission Line Terminations
- A.9.7 Derivation of Z0
- A.9.8 Derivation of the Reflection Coefficient
- A.9.9 Putting It All Together
Appendix B RISC-V Instruction Set Summary
Appendix C C Programming
- C.8.1 Pointers
- C.8.2 Arrays
- C.8.3 Characters
- C.8.4 Strings
- C.8.5 Structures
- C.8.6 typedef
- C.8.7 Dynamic Memory Allocation
- C.8.8 Linked Lists
C.10 Compiler and Command Line Options
EE230B Computer Organization and Design RISC-V Edition: The Hardware/Software Interface
By David A. Patterson and John L. Hennessy, 2021 Edition Index
Chapter 1 Computer Abstractions and Technology
1.2 Seven Great Ideas in Computer Architecture
1.5 Technologies for Building Processors and Memory
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors
1.9 Real Stuff: Benchmarking the Intel Core i7
1.10 Going Faster: Matrix Multiply in Python
1.13 Historical Perspective and Further Reading
Chapter 2 Instructions: Language of the Computer
2.2 Operations of the Computer Hardware
2.3 Operands of the Computer Hardware
2.4 Signed and Unsigned Numbers
2.5 Representing Instructions in the Computer
2.7 Instructions for Making Decisions
2.8 Supporting Procedures in Computer Hardware
2.10 RISC-V Addressing for Wide Immediates and Addresses
2.11 Parallelism and Instructions: Synchronization
2.12 Translating and Starting a Program
2.13 A C Sort Example to Put It All Together
2.15 Advanced Material: Compiling C and Interpreting Java
2.16 Real Stuff: MIPS Instructions
2.17 Real Stuff: ARMv7 (32-bit) Instructions
2.18 Real Stuff: ARMv8 (64-bit) Instructions
2.19 Real Stuff: x86 Instructions
2.20 Real Stuff: The Rest of the RISC-V Instruction Set
2.21 Going Faster: Matrix Multiply in C
2.24 Historical Perspective and Further Reading
Chapter 3 Arithmetic for Computers
3.6 Parallelism and Computer Arithmetic: Subword Parallelism
3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions in x86
3.8 Going Faster: Subword Parallelism and Matrix Multiply
3.11 Historical Perspective and Further Reading
Chapter 4 The Processor
4.4 A Simple Implementation Scheme
4.7 Pipelined Datapath and Control
4.8 Data Hazards: Forwarding versus Stalling
4.11 Parallelism via Instructions
4.12 Putting It All Together: The Intel Core i7 6700 and ARM Cortex-A53
4.13 Going Faster: Instruction-Level Parallelism and Matrix Multiply
4.17 Historical Perspective and Further Reading
Chapter 5 Large and Fast: Exploiting Memory Hierarchy
5.4 Measuring and Improving Cache Performance
5.5 Dependable Memory Hierarchy
5.8 A Common Framework for Memory Hierarchy
5.9 Using a Finite-State Machine to Control a Simple Cache
5.10 Parallelism and Memory Hierarchy: Cache Coherence
5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks
5.12 Advanced Material: Implementing Cache Controllers
5.13 Real Stuff: The ARM Cortex-A8 and Intel Core i7 Memory Hierarchies
5.14 Real Stuff: The Rest of the RISC-V System and Special Instructions
5.15 Going Faster: Cache Blocking and Matrix Multiply
5.18 Historical Perspective and Further Reading
Chapter 6 Parallel Processors from Client to Cloud
6.2 The Difficulty of Creating Parallel Processing Programs
6.3 SISD, MIMD, SIMD, SPMD, and Vector
6.5 Multicore and Other Shared Memory Multiprocessors
6.6 Introduction to Graphics Processing Units
6.7 Domain-Specific Architectures
6.8 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors
6.9 Introduction to Multiprocessor Network Topologies
6.10 Communicating to the Outside World: Cluster Networking
6.11 Multiprocessor Benchmarks and Performance Models
6.12 Real Stuff: Benchmarking the Google TPUv3 Supercomputer and an NVIDIA Volta GPU Cluster
6.13 Going Faster: Multiple Processors and Matrix Multiply
6.16 Historical Perspective and Further Reading
A The Basics of Logic Design
A.2 Gates, Truth Tables, and Logic Equations
A.4 Using a Hardware Description Language
A.5 Constructing a Basic Arithmetic Logic Unit
A.6 Faster Addition: Carry Lookahead
A.8 Memory Elements: Flip-Flops, Latches, and Registers
A.9 Memory Elements: SRAMs and DRAMs
A.12 Field Programmable Devices
B Graphics and Computing GPUs
B.4 Multithreaded Multiprocessor Architecture
B.7 Real Stuff: The NVIDIA GeForce 8800
B.8 Real Stuff: Mapping Applications to GPUs
B.11 Historical Perspective and Further Reading
C Mapping Control to Hardware
C.2 Implementing Combinational Control Units
C.3 Implementing Finite-State Machine Control
C.4 Implementing the Next-State Function with a Sequencer
C.5 Translating a Microprogram to Hardware
D Survey of Instruction Set Architectures
D.2 A Survey of RISC Architectures for Desktop, Server, and Embedded Computers
D.5 The IBM 360/370 Architecture for Mainframe Computers
D.6 Historical Perspective and References
EE230C Computer Architecture: A Quantitative Approach
By John L. Hennessy and David A. Patterson, 2019 Edition Index
Chapter 1 Fundamentals of Quantitative Design and Analysis
1.3 Defining Computer Architecture
1.5 Trends in Power and Energy in Integrated Circuits
1.8 Measuring, Reporting, and Summarizing Performance
1.9 Quantitative Principles of Computer Design
1.10 Putting It All Together: Performance, Price, and Power
1.13 Historical Perspectives and References. Case Studies and Exercises by Diana Franklin
Chapter 2 Memory Hierarchy Design
2.2 Memory Technology and Optimizations
2.3 Ten Advanced Optimizations of Cache Performance
2.4 Virtual Memory and Virtual Machines
2.5 Cross-Cutting Issues: The Design of Memory Hierarchies
2.6 Putting It All Together: Memory Hierarchies in the ARM Cortex-A53 and Intel Core i7 6700
2.8 Concluding Remarks: Looking Ahead
Chapter 3 Instruction-Level Parallelism and Its Exploitation
3.1 Instruction-Level Parallelism: Concepts and Challenges
3.2 Basic Compiler Techniques for Exposing ILP
3.3 Reducing Branch Costs With Advanced Branch Prediction
3.4 Overcoming Data Hazards With Dynamic Scheduling
3.5 Dynamic Scheduling: Examples and the Algorithm
3.6 Hardware-Based Speculation
3.7 Exploiting ILP Using Multiple Issue and Static Scheduling
3.8 Exploiting ILP Using Dynamic Scheduling, Multiple Issue, and Speculation
3.9 Advanced Techniques for Instruction Delivery and Speculation
3.11 Multithreading: Exploiting Thread-Level Parallelism to Improve Uniprocessor Throughput
3.12 Putting It All Together: The Intel Core i7 6700 and ARM Cortex-A53
3.14 Concluding Remarks: What’s Ahead?
3.15 Historical Perspective and References
Chapter 4 Data-Level Parallelism in Vector, SIMD, and GPU Architectures
4.3 SIMD Instruction Set Extensions for Multimedia
4.5 Detecting and Enhancing Loop-Level Parallelism
4.7 Putting It All Together: Embedded Versus Server GPUs and Tesla Versus Core i7
4.10 Historical Perspective and References. Case Study and Exercises by Jason D. Bakos
Chapter 5 Thread-Level Parallelism
5.2 Centralized Shared-Memory Architectures
5.3 Performance of Symmetric Shared-Memory Multiprocessors
5.4 Distributed Shared-Memory and Directory-Based Coherence
5.5 Synchronization: The Basics
5.6 Models of Memory Consistency: An Introduction
5.8 Putting It All Together: Multicore Processors and Their Performance
5.10 The Future of Multicore Scaling
5.12 Historical Perspectives and References. Case Studies and Exercises by Amr Zaky and David A. Wood
Chapter 6 Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism
6.2 Programming Models and Workloads for Warehouse-Scale Computers
6.3 Computer Architecture of Warehouse-Scale Computers
6.4 The Efficiency and Cost of Warehouse-Scale Computers
6.5 Cloud Computing: The Return of Utility Computing
6.7 Putting It All Together: A Google Warehouse-Scale Computer
6.10 Historical Perspectives and References. Case Studies and Exercises by Parthasarathy Ranganathan
Chapter 7 Domain-Specific Architectures
7.3 Example Domain: Deep Neural Networks
7.4 Google’s Tensor Processing Unit, an Inference Data Center Accelerator
7.5 Microsoft Catapult, a Flexible Data Center Accelerator
7.6 Intel Crest, a Data Center Accelerator for Training
7.7 Pixel Visual Core, a Personal Mobile Device Image Processing Unit
7.9 Putting It All Together: CPUs Versus GPUs Versus DNN Accelerators
7.12 Historical Perspectives and References. Case Studies and Exercises by Cliff Young
Appendix A Instruction Set Principles
A.2 Classifying Instruction Set Architectures
A.5 Operations in the Instruction Set
A.6 Instructions for Control Flow
A.7 Encoding an Instruction Set
A.8 Cross-Cutting Issues: The Role of Compilers
A.9 Putting It All Together: The RISC-V Architecture
A.12 Historical Perspective and References. Exercises by Gregory D. Peterson
Appendix B Review of Memory Hierarchy
B.3 Six Basic Cache Optimizations
B.5 Protection and Examples of Virtual Memory
B.8 Historical Perspective and References. Exercises by Amr Zaky
Appendix C Pipelining: Basic and Intermediate Concepts
C.2 The Major Hurdle of Pipelining—Pipeline Hazards
C.3 How Is Pipelining Implemented?
C.4 What Makes Pipelining Hard to Implement?
C.5 Extending the RISC-V Integer Pipeline to Handle Multicycle Operations
C.6 Putting It All Together: The MIPS R4000 Pipeline
C.10 Historical Perspective and References. Updated Exercises by Diana Franklin