CMSC411 - Chapter 1 Notes

Appendix A

A.1 Introduction

Pipelining is an implementation whereby multiple instructions are overlapped in execution. (Like a car assembly line.)
In an ideal pipeline, i.e. if the stages are perfectly balanced, then:

The Basics of a RISC Instruction Set

Load - move data from memory to a register. (Get it ready to use by putting it somewhere we can access it quickly.)
Store - move data from a register to memory. (We're done using it, now put it away someplace slower.)
MIPS64 has 64 bit instructions
32 registers, register 0 always has a data value of 0.
Immediate instructions end in the suffix 'I' and unsigned end in 'U'.
ALU instructions - they take two registers or a register and a sign extended 16 bit immediate, and store the result in a third register. Ex: DADD or DSUB.
Load and store instructions - takes a register source (called a base register), and an immediate field (16 bits in MIPS) called the offset, as operands. The sum, called the effective address, is used as a memory address. If it's a load, the second register is the destination. If it's a store, the second register operand is the source of the data that is stored in memory.
Branches and jumps - branches are conditional, jumps are unconditional. The branch destination is obtained by adding a sign-extending offset (16 bits in MIPS) to the current program counter.

A Simple Implementation of a RISC Intruction Set

Instruction fetch cycle (IF) - send the program counter (PC) to memory and fetch the current instruction from memory. Update the PC to the next sequential PC by adding 4 (since each instruction is 4 bytes) to the PC.
Instruction decode / register fetcn cycle (ID) - decode the instruction and read the registers corresponding to register source specifies from the register file. Check for branching. Sign-extend the offset field of the instruction. Compute the possible branch target address by adding the sign-extended offset to the incrementd PC. Deocding is done in parallel with reading registers, which is possible because the register specifiers are at a fixed location in RISC architecture.
Execution / effective address cycle (EX) - the ALU operates on the operands prepared in the prior cycle, performing one of three functions, depending on the instruction type. The three instructions are memory reference, register-register ALU instruction, register-immediate ALU instruction.
Memory Access (MEM) - If the instruction is a load, memory does a read using the effective address computed in the previous cycle. If it is a store, then the memory writes the data from the scond register read from the register file while using the address.
Write-back cycle (WB) - register to register ALU instruction or load instruction. Write the result into the register file, whether it comes from the memory system (for a load) or from the ALU (for an ALU instruction).

The Classic Five Stage Pipeline of a RISC Processor

Can't subtract and compute an effective address at the same time.
Splitting the IM and DM (Harvard style) eliminates a conflict for a single memory that would arise between Instruction Fetch and Data memory access.
Register file is used in two stages: one for reading in the ID stage and one for writing the WB stage. Those uses are distinct. Write happens in the first half of a clock cycle and read in the second half.
The Program Counter is incremented by 4 bytes every IF. But branches don't change the PC until the ID stage, and this is a problem.

Although is is critical to ensure that instructions in the pipeline do not attempt to use the hardware resources at the same time, we must also ensure that instructions in different stages of the do not interfere with one another. This seperation is done by introducing pipelin registers between successive stages of the pipeline, so that at the end of a clock cycle all the results from a given stage are stored into a register that is used at the input to the next stage on the next clock cycle.
Pipelining increases CPU throughput, but it doesn't reduce the amount of time to perform an individual instruction. In fact it slightly increases it due to the control factors of the pipeline.

A.2 The Major Hurdle of Pipelining - Pipeline Hazards

Hazards

Structural hazards - arise from resource conflicts when the hardware cannot support all possible combinations of instructions simultaneously in overlapped execution.
Data hazards - when an instruction depends on the results of a previous instruction in a way that is exposed by the overlapping if instructions in the pipeline.
Control hazards - arise from the pipelining of branches and other instructions that change the program counter.

Hazards in pipelines can make it nescessary to stall the pipeline

When an instruction is stalled, all instructions issued later than the stalled instruction, and hence not as far along in the pipeline, are also stalled.
Instructions issued earlier than the stalled instructions, and hence further along in the pipeline, must continue, since otherwise the hazard will never clear.
This is like in baseball where a runner can't take the next base until the runner there advances.

Performancer of Pipelines with Stalls

If we ignore the cycle time overhead of pipelining and assume the stages are perfectly balanced, then the cycle time of two processors can be equal:

Pipeline depth

Structural Hazards

Occur when there is a resource conflict. The pipeline will stall one of the instructions until the required unit is available.
The most common instances of structural hazards arise when some functional unit is not fully pipelined (like floating point divide). Another common way is when some resource has not been duplicated enough to allow all combinations of instructions in the pipeline to execute.
Stalls are also called "pipeline bubbles" because they take up space in the pipeline but carry no useful work.

Alex Baglione @ UMCP
CMSC 411 Summer 2002
Hennessey and Patterson, Computer Architecture: A Quantitative Approach; Third Edition
Appendix A Notes - for ~~educational~~ agricultural use only

Web Accessibility