Introduction
This webpage is a guide to the MIPS R4000 pipeline. It gives students an opportunity to see a more realistic pipeline structure than the simple 5-stage DLX model. After reading through this tutorial, students should have a basic understanding of superpipelining, the longer load and branch delays inherent in deeper pipelines, the additional forwarding that deeper pipelines require, and how the R4000 floating point pipeline works. They should also be able to show how integer and floating point code executes on the R4000 pipeline.
The MIPS R4000 Pipeline
The MIPS R4000 implements the MIPS-3 instruction set, a 64-bit instruction set that is very similar to DLX. However, the R4000 uses an 8-stage integer pipeline as opposed to the 5-stage DLX pipeline. The extra stages are incorporated into the instruction fetch and memory access stages. In fact, data memory access is done over 3 cycles, allowing higher clock rates (100-200 MHz). The strategy of using a deeper pipeline to speed up memory access is often called superpipelining.
Instruction and data memory are fully pipelined, so a new instruction can start on every clock cycle.
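To make the "one new instruction per clock cycle" behavior concrete, here is a minimal Python sketch that prints an idealized pipeline diagram with no stalls. The stage names IF, EX, and WB are an assumption taken from the usual textbook description of the R4000 integer pipeline (only IS, RF, DF, DS, and TC are named in this tutorial), and the function names are made up for illustration.

```python
# Assumed stage order for the 8-stage R4000 integer pipeline
# (IF, EX, and WB are taken from the standard textbook description).
STAGES = ["IF", "IS", "RF", "EX", "DF", "DS", "TC", "WB"]


def pipeline_diagram(num_instructions):
    """Return one row per instruction; row[c] is the stage occupied in
    clock cycle c, or an empty string if the instruction is not in flight.
    Assumes an ideal pipeline: no stalls, one instruction issued per cycle."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(STAGES):
            row[i + s] = name  # instruction i enters stage s in cycle i + s
        rows.append(row)
    return rows


def format_diagram(rows):
    """Render the rows as fixed-width text, one instruction per line."""
    lines = []
    for i, row in enumerate(rows):
        cells = " ".join(f"{cell:>2}" for cell in row)
        lines.append(f"I{i}: {cells}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_diagram(pipeline_diagram(4)))
```

Each row is shifted one cycle to the right of the row above it, so once the pipeline is full, eight instructions are in flight at the same time.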
The Pipeline Stages
Notice the boundaries of this diagram. Instruction memory is shown as extending through the RF stage. This is because the instruction is available after the IS stage, but the tag check is done in the RF stage.
About the DF, DS, and TC stages: data is not written into the register until the TC stage completes and has checked whether the cache access was a hit. However, the data is available after the DS stage for forwarding (for example, after a load instruction). If the TC stage then detects a miss, the pipeline is backed up a cycle, when the correct data becomes available.
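The effect of forwarding from the end of DS can be estimated with a short calculation. The sketch below is only an illustration under stated assumptions: the stage order (including IF, EX, and WB) follows the usual textbook description, a dependent instruction is assumed to need its operand at the start of its EX stage, the cache access is assumed to hit, and no other stalls occur. The function name is hypothetical.

```python
# Assumed stage order for the 8-stage R4000 integer pipeline.
STAGES = ["IF", "IS", "RF", "EX", "DF", "DS", "TC", "WB"]


def load_use_stalls(distance, produce_stage="DS", consume_stage="EX"):
    """Estimate stall cycles for an instruction that uses a loaded value.

    distance: how many instructions after the load the consumer is issued
              (1 means the instruction immediately following the load).
    Assumes the load enters IF in cycle 0, the loaded value can be
    forwarded at the end of produce_stage, and the consumer needs it at
    the start of consume_stage.
    """
    produce_cycle = STAGES.index(produce_stage)       # load is in DS here
    earliest_consume_cycle = produce_cycle + 1        # first cycle value is usable
    unstalled_consume_cycle = distance + STAGES.index(consume_stage)
    return max(0, earliest_consume_cycle - unstalled_consume_cycle)


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(f"consumer {d} instruction(s) after the load: "
              f"{load_use_stalls(d)} stall cycle(s)")
```

Under these assumptions, an instruction immediately after the load stalls for 2 cycles, one placed two instructions later stalls for 1 cycle, and one placed three instructions later does not stall. This is the longer load delay that a deeper pipeline introduces.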