Note: the instruction numbers are there so you can refer to each instruction by its label when you track execution of the fragment, rather than copying the entire instruction each time.
(1) | LD | R1, 24(R2) |
(2) | LD | R2, 48(R2) |
(3) | DADD | R3, R2, R1 |
(4) | DADDI | R4, R3, #5999 |
(5) | XORI | R3, R3, #-1 |
Answer: Remember, all you have to do here is figure out how many clock cycles it takes for the workload to execute correctly and divide by the number of instructions. You are not required to show an execution profile.
Ins | Fetch | Decode | Ex | Mem | WB | Comment |
(1) LD R1, 24(R2) | 1 | 2 | 3 | 4 | 5 | |
(2) LD R2, 48(R2) | 2 | 3 | 4 | 5 | 6 | |
(3) DADD R3, R2, R1 | 3 | 4-5 | 6 | 7 | 8 | ID Stall for RAW |
(4) DADDI R4, R3, #5999 | 5 | 6 | 7 | 8 | 9 | R3 forwarded for RAW |
(5) XORI R3, R3, #-1 | 6 | 7 | 8 | 9 | 10 | R3 forward, no stall for RAW |
Since the 5 instructions take 10 clock cycles to complete, the CPI is 10 / 5 = 2.
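The cycle count can be cross-checked with a short script. This is only a minimal sketch under stated assumptions, not part of the expected answer: it assumes a classic 5-stage in-order pipeline with full forwarding, where a value produced by a load is usable one cycle later than a value produced by an ALU instruction. The names `PROGRAM`, `ex_cycles`, and `cpi` are made up for illustration.

```python
# Each instruction: (opcode, destination register, source registers).
PROGRAM = [
    ("LD",    "R1", ["R2"]),        # (1) LD    R1, 24(R2)
    ("LD",    "R2", ["R2"]),        # (2) LD    R2, 48(R2)
    ("DADD",  "R3", ["R2", "R1"]),  # (3) DADD  R3, R2, R1
    ("DADDI", "R4", ["R3"]),        # (4) DADDI R4, R3, #5999
    ("XORI",  "R3", ["R3"]),        # (5) XORI  R3, R3, #-1
]


def ex_cycles(program):
    """Return the clock cycle in which each instruction occupies Ex.

    Assumption: full forwarding. An ALU result can feed the Ex stage of
    the very next cycle; a load result is only available after Mem, so a
    dependent instruction must enter Ex at least two cycles after the
    load's Ex cycle (the one-cycle load-use stall).
    """
    last_writer = {}  # register -> (Ex cycle of producer, producer is a load)
    ex_prev = 2       # so that the first instruction reaches Ex in cycle 3
    cycles = []
    for op, dest, srcs in program:
        ex = ex_prev + 1  # in-order issue: one cycle behind the predecessor
        for reg in srcs:
            if reg in last_writer:
                producer_ex, is_load = last_writer[reg]
                ex = max(ex, producer_ex + (2 if is_load else 1))
        last_writer[dest] = (ex, op == "LD")  # record only after reading sources
        cycles.append(ex)
        ex_prev = ex
    return cycles


def cpi(program):
    """Total cycles (Ex of the last instruction plus Mem and WB) per instruction."""
    total_cycles = ex_cycles(program)[-1] + 2
    return total_cycles / len(program)


print(ex_cycles(PROGRAM))  # [3, 4, 6, 7, 8]
print(cpi(PROGRAM))        # 2.0
```

The Ex cycles match the Ex column of the table, and the last instruction writes back in cycle 8 + 2 = 10.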
Answer: Regardless of your choice, the goal of each method is to decrease the number of pipeline stalls. Pipeline stalls correspond to clock cycles in which some pipeline stages do no work or, in other words, clock cycles in which some instructions make no progress through the pipeline.
Answer: Used to remove the structural hazard that arises when the Fetch and Memory stages contend for a single memory.
Answer: Used to remove the structural hazard that arises when the Decode and Write Back stages contend for the single register bank. Note that from a timing standpoint, this is possible because the registers are actually part of the processor chip. It is NOT possible to treat memory reads and writes the same way.
Answer: These are techniques used to minimize the effect of RAW hazards by permitting the result of a computation to be used as soon as it is available, without forcing the consuming instruction to wait until the result has been written back to the registers.
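To make the idea concrete, the selection made in front of an ALU input can be sketched as follows. This is a simplified illustration, assuming the usual two forwarding paths (from the EX/MEM and MEM/WB pipeline registers) and that R0 is hardwired to zero; the function name `forward_source` and its string return values are hypothetical, and load-use stalls are assumed to be handled separately.

```python
def forward_source(src, ex_mem_dest, mem_wb_dest):
    """Pick where an Ex-stage operand should come from.

    src          -- source register of the instruction currently in Ex
    ex_mem_dest  -- destination of the instruction one ahead (now in Mem), or None
    mem_wb_dest  -- destination of the instruction two ahead (now in WB), or None
    """
    # The younger producer (EX/MEM) takes priority: it holds the newest value.
    if src != "R0" and src == ex_mem_dest:
        return "EX/MEM"
    if src != "R0" and src == mem_wb_dest:
        return "MEM/WB"
    return "REGFILE"  # no hazard: the value read in Decode is current


# Cycle 7: DADDI (4) is in Ex, DADD (3) is in Mem and will write R3.
print(forward_source("R3", ex_mem_dest="R3", mem_wb_dest=None))  # EX/MEM

# Cycle 8: XORI (5) is in Ex, DADDI (4) is in Mem (writes R4),
# DADD (3) is in WB (writes R3).
print(forward_source("R3", ex_mem_dest="R4", mem_wb_dest="R3"))  # MEM/WB
```

Both calls correspond to the "R3 forwarded" comments in the execution table for instructions (4) and (5).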