Problem 2: Stalling for Time (12)


Consider the following MIPS code fragment executing on the 5-stage MIPS pipeline. Partial credit will be given for neat and clearly labelled work.

Note: the instruction numbers are there so you can use the label when you track execution of the fragment, rather than copying the entire instruction over and over and over again.

(1)	LD	R1, 24(R2)
(2)	LD	R2, 48(R2)
(3)	DADD	R3, R2, R1
(4)	DADDI	R4, R3, #5999
(5)	XORI	R3, R3, #-1

2.1 (4)

Determine the CPI for the code fragment, assuming that separate instruction and data memories are used; that register reads and writes are split across a single clock cycle; and that forwarding, bypassing, and load interlocks are in place.

Answer: Remember, all you have to do here is figure out how many clock cycles it takes for the workload to execute correctly and divide by the number of instructions. You are not required to show an execution profile.

Ins                        Fetch  Decode  Ex  Mem  WB  Comment
(1) LD    R1, 24(R2)       1      2       3   4    5
(2) LD    R2, 48(R2)       2      3       4   5    6
(3) DADD  R3, R2, R1       3      4-5     6   7    8   ID Stall for RAW
(4) DADDI R4, R3, #5999    5      6       7   8    9   R3 forwarded for RAW
(5) XORI  R3, R3, #-1      6      7       8   9    10  R3 forward, no stall for RAW

Since we have 5 instructions that take 10 clock cycles, the CPI is 2.
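The cycle count can be cross-checked with a short script. This is only a sketch under the assumptions of 2.1 (full forwarding, so the only stall is the one-cycle load-use interlock); the tuple encoding of the instructions and the names PROGRAM and total_cycles are made up for illustration and are not part of the quiz.

```python
# Each instruction: (opcode, destination register, source registers).
PROGRAM = [
    ("LD",    "R1", ["R2"]),
    ("LD",    "R2", ["R2"]),
    ("DADD",  "R3", ["R2", "R1"]),
    ("DADDI", "R4", ["R3"]),
    ("XORI",  "R3", ["R3"]),
]

def total_cycles(program):
    """Return the cycle in which the last instruction finishes WB.

    Assumption: with forwarding in place, an instruction stalls only when
    it reads a register loaded by the immediately preceding LD.
    """
    prev = None
    prev_ex = 0
    for instr in program:
        op, dest, srcs = instr
        # First instruction: IF in cycle 1, ID in 2, EX in 3.
        ex = prev_ex + 1 if prev else 3
        if prev and prev[0] == "LD" and prev[1] in srcs:
            ex += 1  # load-use interlock: one bubble before EX
        prev, prev_ex = instr, ex
    return prev_ex + 2  # MEM and WB follow the last EX

cycles = total_cycles(PROGRAM)
cpi = cycles / len(PROGRAM)
print(cycles, cpi)
```

The EX cycles it derives (3, 4, 6, 7, 8) match the Ex column of the table above.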

2.2 (4)

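Are there any additional assumptions necessary to assure correct execution? Explain your answer for full credit.

Answer: Same annoying trick question. The only one I know of is that R2 had better be 0 mod 8, or you will have an alignment error: LD loads a doubleword, and the offsets 24 and 48 are both multiples of 8, so the effective addresses are aligned exactly when R2 is.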
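As a small illustration of the alignment condition, the check below tests both effective addresses for a given starting value of R2. It is a sketch that assumes LD requires its address to be a multiple of 8 bytes; the helper names are hypothetical.

```python
DOUBLEWORD_BYTES = 8  # assumption: LD transfers a 64-bit doubleword

def is_aligned(base, offset, size=DOUBLEWORD_BYTES):
    """Return True if the effective address base + offset is a multiple of size."""
    return (base + offset) % size == 0

def fragment_loads_aligned(r2):
    """Check both loads; each one uses the original value of R2 as its base,
    because instruction (2) reads R2 before it overwrites it."""
    return all(is_aligned(r2, offset) for offset in (24, 48))

print(fragment_loads_aligned(0x1000))  # R2 is 0 mod 8
print(fragment_loads_aligned(0x1004))  # R2 is 4 mod 8
```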

2.3 (4)

Pick EXACTLY ONE (1) of the following design elements and briefly explain why it is used in pipelined architectures. Be sure to make it clear which one you are addressing.

Answer: Regardless of your choice, the goal of each method is to decrease the number of pipeline stalls. Pipeline stalls correspond to clock cycles in which some pipeline stages do no work, or, in other words, clock cycles in which some instructions make no progress through the pipeline.

a.)

Separate instruction and data memories.

Answer: Used to remove the structural hazard associated with resource contention for a single memory by the Fetch and Memory stages.

b.)

Register reads and writes split across a single clock cycle.

Answer: Used to remove the structural hazard associated with contention for the single register bank by the Decode and Write Back stages. Note that from a time standpoint, this is possible because the registers are actually part of the processor chip. It is NOT possible to treat memory reads and writes the same way.

c.)

Forwarding, bypassing, and load interlocks.

Answer: These are techniques used to minimize the effect of RAW hazards by permitting the results of a computation to be used as soon as possible, without forcing the consuming instruction to wait until the results have been written back to registers.
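To get a feel for what forwarding buys on this fragment, the sketch below counts cycles for the same five instructions with forwarding turned off. It assumes a simple model that is not stated in the quiz: a consumer stays in Decode until the cycle in which its producer is in Write Back (legal because register writes and reads are split across the cycle), and the instruction encoding and function name are hypothetical.

```python
# Each instruction: (opcode, destination register, source registers).
PROGRAM = [
    ("LD",    "R1", ["R2"]),
    ("LD",    "R2", ["R2"]),
    ("DADD",  "R3", ["R2", "R1"]),
    ("DADDI", "R4", ["R3"]),
    ("XORI",  "R3", ["R3"]),
]

def cycles_without_forwarding(program):
    """Return the WB cycle of the last instruction when results are only
    available through the register file (assumed model, see lead-in)."""
    wb_of = {}     # register -> WB cycle of its most recent producer
    prev_id = 1    # so that the first instruction decodes in cycle 2
    last_wb = 0
    for op, dest, srcs in program:
        id_cycle = prev_id + 1  # in-order: decode after the previous instruction
        for reg in srcs:
            # Wait in ID until the producer's WB cycle; registers never
            # written inside the fragment impose no wait.
            id_cycle = max(id_cycle, wb_of.get(reg, 0))
        last_wb = id_cycle + 3  # EX, MEM, WB follow the final ID cycle
        wb_of[dest] = last_wb
        prev_id = id_cycle
    return last_wb

cycles = cycles_without_forwarding(PROGRAM)
print(cycles, cycles / len(PROGRAM))
```

Under this model the fragment needs 13 cycles instead of the 10 found in 2.1, because every RAW dependence now waits for Write Back.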


MM Hugue 2002-10-12
