Self-test Answer

Loop: LW R3, 0(R1) ; load in an array entry
LW R6, 4(R1) ; load in the next array entry
LW R9, 8(R1) ; load in the third array entry
ADDI R4, R3, #50 ; add a constant
ADDI R7, R6, #50 ; add a constant
ADDI R10, R9, #50 ; add a constant
MULT R4, R4, R4 ; square the new value
MULT R7, R7, R7 ; square the new value
MULT R10, R10, R10 ; square the new value
SW R4, 600(R1) ; store the new value
SW R7, 604(R1) ; store the new value
SW R10, 608(R1) ; store the new value
ADDI R1, R1, #12 ; increment pointer
SUBI R5, R1, #300 ; check whether ended
BNEZ R5, Loop ; branch

The code above simply unrolls the loop three times. In a five-stage DLX-like pipeline with forwarding, this eliminates all of the stalls in the loop except those associated with the loop-control instructions: the two instructions before the branch, and the branch itself. You can remove the stalls associated with the two instructions before the branch by moving them higher in the loop and adjusting the offsets of the array accesses that now come after the pointer increment. Because R1 has already been advanced by 12 when the stores execute, each store offset drops by 12 (600 becomes 588, and so on), as follows:

Loop: LW R3, 0(R1) ; load in an array entry
LW R6, 4(R1) ; load in the next array entry
LW R9, 8(R1) ; load in the third array entry
ADDI R1, R1, #12 ; increment pointer
ADDI R4, R3, #50 ; add a constant
ADDI R7, R6, #50 ; add a constant
ADDI R10, R9, #50 ; add a constant
MULT R4, R4, R4 ; square the new value
MULT R7, R7, R7 ; square the new value
MULT R10, R10, R10 ; square the new value
SUBI R5, R1, #300 ; check whether ended
SW R4, 588(R1) ; store the new value
SW R7, 592(R1) ; store the new value
SW R10, 596(R1) ; store the new value
BNEZ R5, Loop ; branch

Thus, all stalls are eliminated except those caused by the control hazard: the pipeline cannot fetch the instruction after the branch until it knows where the branch is going.
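
One way to convince yourself that the rescheduling and the adjusted store offsets preserve the loop's behavior is to run both versions through a toy interpreter. The sketch below is only a functional check, not a pipeline model, so it says nothing about stalls. It rests on several assumptions that are not part of the answer above: R1 starts at byte address 0, the array holds 75 words, the three stores write R4, R7 and R10 respectively, and the names run, UNROLLED and SCHEDULED are made up for illustration.

```python
def run(program, memory):
    """Interpret a tiny DLX subset. BNEZ always jumps to instruction 0 (the Loop label)."""
    regs = {"R1": 0}   # assumption: the array pointer starts at byte address 0
    mem = dict(memory)  # copy so the caller's memory image is left untouched
    pc = 0
    while pc < len(program):
        op, a, b, c = program[pc]
        pc += 1
        if op == "LW":        # a <- mem[b + c]
            regs[a] = mem[b + regs[c]]
        elif op == "SW":      # mem[b + c] <- a
            mem[b + regs[c]] = regs[a]
        elif op == "ADDI":    # a <- b + immediate
            regs[a] = regs[b] + c
        elif op == "SUBI":    # a <- b - immediate
            regs[a] = regs[b] - c
        elif op == "MULT":    # a <- b * c
            regs[a] = regs[b] * regs[c]
        elif op == "BNEZ":    # branch back to Loop if a != 0
            if regs[a] != 0:
                pc = 0
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return mem


# First listing: unrolled three times, loop control left at the bottom.
UNROLLED = [
    ("LW", "R3", 0, "R1"), ("LW", "R6", 4, "R1"), ("LW", "R9", 8, "R1"),
    ("ADDI", "R4", "R3", 50), ("ADDI", "R7", "R6", 50), ("ADDI", "R10", "R9", 50),
    ("MULT", "R4", "R4", "R4"), ("MULT", "R7", "R7", "R7"), ("MULT", "R10", "R10", "R10"),
    ("SW", "R4", 600, "R1"), ("SW", "R7", 604, "R1"), ("SW", "R10", 608, "R1"),
    ("ADDI", "R1", "R1", 12),
    ("SUBI", "R5", "R1", 300),
    ("BNEZ", "R5", None, None),
]

# Second listing: pointer increment and end test hoisted, store offsets reduced by 12.
SCHEDULED = [
    ("LW", "R3", 0, "R1"), ("LW", "R6", 4, "R1"), ("LW", "R9", 8, "R1"),
    ("ADDI", "R1", "R1", 12),
    ("ADDI", "R4", "R3", 50), ("ADDI", "R7", "R6", 50), ("ADDI", "R10", "R9", 50),
    ("MULT", "R4", "R4", "R4"), ("MULT", "R7", "R7", "R7"), ("MULT", "R10", "R10", "R10"),
    ("SUBI", "R5", "R1", 300),
    ("SW", "R4", 588, "R1"), ("SW", "R7", 592, "R1"), ("SW", "R10", 596, "R1"),
    ("BNEZ", "R5", None, None),
]

# Hypothetical input: 75 words at byte addresses 0, 4, ..., 296 holding 0, 1, ..., 74.
initial = {addr: addr // 4 for addr in range(0, 300, 4)}

result_unrolled = run(UNROLLED, initial)
result_scheduled = run(SCHEDULED, initial)
print(result_unrolled == result_scheduled)
```

Both runs should leave identical memory images, with each output word at address 600 + 4i holding (i + 50) squared. Changing any one of the offsets 588, 592 or 596 back to its original value makes the comparison fail, which is the point of the offset adjustment.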

