Please express the following register expressions as MIPS code fragments, using as few instructions as possible. The registers in each expression below have been loaded with the correct data, and no other integer registers are in use.
Answer: What's the biggest 16-bit unsigned binary number? Here's a hint: it's all ones, or 65535 in decimal.
DADDUI | R1, R0, #65535; | 164000 won't fit in 16 bits! |
DADDU | R1, R1, R1 | R1 has 131070 |
DADDUI | R1, R1, #32930; | R1 now contains 164000 (131070 + 32930) |
Note that the DADDU and DADDUI are unsigned additions. The DADDUI is used to force the string of 16 ones to be zero extended, because it's an unsigned value. Without the U, the string of 16 ones would be treated as the 2's complement value -1, and sign extended to a big, 64-bit negative one, which is NOT what we want.
The reason that DADDU is used has to do with exceptions... things that make pipelining hard to implement. In some machines, the DADDU will not cause condition codes to be set, whereas the DADD will. However, having the middle instruction as DADDU or DADD doesn't change the answer. But if the DADDUIs aren't used, the largest immediate value has a zero in the sign bit and ones everywhere else, giving 32767 as the decimal value.
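The two ways of reading a 16-bit immediate can be checked with a few lines of Python. This is only a sketch of the behavior as described in this answer, not a model of any particular assembler or simulator; the helper names `zero_extend16` and `sign_extend16` are made up for illustration.

```python
MASK64 = (1 << 64) - 1  # a 64-bit register holds values modulo 2**64


def zero_extend16(imm):
    """Read the low 16 bits of imm as an unsigned value (0..65535)."""
    return imm & 0xFFFF


def sign_extend16(imm):
    """Read the low 16 bits of imm as a two's complement value (-32768..32767)."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


all_ones = 0xFFFF                                    # sixteen ones
unsigned_view = zero_extend16(all_ones)              # what the answer wants in R1
signed_view = sign_extend16(all_ones)                # what a signed immediate would give
signed_view_as_register = signed_view & MASK64       # the same value as a 64-bit pattern
doubled = (unsigned_view + unsigned_view) & MASK64   # the DADDU R1, R1, R1 step
largest_signed_imm = sign_extend16(0x7FFF)           # zero sign bit, ones elsewhere

print(unsigned_view, signed_view, hex(signed_view_as_register), doubled, largest_signed_imm)
```

Read as unsigned, sixteen ones are 65535 and doubling gives 131070; read as signed, the same bits are -1, which fills the whole 64-bit register with ones.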
Answer: Tedious, but you get there eventually.
DSUB | R4, R5, R2; | Computes R5-R2 |
DADDI | R6, R3, #-15; | R3-15 |
DADD | R2, R4, R6 | R2 contains half of its final value |
DADD | R2, R2, R2 | R2 final |
Another way to do the last step would be to shift R2 one bit to the left (towards the sign bit, and fill with zero), because that's the same as multiplying by two.
While the above is acceptable, it might be better to use as few extra registers as possible. (Why or why not?)
DSUB | R2, R5, R2; | Computes R5-R2 |
DADD | R2, R2, R3 | new R2 plus R3 |
DADDI | R2, R2, #-15; | subtract 15 |
DADD | R2, R2, R2 | R2 final |
This is an example of smart ``compiling'' exploiting the commutativity of addition, and the fact that adding two integers is lots faster than multiplying two integers in MIPS. (Assuming that data hazards aren't an issue, and in modern machines, for integer adds, they aren't.)
Why? Because MIPS uses the floating point multiply hardware for integer multiplies, and the floating point divide hardware for integer divides. In general, to multiply two integers together, you must first move them BOTH into floating point registers; then use the integer multiply operator, which stores the result in a floating point register; and then move the result back to a general purpose/integer register to use that value in other integer operations. Yuck.
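If you want to convince yourself that the two instruction sequences above really compute the same thing, and that the shift trick gives the same doubling, here is a small Python sketch. It models each register as a 64-bit value and ignores overflow exceptions; the function names and the sample register values are made up for illustration.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def to_signed64(x):
    """Interpret a 64-bit pattern as a two's complement integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def with_scratch_registers(r2, r3, r5):
    r4 = (r5 - r2) & MASK64   # DSUB  R4, R5, R2
    r6 = (r3 - 15) & MASK64   # DADDI R6, R3, #-15
    r2 = (r4 + r6) & MASK64   # DADD  R2, R4, R6
    r2 = (r2 + r2) & MASK64   # DADD  R2, R2, R2
    return to_signed64(r2)


def in_place(r2, r3, r5):
    r2 = (r5 - r2) & MASK64   # DSUB  R2, R5, R2
    r2 = (r2 + r3) & MASK64   # DADD  R2, R2, R3
    r2 = (r2 - 15) & MASK64   # DADDI R2, R2, #-15
    r2 = (r2 + r2) & MASK64   # DADD  R2, R2, R2
    return to_signed64(r2)


def doubled_by_shift(x):
    """Shift left one bit, filling with zero: the same as adding x to itself."""
    return to_signed64((x << 1) & MASK64)


print(with_scratch_registers(7, 20, 100), in_place(7, 20, 100), doubled_by_shift(98))
```

With the sample values R2 = 7, R3 = 20 and R5 = 100, the half-way value is 98 and all three routes end at 196.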
Answer: FYI, it took me a few years for this one to sink in.
DSUB | R3,R0, R3; | Why? Two's complement math, of course. |
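A quick way to see the two's complement math at work: subtracting from zero gives the same 64-bit pattern as the textbook rule "flip every bit, then add one". The Python below is just a sketch with made-up function names, modeling a 64-bit register with a mask.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def negate_by_subtraction(x):
    """Model DSUB R3, R0, R3: compute 0 - x in a 64-bit register."""
    return (0 - x) & MASK64


def negate_by_complement(x):
    """Two's complement rule: flip all 64 bits, then add one."""
    return ((x ^ MASK64) + 1) & MASK64


for value in (0, 1, 15, 164000):
    assert negate_by_subtraction(value) == negate_by_complement(value)
    # a value plus its negation wraps around to zero
    assert (value + negate_by_subtraction(value)) & MASK64 == 0

print(hex(negate_by_subtraction(1)))
```

Negating 1 gives a register full of ones, which is how -1 looks in 64-bit two's complement.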
Answer: The ``XOR'' part is no problem: it's a standard binary operator.
However, did you notice, there's no ``NOT'' operator, much less a ``NOR'' or an ``XNOR''? You have to make yourself a word of all ones, and then ``XOR'' it with the word you want to complement. I once did it by assuming that the immediate operand, negative one (-1), below, would be sign extended.
XOR | R4, R5, R2; | Vanilla XOR |
XORI | R4, R4, #-1; | Does this sign extend? If so, we're done. |
Unfortunately, according to MIPS on-line resources (see MIPS.com), the logic functions zero-extend the immediate, regardless of its sign. That is, the previous instruction would take the exclusive-or of R4 with a word having 16 ones in bits 48-63 (where bit 63 is the LSB) and zeros in the remaining 48 bits. Considering how useful it is to have a quick way of getting a word of all ones, I assumed that XORI would sign extend. But, Noooooo... apparently not. I can't find this tidbit in the book, any more than I can find which argument of the subtract functions is the one being subtracted from the other. This is apparently a preferred way:
XOR | R4, R5, R2; | Vanilla XOR |
DADDI | R8,R0, #-1; | A 64 bit register with all 1's |
XOR | R4,R4, R8; | complement all bits of R4. |
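To see why the zero-extended immediate falls short, here is a small Python sketch comparing the two attempts. It assumes, as described above, that the logical immediate is zero extended; the function names and the sample operands are made up for illustration.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def xnor_with_register_of_ones(a, b):
    all_ones = -1 & MASK64            # DADDI R8, R0, #-1 (sign extended)
    r4 = (a ^ b) & MASK64             # XOR   R4, R5, R2
    return (r4 ^ all_ones) & MASK64   # XOR   R4, R4, R8


def xnor_with_zero_extended_imm(a, b):
    imm = -1 & 0xFFFF                 # only the low 16 bits are ones
    r4 = (a ^ b) & MASK64             # XOR   R4, R5, R2
    return (r4 ^ imm) & MASK64        # XORI  R4, R4, #-1 (zero extended)


a, b = 0b1100, 0b1010
print(hex(xnor_with_register_of_ones(a, b)))   # all 64 bits complemented
print(hex(xnor_with_zero_extended_imm(a, b)))  # only the low 16 bits complemented
```

The first version complements every bit of the XOR result; the second leaves the upper 48 bits untouched, so it is not an XNOR of the full 64-bit words.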
Answer:
Because of the difficulties associated with control hazards (which occur when the PC must be modified to ensure that the correct next instruction is fetched), it's important to minimize the number of branches wherever possible. So, standard practice is to preset the value of the output register, R4, then test, and then reset the value of R4 as needed.
First, notice that if R2 is zero then the final value of R4 is the negative (or two's complement) of the final value of R4 when R2 is not zero. So, all we have to do is compute one of the two potential R4 values, and negate if the test goes the other way. (Read the code, and then read this sentence again.)
DADDI | R4, R4, #-24; | R4 has (R4 - 24) |
DADD | R4, R4, R2; | final value of R4 if R2 is 0 |
BEQZ | R2, done; | if R2 = 0, we are done |
DSUB | R4, R0, R4; | handle R2 non-zero |
done: | (next command) |
That is, we have only one control hazard that is associated with the single conditional branch.
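The preset-then-fix-up pattern reads like this in Python. This is only a sketch that mirrors the four instructions above line by line; it uses plain Python integers, so it ignores 64-bit wraparound and overflow exceptions, and the function name is made up.

```python
def preset_then_fix(r4, r2):
    r4 = r4 - 24      # DADDI R4, R4, #-24
    r4 = r4 + r2      # DADD  R4, R4, R2   (already final if R2 is 0)
    if r2 == 0:       # BEQZ  R2, done     (the only branch)
        return r4
    r4 = 0 - r4       # DSUB  R4, R0, R4   (negate for the non-zero case)
    return r4


print(preset_then_fix(30, 0))  # R2 is zero: branch taken, value kept
print(preset_then_fix(30, 4))  # R2 is non-zero: fall through and negate
```

Both outcomes share the first two instructions, so the code needs one conditional branch instead of a branch plus a jump around an "else" part.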