UID:
CS / ECE 3810 Final Exam – April 24 2025
Notes: Students are allowed to bring 3 A4/letter-sized sheets of paper with anything written/printed on both
sides. In addition, you may bring the “green sheet”. No phones, laptops, or internet access are allowed. You
may bring a simple calculator with no internet connectivity, that can be used for any numeric calculations
(but it’s also ok to write a mathematical term, say 1.4/2.2 GHz without doing the calculation). You may
of course not use your phone to surf the web or consult with others during the test. You may also not
use the MARS simulator or other calculators/tools for numeric conversions. If necessary, make reasonable
assumptions and clearly state them. The only clarifications you may ask for during the exam are definitions
of terms. You will receive partial credit if you show your steps and explain your line of thinking, so attempt
every question even if you can’t fully solve it. Complete your answers in the space provided (including the
back side of each page). Turn in your answer sheets before 12:30pm. The test is worth 100 points and you
have about 120 minutes, so allocate time accordingly. Confirm that you have 12 questions on 11 pages.
1. Represent the decimal number 19.875 in IEEE 754 single-precision format. (6 points)
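Solution: 19 in binary is 10011. For the fraction 0.875, repeatedly multiply by 2 and read off the
integer part as the next bit:
0.875 × 2 = 1.75 (bit 1)
0.75 × 2 = 1.5 (bit 1)
0.5 × 2 = 1.0 (bit 1)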
19.875 = 10011.111 (binary) = 1.0011111 × 2^4. The true exponent of 4 is represented as 4 + 127 = 131 =
binary 1000 0011. The sign bit is 0. The final register format is:
0 1000 0011 0011111000...0 (23 bits in the mantissa)
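The encoding above can be cross-checked with a short Python sketch. It uses the standard `struct` module to reinterpret the packed float as a 32-bit integer; the helper name `float_to_ieee754_fields` is a hypothetical name chosen for illustration.

```python
import struct

def float_to_ieee754_fields(value):
    """Return (sign, exponent, mantissa) bit strings for a single-precision float."""
    # Pack as a big-endian 32-bit float, then reinterpret the same bytes
    # as an unsigned 32-bit integer so the raw bit pattern can be printed.
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]

sign, exponent, mantissa = float_to_ieee754_fields(19.875)
print(sign, exponent, mantissa)
# prints: 0 10000011 00111110000000000000000
```

The exponent field 10000011 is 131 in decimal, which matches the biased exponent 4 + 127.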
2. (a) A 2 MB L2 cache has a 64 byte block size and is 8-way set-associative. How many sets does the
cache have? How many bits are used for the offset, index, and tag, assuming that the CPU pro-
vides 32-bit addresses? How large is the tag array? If you do not explain your steps/equations,
you will not receive partial credit for an incorrect answer. (6 points)
Solution: Cache size = sets × ways × blocksize. 2^21 = sets × 2^3 × 2^6. Sets = 2^12 = 4,096.
Offset bits = log2(blocksize) = 6. Index bits = log2(sets) = 12. Tag bits = 32 - 6 - 12 = 14 bits.
Tag array size = sets × ways × tagwidth = 4K × 8 × 14 bits = 448 Kb = 56 KB.
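The same arithmetic can be written as a small Python sketch. The function name `cache_geometry` and the dictionary keys are assumptions made for illustration, and the code assumes all sizes are powers of two.

```python
def cache_geometry(cache_bytes, block_bytes, ways, address_bits):
    """Compute set count, address field widths, and tag array size."""
    sets = cache_bytes // (block_bytes * ways)
    # For a power of two n, n.bit_length() - 1 equals log2(n).
    offset_bits = block_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    tag_bits = address_bits - offset_bits - index_bits
    tag_array_bits = sets * ways * tag_bits
    return {
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
        "tag_array_bytes": tag_array_bits // 8,
    }

g = cache_geometry(2 * 1024 * 1024, 64, 8, 32)
print(g)
# sets=4096, offset_bits=6, index_bits=12, tag_bits=14, tag_array_bytes=57344 (56 KB)
```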
(b) The processor issues a request for byte address 144 (decimal representation 144). For the cache
described above, what are the equations used to compute the index, offset, and tag bits for this
address? For the cache described above, what are the index, offset, and tag for this address (feel
free to use binary or decimal representations)? (6 points)
Solution: To extract the last 6 offset bits, we compute 144 % 64 = 16. To extract the next 12 index bits,
we compute (144 / 64) % 4096 = 2 (integer division). To remove the last 18 bits and get the tag, we
compute 144 / 2^18 = 0 (integer division).
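A minimal Python sketch of the three equations, assuming the 64-byte block and 4,096-set cache from part (a). The function name `split_address` is hypothetical.

```python
def split_address(addr, block_bytes=64, sets=4096):
    """Split a byte address into (tag, index, offset) for the cache in part (a)."""
    offset = addr % block_bytes             # low 6 bits
    index = (addr // block_bytes) % sets    # next 12 bits
    tag = addr // (block_bytes * sets)      # everything above the low 18 bits
    return tag, index, offset

print(split_address(144))
# prints: (0, 2, 16)
```

In binary, 144 is 10 010000: the low six bits (010000) are the offset 16 and the next bits (10) are the index 2.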
3. Consider the high-level source code below dealing with integers i, j, and an array of integers a[...]:
if (a[i] == j) {
    a[i] = 2 * j;
}
else {
    a[i] = 4 * j;
}
i = i + 1;
Produce the MIPS assembly code for the above sequence. You can assume that $s0 already has the
address of a[0], $s1 has the value of i, $s2 has the value of j. Add comments to your code for clarity.
(10 points)
Solution:
    sll  $t1, $s1, 2       # $t1 = 4i (byte offset of a[i])
    add  $t1, $t1, $s0     # $t1 = address of a[i] = address of a[0] + 4i
    lw   $t2, 0($t1)       # $t2 = a[i]
    bne  $t2, $s2, else    # if a[i] is not equal to j, go to else
    sll  $t3, $s2, 1       # then part: $t3 = 2j
    sw   $t3, 0($t1)       # store 2j into a[i] (address is in $t1)
    j    merge             # go to the merge point
else:                      # else part
    sll  $t3, $s2, 2       # $t3 = 4j
    sw   $t3, 0($t1)       # store 4j into a[i] (address is in $t1)
merge:                     # if-then-else merge point
    addi $s1, $s1, 1       # increment i
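As a sanity check on the register usage, the sequence can be modeled in Python. This is only a sketch: memory is modeled as a dictionary from byte address to word, and the function name `run_sequence` and the sample addresses are assumptions made for illustration. Note that the store must target the computed address of a[i], not the value loaded from it.

```python
def run_sequence(memory, base, i, j):
    """Model the MIPS sequence; returns the updated value of i."""
    t1 = base + (i << 2)   # sll + add: byte address of a[i]
    t2 = memory[t1]        # lw: value of a[i]
    if t2 == j:            # bne falls through when a[i] == j
        t3 = j << 1        # then part: 2j
    else:
        t3 = j << 2        # else part: 4j
    memory[t1] = t3        # sw: store to the ADDRESS in t1, not the value in t2
    return i + 1           # addi: increment i

memory = {0x1000: 5, 0x1004: 7}       # a[0] = 5, a[1] = 7 (hypothetical data)
i = run_sequence(memory, 0x1000, 1, 7)
print(i, memory[0x1004])
# prints: 2 14
```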
4. Consider an in-order pipeline that has the following stages. A register read takes an entire cycle and a
register write takes an entire cycle.
Fetch: Decode: Regread: IntALU: Regwrite
                      : IntALU: Datamem: Datamem: Datamem: Regwrite
After instruction fetch, the instruction goes through a separate Decode stage where dependences are
analyzed, then a separate Regread stage where input operands are read from the register file. After
this, an instruction takes one of two possible paths. Int-adds go through the stages labeled “IntALU”
and “Regwrite”. Loads/stores go through the stages labeled “IntALU”, “Datamem”, “Datamem”,
“Datamem”, and “Regwrite”, i.e., it takes three cycles to retrieve data from the data memory unit.
How many stall cycles are introduced between the following pairs of successive instructions (i) for a
processor with no register bypassing and (ii) for a processor with full bypassing? Draw appropriate
pipeline diagrams and indicate the points of production/consumption to show how you arrived at your
answer. (8 points)
(a) add $1, $2, $3
add $4, $1, $2
STALLS WITHOUT BYPASSING:
STALLS WITH BYPASSING:
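One way to reason about case (a) is to count cycles between the point of production and the point of consumption. The Python sketch below does that under stated assumptions: the producer enters Fetch in cycle 1, the consumer follows one cycle later, a value becomes usable only in the cycle after the stage that produces it, and without bypassing the consumer must read it in Regread after the producer's Regwrite has completed. The function name `stalls` and the stage list are illustrative, and the result is only as good as these assumptions.

```python
ADD_STAGES = ["Fetch", "Decode", "Regread", "IntALU", "Regwrite"]

def stalls(producer_stages, produce_stage, consumer_stages, consume_stage):
    """Stall cycles between two back-to-back dependent instructions."""
    # Producer starts in cycle 1, so its k-th stage (0-based) runs in cycle k + 1.
    # The value is available at the END of that cycle.
    produce_cycle = producer_stages.index(produce_stage) + 1
    # Consumer starts in cycle 2, so unstalled its k-th stage runs in cycle k + 2.
    unstalled_consume_cycle = consumer_stages.index(consume_stage) + 2
    # The consuming stage can run no earlier than the cycle after production.
    return max(0, produce_cycle + 1 - unstalled_consume_cycle)

# No bypassing: produced at end of Regwrite, consumed at start of Regread.
print(stalls(ADD_STAGES, "Regwrite", ADD_STAGES, "Regread"))
# Full bypassing: produced at end of IntALU, consumed at start of IntALU.
print(stalls(ADD_STAGES, "IntALU", ADD_STAGES, "IntALU"))
```

Under these assumptions the sketch reports 2 stall cycles without bypassing and 0 with full bypassing for the add/add pair.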