CS25 - Lab 2
Part 3: Write-Up

Your program should test at least the following four cases.

  1. Baseline I: access each byte of a cache-sized block of memory in order.
  2. Baseline II: access the first byte of each cache line in a cache-sized block of memory, in order.
  3. Cache Stress: force cache misses on every access.
  4. 2-way Nice: access memory in such a way that a 2-way set-associative cache would not miss, but a direct-mapped cache would always miss.
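
The four access patterns above can be sketched as functions that map a loop counter to a byte offset in a test buffer. This is a minimal illustration, not the code that produced the results in this write-up: the function names and the n_blocks parameter are hypothetical. It assumes a power-of-two cache size with simple address indexing. The Cache Stress pattern only misses on every access if n_blocks exceeds the cache's associativity and replacement behaves like LRU.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline I: touch every byte of one cache-sized block, in order. */
size_t baseline1_offset(size_t i, size_t cache_size)
{
    assert(cache_size > 0);
    return i % cache_size;
}

/* Baseline II: touch only the first byte of each cache line in the block. */
size_t baseline2_offset(size_t i, size_t cache_size, size_t line_size)
{
    assert(line_size > 0 && cache_size % line_size == 0);
    size_t lines = cache_size / line_size;
    return (i % lines) * line_size;
}

/* Cache Stress: stride by the full cache size across n_blocks blocks.
 * Every offset maps to the same set, so once n_blocks exceeds the
 * associativity the lines keep evicting one another (assuming LRU). */
size_t stress_offset(size_t i, size_t cache_size, size_t n_blocks)
{
    assert(n_blocks > 0);
    return (i % n_blocks) * cache_size;
}

/* 2-way Nice: alternate between two offsets exactly one cache size apart.
 * A direct-mapped cache puts both in the same line and always misses;
 * a 2-way set-associative cache of the same size holds both in one set. */
size_t two_way_nice_offset(size_t i, size_t cache_size)
{
    return (i % 2) * cache_size;
}
```

A test loop would index a buffer with these offsets; the buffer must be at least cache_size * n_blocks bytes for the stress pattern and 2 * cache_size bytes for the 2-way Nice pattern.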

For each processor, the time per loop increased as the tests progressed, which is what we expected. The only exception was the Athlon XP, whose Baseline II time was slightly better than its Baseline I time. What was interesting was that on the 2-way Nice test, the time per loop for every processor would lead one to believe that it performed worse than on the Cache Stress test, even though the Cache Stress test should have produced the worst time, especially since each processor implements a multi-way set-associative cache. What the time per loop does not account for is that the 2-way Nice test performs more addition operations per loop. Taking this into account, the 2-way Nice times were actually very quick.
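
The normalization described above can be sketched as follows. The loop count is not stated in this write-up. The reported totals and per-loop times are consistent with roughly 10^8 iterations per test, so that figure is used here as an assumption. The adds_per_loop parameter is hypothetical, since the number of additions per loop in each test is not recorded here.

```c
#include <assert.h>

/* Average time per loop iteration, in nanoseconds. */
double ns_per_loop(double total_seconds, double loops)
{
    assert(loops > 0);
    return total_seconds * 1e9 / loops;
}

/* Average time per addition, given a caller-supplied (hypothetical)
 * count of addition operations performed in each loop iteration. */
double ns_per_add(double total_seconds, double loops, double adds_per_loop)
{
    assert(adds_per_loop > 0);
    return ns_per_loop(total_seconds, loops) / adds_per_loop;
}
```

Under the 10^8 assumption, ns_per_loop(0.707, 1e8) gives about 7.07, which matches the 7.1 ns/loop reported for the Athlon XP Baseline I test. Dividing each test's time by its own addition count in the same way would put the 2-way Nice and Cache Stress results on a comparable footing.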

AMD Athlon XP
Cache Size: 65536 bytes, Bytes/Line: 64

Test Name       Total Time (s)    Time per Loop (ns)
Baseline I      0.707             7.1
Baseline II     0.682             6.8
Cache Stress    1.706             17.1
2-way Nice      3.304             33.0

Pentium 4
Cache Size: 8192 bytes, Bytes/Line: 64

Test Name       Total Time (s)    Time per Loop (ns)
Baseline I      0.896             9.0
Baseline II     0.902             9.0
Cache Stress    1.026             10.3
2-way Nice      1.716             17.2

PowerPC G4
Cache Size: 32768 bytes, Bytes/Line: 32

Test Name       Total Time (s)    Time per Loop (ns)
Baseline I      5.187             51.9
Baseline II     5.753             57.5
Cache Stress    6.367             63.7
2-way Nice      9.787             97.9