CS25 - Lab 2
Part 3: Question

Write a program to test the cache structure of the two computer systems you described in the previous section. The idea with this task is to write a program that will allow you to compare code that takes advantage of the cache with code that purposely causes the cache to miss. This will give you an estimate of the speedup gained by having multiple levels of memory.

In order to accomplish this, you will need to allocate a block of memory that is some multiple (like 4) of the size of the cache. By accessing this memory you will bring lines of memory into the cache. Which parts of the memory, and the order in which you access the memory will determine whether the piece of memory you want will be in the cache or not. If it is already in the cache, then access should be faster.

One possible way to force the compiler to create useful code is to set each byte in the block of memory that you create to 1. Then force the accesses to memory by summing the contents of appropriate parts of the memory block.

Your program should test at least the following four cases.

  1. Baseline I: access each byte of a cache-sized block of memory in order.
  2. Baseline II: access the first byte of each line of cache of a cache-size block of memory in order.
  3. Cache Stress: force cache misses on every access.
  4. 2-way Nice: access memory in such a way that a 2-way set associative cache would not miss, but a direct mapping will always miss.

You will need to use the system time library in order to time your programs. Use the gettimeofday function to access the current time in seconds and microseconds. testskeleton.c gives an example of how to use the timing routines.

If you want to use the in-line assembler, look at the IBM paper on inline Assembly for x86.