
Cache Array

Firefox has a 16 Kbyte, 2-way cache.  Each way is implemented with a cache
control unit (CCU) and eight 2K x 8, 25 ns static random access memories
(SRAM's).  Five of the SRAM's are used for data and data parity, and three of
the SRAM's are used for tag, status and tag parity.
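
The organization above can be checked with a little arithmetic.  The sketch
below is only illustrative: the constant names are made up for this example,
and the figure of 32 data bits per entry is inferred from the 16 Kbyte total
rather than stated in the text.

```python
# Illustrative arithmetic for the cache organization described above.
WAYS = 2
ENTRIES_PER_WAY = 2 * 1024      # each SRAM is 2K x 8
SRAM_WIDTH_BITS = 8
DATA_SRAMS_PER_WAY = 5          # data and data parity
TAG_SRAMS_PER_WAY = 3           # tag, status and tag parity
CACHE_BYTES = 16 * 1024

# Bits stored per entry on the data side and on the tag side of one way.
data_side_bits = DATA_SRAMS_PER_WAY * SRAM_WIDTH_BITS
tag_side_bits = TAG_SRAMS_PER_WAY * SRAM_WIDTH_BITS

# Inference: 16 Kbytes spread over 2 ways of 2K entries each.
data_bytes_per_entry = CACHE_BYTES // (WAYS * ENTRIES_PER_WAY)
data_bits_per_entry = data_bytes_per_entry * 8

# Bits left over on the data side (data parity; exact use is not stated).
remaining_data_side_bits = data_side_bits - data_bits_per_entry

print(data_side_bits, tag_side_bits, data_bits_per_entry,
      remaining_data_side_bits)
```

Under these assumptions each way holds 2K entries of 4 data bytes, which
accounts for the 16 Kbyte total across the two ways.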

The cache access time is in the critical path that determines the clock
period for the CPU, and the clock period directly limits performance.
The Firefox clock period is 40 ns.  The address is driven on a clock edge,
and the CCU must determine if there is a cache hit by the next clock edge.
It takes the CCU 7.5 ns to compare the tag to the real address.  With
25 ns SRAM's, this leaves 7.5 ns for the address to be driven to the SRAM's 
and data to be driven back to the CCU.  The data lines from the SRAM's are
connected to only one load while the address lines are connected to 10 loads,
so the 7.5 ns is budgeted to allow the data bus 1.5 ns of delay and the
address bus 6 ns.  This split was determined through SPICE simulations
and experimentation.
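
The timing budget above can be written out as a short calculation.  This is
a minimal sketch using only the figures quoted in the paragraph; the
constant and function names are hypothetical.

```python
# Timing budget for one cache access, using the figures quoted above (ns).
CLOCK_PERIOD_NS = 40.0   # address driven on one edge, hit known by the next
TAG_COMPARE_NS = 7.5     # CCU compares the tag to the real address
SRAM_ACCESS_NS = 25.0    # SRAM access time
ADDRESS_BUS_NS = 6.0     # budget for the address lines (10 loads)
DATA_BUS_NS = 1.5        # budget for the data lines (1 load)


def interconnect_budget_ns():
    """Time left for address and data to travel between CCU and SRAM's."""
    return CLOCK_PERIOD_NS - TAG_COMPARE_NS - SRAM_ACCESS_NS


def slack_ns():
    """Budget remaining after the address and data bus allocations."""
    return interconnect_budget_ns() - (ADDRESS_BUS_NS + DATA_BUS_NS)


print(interconnect_budget_ns(), slack_ns())
```

The two bus allocations use the whole 7.5 ns, so the budget has no slack.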

The RAM address lines are connected in a single line with a Schottky diode
at the end to clamp undershoot.  There is also a 150 Ohm termination resistor
to 2.85 volts.  The main reason that the undershoot is clamped is to prevent
it from ringing back up above 0.7 volts.  The termination resistor is much
too large to match the effective impedance of the address line.  The impedance
of the line will be about 25 Ohms, assuming a trace impedance of 50 Ohms with
a 5 pF load every 0.5 inches (the SRAM specification for input capacitance).
The resistor helps somewhat with pull up, but its main purpose is to keep the
DC high level of the driver at a lower voltage so that the swing is smaller
on a high to low transition.  This gives less ringing in the
fast case and makes the slow case faster.  The slow case model is dominated
by the capacitive effects and the limited current that can be provided by the
driver and so a smaller voltage transition will be faster.  This can be seen
in the basic capacitor equation:

   dT = dV * C / I
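
Two of the estimates in this paragraph can be sketched numerically.  The
capacitor relation follows from I = C * dV/dT, so the transition time is
dT = dV * C / I.  The loaded impedance uses the standard approximation for
a line with evenly distributed capacitive loads, Z = Z0 / sqrt(1 + Cd/C0).
The unloaded trace capacitance of 3.5 pF per inch is an assumption (a
typical value for a 50 Ohm trace, not taken from the text), as are the
function names and the example swing, capacitance and current.

```python
import math


def loaded_impedance(z0_ohms, load_pf_per_inch, trace_pf_per_inch):
    """Effective impedance of a trace with evenly distributed loads."""
    return z0_ohms / math.sqrt(1.0 + load_pf_per_inch / trace_pf_per_inch)


def transition_time(delta_v, capacitance, current):
    """dT = dV * C / I for a capacitor charged by a constant current."""
    return delta_v * capacitance / current


# 5 pF every 0.5 inch is 10 pF per inch of distributed load.
load_pf_per_inch = 5.0 / 0.5
# ASSUMPTION: 3.5 pF per inch for the unloaded 50 Ohm trace.
z_eff = loaded_impedance(50.0, load_pf_per_inch, 3.5)

# Hypothetical numbers: 3 V swing, 50 pF, 40 mA of drive current.
dt = transition_time(3.0, 50e-12, 0.040)

print(round(z_eff, 1), dt)
```

With the assumed trace capacitance the loaded impedance comes out close to
the 25 Ohms quoted above, and halving dV at a fixed current halves dT.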

Simulations were done to determine the optimal value of resistor to use.
A smaller value resistor always helps improve the low to high transition
time because it increases the current.  For the high to low transition, a
smaller value helps decrease the transition time by making dV smaller, but
it also increases the transition time because it decreases the current available
to change the voltage across the capacitor.  The termination voltage could also
be optimized, but this was not necessary because the timing budget could be
met using the already available 2.85 volt supply.  The termination resistor
is used mainly for DC reasons (lower DC high level) and so resistor packs with
common supply pins could be used.
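
The two opposing effects of the resistor value on the high to low
transition can be illustrated with a toy first-order model.  This is not
the SPICE model used for the design, and it is not meant to reproduce its
results.  Every driver parameter below is a hypothetical assumption: a 5 V,
100 Ohm pull-up in the driver, a constant 40 mA pull-down current, a 0.8 V
input low threshold, and a lumped 50 pF line (10 loads of 5 pF, ignoring
trace capacitance and transmission line effects).  Only the 2.85 volt
termination supply comes from the text.

```python
import math

V_TERM = 2.85  # termination supply from the text (volts)


def dc_high_level(r_term, v_oh_open=5.0, r_oh=100.0):
    """DC high level: driver pull-up and termination form a divider."""
    g_drv = 1.0 / r_oh
    g_term = 1.0 / r_term
    return (v_oh_open * g_drv + V_TERM * g_term) / (g_drv + g_term)


def fall_time(r_term, c_line=50e-12, i_sink=0.040, v_threshold=0.8):
    """Time for the line to fall from the DC high level to v_threshold.

    The driver sinks i_sink while the resistor pulls toward V_TERM, so
    C * dv/dt = -(i_sink - (V_TERM - v) / r_term), which decays
    exponentially toward v_final = V_TERM - i_sink * r_term.
    """
    v_start = dc_high_level(r_term)
    v_final = V_TERM - i_sink * r_term
    if v_final >= v_threshold:
        return math.inf  # resistor too small: threshold is never reached
    ratio = (v_start - v_final) / (v_threshold - v_final)
    return r_term * c_line * math.log(ratio)


for r in (50.0, 75.0, 150.0, 300.0):
    print(r, round(dc_high_level(r), 2), fall_time(r))
```

In this toy model a smaller resistor lowers the starting voltage but also
takes current away from the pull-down, so the fall time has a minimum at an
intermediate resistor value.  Where that minimum falls depends entirely on
the assumed driver parameters.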

