Split-transaction bus
- An atomic bus underutilizes bus resources
- Between the time the address is taken off the bus and the time the snoop responses become available, the bus stays idle
- Even after the snoop result is available, the bus may remain idle because of high memory access latency
- Split-transaction bus divides each transaction into two parts: request and response
- Between the request and response of a particular transaction there may be other requests and/or responses from different transactions
- Outstanding transactions that have not yet started or have completed only one phase are buffered in the requesting cache controllers
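
The underutilization argument above can be made concrete with a little arithmetic. The following Python sketch is illustrative only: the function names and the cycle counts (2 request cycles, 20 idle wait cycles, 5 response cycles) are assumptions, not figures from any real bus. It compares the cycles an atomic bus is held with a lower bound for a single shared split-transaction bus that overlaps the waits perfectly.

```python
def atomic_bus_cycles(n, request, wait, response):
    """Cycles an atomic bus is occupied: each transaction holds the bus
    through its request, its idle wait, and its response."""
    return n * (request + wait + response)


def split_bus_lower_bound(n, request, wait, response):
    """Lower bound on total cycles for a split-transaction bus (idealized).

    The bus must still carry every request and response phase, and no
    single transaction can finish faster than request + wait + response.
    Real buses do worse because of arbitration and finite buffers.
    """
    if n == 0:
        return 0
    return max(n * (request + response), request + wait + response)


if __name__ == "__main__":
    # Hypothetical cycle counts, chosen only for illustration.
    n, request, wait, response = 8, 2, 20, 5
    print("atomic bus cycles:      ", atomic_bus_cycles(n, request, wait, response))
    print("split bus (lower bound):", split_bus_lower_bound(n, request, wait, response))
```

With these assumed numbers the atomic bus is held for 216 cycles, while the split-transaction bus needs at least 56, because the wait cycles of one transaction can carry the request and response phases of others.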
New issues
- Split-transaction bus introduces new protocol races
- P0 and P1 have a line in S state and both issue BusUpgr, say, in consecutive cycles
- The snoop response to the first BusUpgr arrives only after the second BusUpgr has been issued, because snooping takes time
- Both P0 and P1 may then believe that they own the line
- Flow control is important since buffer space is finite
- In-order or out-of-order response?
- Out-of-order response may tolerate variable memory latency better, since other requests can be serviced while a slow one is still pending
- Pentium Pro uses in-order response
- SGI Challenge and Sun Enterprise use out-of-order response, i.e., no ordering is enforced among responses
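
The BusUpgr race described above can be sketched in a few lines of Python. This is a toy model, not any real controller: the function names are hypothetical, and `ordered_upgrade` shows one common textbook resolution (the bus order picks the winner, and a loser whose copy was invalidated must reissue its request as a read-exclusive), which is an assumption here and not something the slides prescribe.

```python
def naive_upgrade(bus_order):
    """Each requester claims ownership as soon as its BusUpgr is issued,
    without waiting for the snoop result of earlier requests."""
    states = {p: "S" for p in bus_order}
    for p in bus_order:
        states[p] = "M"  # no check of what was snooped in the meantime
    return states


def ordered_upgrade(bus_order):
    """The order in which requests appear on the bus decides the winner.

    A requester whose copy was invalidated by an earlier BusUpgr no longer
    holds the line, so its upgrade is void and must be reissued.
    """
    states = {p: "S" for p in bus_order}
    must_reissue = []
    for p in bus_order:
        if states[p] == "I":
            must_reissue.append(p)  # lost the race; retry as read-exclusive
            continue
        for q in states:
            if q != p:
                states[q] = "I"     # other sharers snoop the BusUpgr
        states[p] = "M"
    return states, must_reissue


if __name__ == "__main__":
    print(naive_upgrade(["P0", "P1"]))    # both end in M: incoherent
    print(ordered_upgrade(["P0", "P1"]))  # P0 owns the line, P1 must retry
```

The naive version leaves both P0 and P1 in M state, which is exactly the incoherent outcome the race produces. The ordered version leaves a single owner.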
SGI Powerpath-2 bus
- Used in SGI Challenge
- Conflicts are resolved by not allowing multiple bus transactions to the same cache line
- Allows eight outstanding requests on the bus at any point in time
- Flow control on buffers is provided by negative acknowledgments (NACKs): the bus has a dedicated NACK line that remains asserted while the buffer holding outstanding transactions is full; a NACKed transaction must be retried
- The request order determines the total order of memory accesses, but the responses may be delivered in a different order, depending on when they complete
- In subsequent slides we call this design Powerpath-2, since it is loosely based on that bus
- Logically two separate buses
- Request bus for launching the command type (BusRd, BusWB etc.) and the involved address
- Response bus for providing the data response, if any
- Since responses may arrive in an order different from the request order, a 3-bit tag is assigned to each request (three bits suffice to distinguish the eight outstanding requests)
- Responses launch this tag on the tag bus along with the data reply so that the address bus may be left free for other requests
- The data bus is 256 bits wide while a cache line is 128 bytes (1024 bits)
- One data response phase therefore needs four bus cycles (1024 / 256) plus one additional hardware turnaround cycle
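
The bookkeeping described above (eight outstanding requests, 3-bit tags, no two transactions to the same line, NACK when buffers are full, responses in any order) can be modelled with a small table. The Python sketch below is a simplified illustration under stated assumptions: the class `RequestTable`, its methods, and the status strings are hypothetical, and a full table stands in for the full buffer that asserts the NACK line.

```python
class RequestTable:
    """Toy model of an outstanding-request table (names are hypothetical)."""

    NUM_TAGS = 8  # a 3-bit tag distinguishes 2**3 outstanding requests

    def __init__(self):
        self.entries = {}  # tag -> (requester, command, line_address)

    def issue(self, requester, command, line_address):
        """Try to place a request; returns (status, tag)."""
        # Conflict rule: only one outstanding transaction per cache line.
        for _, _, addr in self.entries.values():
            if addr == line_address:
                return ("CONFLICT", None)  # requester must wait and retry
        free = [t for t in range(self.NUM_TAGS) if t not in self.entries]
        if not free:
            return ("NACK", None)          # table full: request is NACKed
        tag = free[0]
        self.entries[tag] = (requester, command, line_address)
        return ("OK", tag)

    def respond(self, tag):
        """A response carries only the tag; the table recovers the request.
        Responses may arrive in any order, and the tag is then reusable."""
        return self.entries.pop(tag)


if __name__ == "__main__":
    table = RequestTable()
    for i in range(8):
        print(table.issue("P0", "BusRd", 0x1000 + 128 * i))
    print(table.issue("P1", "BusRd", 0x9000))  # ninth request is NACKed
    print(table.respond(5))                    # out-of-order completion
    print(table.issue("P1", "BusRd", 0x9000))  # retry succeeds, reuses tag 5
    print(table.issue("P2", "BusRd", 0x1000))  # same line still outstanding
```

Because the response identifies its request by tag alone, the address bus never has to carry the address a second time, which is what leaves it free for new requests.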